Comparing b9ce7ec11f..c72ab8ba06 - mesa

fran/mesa

Author	SHA1	Message	Date
Andres Gomez	8d3ccdbb9b	mesa: replace binary constants with hexadecimal constants The binary constant notation "0b" is a GCC extension. Instead, we use hexadecimal notation to fix the MSVC 2013 build: Compiling src\mesa\main\texcompress_astc.cpp ... texcompress_astc.cpp src\mesa\main\texcompress_astc.cpp(111) : error C2059: syntax error : 'bad suffix on number' ... src\mesa\main\texcompress_astc.cpp(1007) : fatal error C1003: error count exceeds 100; stopping compilation scons: *** [build\windows-x86-debug\mesa\main\texcompress_astc.obj] Error 2 scons: building terminated because of errors. v2: Fix wrong conversion (Ilia). Fixes: `38ab39f650` ("mesa: add ASTC 2D LDR decoder") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Mike Lothian <mike@fireburn.co.uk> Cc: Gert Wollny <gert.wollny@collabora.com> Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	1090e97e77	ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\gallium\auxiliary\driver_ddebug\dd_draw.c ... dd_draw.c c:\projects\mesa\src\gallium\auxiliary\driver_ddebug\dd_util.h(60) : warning C4013: 'snprintf' undefined; assuming extern returning int ... gallium.lib(dd_draw.obj) : error LNK2001: unresolved external symbol _snprintf build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120 scons: building terminated because of errors. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	d7694136d3	util/queue: use util_snprintf() in util_queue_init Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\util\u_queue.c ... u_queue.c src\util\u_queue.c(325) : warning C4013: 'snprintf' undefined; assuming extern returning int ... mesautil.lib(u_queue.obj) : error LNK2001: unresolved external symbol _snprintf scons: building terminated because of errors. Fixes: `b238e33bc9` ("util/queue: add a process name into a thread name") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	18d9dc179f	gallium/aux/util: use util_snprintf() in test_texture_barrier Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\gallium\auxiliary\util\u_tests.c ... u_tests.c src\gallium\auxiliary\util\u_tests.c(624) : warning C4013: 'snprintf' undefined; assuming extern returning int ... gallium.lib(u_tests.obj) : error LNK2019: unresolved external symbol _snprintf referenced in function _test_texture_barrier build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120 scons: building terminated because of errors. Fixes: `56342c97ee` ("gallium/u_tests: test FBFETCH and shader-based blending with MSAA") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	9d220fa950	glsl: use util_snprintf() Instead of plain snprintf(). To fix the MSVC 2013 build. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Jordan Justen	8fcdb71d8c	intel/compiler: Add brw_get_compiler_config_value for disk cache During code review, Jason pointed out that: `2b3064c073` "i965, anv: Use INTEL_DEBUG for disk_cache driver flags" Didn't account for INTEL_SCALAR_* environment variables. To fix this, let the compiler return the disk_cache driver flags. Another possible fix would be to pull the INTEL_SCALAR_* into INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I didn't think it was a good use of 4 more of these bits. (5 since INTEL_PRECISE_TRIG needs to be accounted for as well.) Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:49:16 -07:00
Jordan Justen	3887700dfd	i965: Disable shader cache with INTEL_DEBUG=shader_time Shader time hard codes an index of the shader time buffer within the gen program. In order to support shader time in the disk shader cache, we'd need to add the shader time index into the program key. This should work, but probably is not worth it for this particular debug feature. Therefore, let's just disable the disk shader cache if the shader time debug feature is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106382 Fixes: `96fe36f7ac` "i965: Enable disk shader cache by default" Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:30:49 -07:00
Timothy Arceri	bea4722c2e	glsl: make a copy of array indices that are used to deref a function out param Fixes new piglit test: tests/spec/glsl-1.20/execution/qualifiers/vs-out-conversion-int-to-float-vec4-index.shader_test Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-02 11:06:28 +10:00
Jason Ekstrand	de9e5cf35a	anv/pipeline: Add populate_tcs/tes_key helpers They don't really do anything interesting, but it's more consistent this way. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	e621f57556	anv/pipeline: Rework the parameters to populate_wm_prog_key Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b2e0b0dad6	anv/pipeline: More aggressively optimize away color attachments Instead of just looking at the number of color attachments, look at which ones are actually used by the subpass. This lets us potentially throw away chunks of the fragment shader. In DXVK, for example, all subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this is very helpful in that case. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	80bc0b728c	anv: Restrict the number of color regions to those actually written The back-end compiler emits the number of color writes specified by wm_prog_key::nr_color_regions regardless of what nir_store_outputs we have. Once we've gone through and figured out which render targets actually exist and are written by the shader, we should restrict the key to avoid extra RT write messages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4d57e543b8	anv/pipeline: Fix up deref modes if we delete a FS output With the new deref instructions, we have to keep the modes consistent between the derefs and the variables they reference. Since we remove outputs by changing them to local variables, we need to run the fixup pass to fix the modes. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	7f75cf2a94	nir/lower_indirect: Bail early if modes == 0 There's no point in walking the program if we're never going to actually lower anything. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4434591bf5	intel/nir: Call nir_lower_io_to_scalar_early Shader-db results on Kaby Lake: total instructions in shared programs: 15166953 -> 15073611 (-0.62%) instructions in affected programs: 2390284 -> 2296942 (-3.91%) helped: 16469 HURT: 505 total loops in shared programs: 4954 -> 4951 (-0.06%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b0bb547f78	intel/nir: Split IO arrays into elements The NIR nir_lower_io_arrays_to_elements pass attempts to split I/O variables which are arrays or matrices into a sequence of separate variables. This can help link-time optimization by allowing us to remove varyings at a more granular level. Shader-db results on Kaby Lake: total instructions in shared programs: 15177645 -> 15168494 (-0.06%) instructions in affected programs: 79857 -> 70706 (-11.46%) helped: 392 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	57804efa88	i965/fs: Flag all slots of a flat input as flat Otherwise, only the first vec4 of a matrix or other complex type will get marked as flat and we'll interpolate the others. This was caught by a dEQP test which started failing because it did a SSO vs. non-SSO comparison. Previously, we did the interpolation wrong consistently in both versions. However, with one of Tim Arceri's NIR linking patches, we started splitting the matrix input into vectors at link time in the non-SSO version and it started getting correctly interpolated which didn't match the broken SSO version. As of this commit, they both get correctly interpolated. Fixes: `e61cc87c75` "i965/fs: Add a flat_inputs field to prog_data" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4e060385e9	intel/nir: Use the correct scalar stage for consumers when linking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Dave Airlie	70c34a1bd2	docs: update 18.2.0 release notes for virgl	2018-08-02 08:43:56 +10:00
Dylan Baker	34998aae18	nir/meson: fix c vs cpp args for nir test Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:51:22 -07:00
Dylan Baker	2877b6555c	gallium: fix ddebug on windows By including the proper headers for getpid and for mkdir. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 12:50:25 -07:00
Dylan Baker	17f49950da	util: move process.[ch] to u_process.[ch] On windows process.h is a system provided header, and it's required in include/c11/threads_win32.h. This header interferes with searching for that header, and results in windows build warnings with scons, but errors in meson which doesn't allow implicit function declarations. Just rename process to u_process, which follows the style of utils anyway. Fixes: `2e1e6511f7` ("util: extract get_process_name from xmlconfig.c") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 12:47:16 -07:00
Marek Olšák	cb6b241c30	ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2) To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 finish sooner on the older CPUs. (otherwise it gets killed and we fail the test) Acked-by: Dave Airlie <airlied@gmail.com>	2018-08-01 15:25:18 -04:00
Eric Anholt	c2eab33b08	v3d: Actually put the "%s" in the snprintf. I missed an important part when porting the change over, fixing my compiler warning but breaking -Werror=format-security. Fixes: `e6ff5ac446` ("v3d: use snprintf(..., "%s", ...) instead of strncpy") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107443	2018-08-01 11:39:19 -07:00
Juan A. Suarez Romero	d742270564	vc4: Fix automake linking error. CXXLD gallium_dri.la ../../../../src/gallium/drivers/vc4/.libs/libvc4.a(vc4_cl_dump.o): In function `vc4_dump_cl': src/gallium/drivers/vc4/vc4_cl_dump.c:45: undefined reference to `clif_dump_init' src/gallium/drivers/vc4/vc4_cl_dump.c:82: undefined reference to `clif_dump_destroy' ../../../../src/broadcom/cle/.libs/libbroadcom_cle.a(cle_libbroadcom_cle_la-v3d_decoder.o): In function `v3d_field_iterator_next': src/broadcom/cle/v3d_decoder.c:902: undefined reference to `clif_lookup_bo' Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107423 CC: Eric Anholt <eric@anholt.net> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:33:07 +02:00
Juan A. Suarez Romero	810c9a4eba	scons: require scons 2.4 or greater There is a bug with scons 2.3, used in Travis, where it fails to detect some C functions. Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:33:00 +02:00
Juan A. Suarez Romero	fea0b92042	travis: install scons from pip The ubuntu version provided by Travis is a bit old, and does not detect correctly some C functions. Use a more modern version through pip. Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:32:42 +02:00
Marek Olšák	26d3e2b4b0	docs: mark ARB_ES3_2_compatibility as done for radeonsi	2018-08-01 11:38:54 -04:00
Lionel Landwerlin	2477e516d9	intel: tools: aubwrite: split gen[89] from gen10+ Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end of the context image. We select the largest size for the context image regardless of the generation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-01 15:31:56 +01:00
Mathieu Bridon	91939255a7	python: Use the unicode_escape codec Python 2 had string_escape and unicode_escape codecs. Python 3 only has the latter. These work the same as far as we're concerned, so let's use the future-proof one. However, the rest of the code expects unicode strings, so we need to decode them again. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	ad363913e6	python: Explicitly add the 'L' suffix on Python 3 Python 2 had two integer types: int and long. Python 3 dropped the latter, as it made the int type automatically support bigger numbers. As a result, Python 3 lost the 'L' suffix on integer literals. This probably doesn't make much difference when compiling the generated C code, but adding it explicitly means that both Python 2 and 3 generate the exact same C code anyway, which makes it easier to compare and check for discrepancies when moving to Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	a71df20855	python: Explicitly use byte strings In both Python 2 and 3, zlib.Compress.compress() takes a byte string, and returns a byte string as well. In Python 2, the script was working because: 1. string literals were byte strings; 2. opening a file in unicode mode, reading from it, then passing the unicode string to compress() would automatically encode to a byte string; On Python 3, the above two points are not valid any more, so: 1. zlib.Compress.compress() refuses the passed unicode string; 2. compressed_data, defined as an empty unicode string literal, can't be concatenated with the byte string returned by compress(); This commit fixes this by explicitly using byte strings where appropriate, so that the script works on both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	8678fe537a	python: Use open(), not file() The latter is a constructor for file objects, but when actually opening a file, using the former is more idiomatic. In addition, file() is not a builtin any more in Python 3, so this makes the script compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	c24d826968	python: Open file in binary mode The XML parser wants byte strings, not unicode strings. In both Python 2 and 3, opening a file without specifying the mode will open it for reading in text mode ('r'). On Python 2, the read() method of the file object will return byte strings, while on Python 3 it will return unicode strings. Explicitly specifying the binary mode ('rb') makes the behaviour identical in both Python 2 and 3, returning what the XML parser expects. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	e40200e0aa	python: Don't abuse hex() The hex() builtin returns a string containing the hexadecimal representation of an integer. When the argument is not an integer, then the function calls that object's __hex__() method, if one is defined. That method is supposed to return a string. While that's not explicitly documented, that string is supposed to be a valid hexadecimal representation for a number. Python 2 doesn't enforce this though, which is why we got away with returning things like 'NIR_TRUE' which are not numbers. In Python 3, the hex() builtin instead calls an object's __index__() method, which itself must return an integer. That integer is then automatically converted to a string with its hexadecimal representation by the rest of the hex() function. As a result, we really can't make this compatible with Python 3 as it is. The solution is to stop using the hex() builtin, and instead use a hex() object method, which can return whatever we want, in Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	12eb5b496b	python: Better get character ordinals In Python 2, iterating over a byte-string yields single-byte strings, and we can pass them to ord() to get the corresponding integer. In Python 3, iterating over a byte-string directly yields those integers. Transforming the byte string into a bytearray gives us a list of the integers corresponding to each byte in the string, removing the need to call ord(). This makes the script compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Mario Kleiner	9bd8b0f700	loader_dri3: Handle mismatched depth 30 formats for Prime renderoffload. Detect if the display (X-Server) gpu and Prime renderoffload gpu prefer different channel ordering for color depth 30 formats ([X/A]BGR2101010 vs. [X/A]RGB2101010) and perform format conversion during the blitImage() detiling op from tiled backbuffer -> linear buffer. For this we need to find the visual (= red channel mask) for the X-Drawable used to display on the server gpu. We use the same proven logic for finding that visual as in commit "egl/x11: Handle both depth 30 formats for eglCreateImage()". This is mostly to allow "NVidia Optimus" at depth 30, as Intel/AMD gpu's prefer xRGB2101010 ordering, whereas NVidia gpu's prefer xBGR2101010 ordering, so we can offload to nouveau without getting funky colors. Tested on Intel single gpu, NVidia single gpu, Intel + NVidia prime offload with DRI3/Present. Note: An unintended but pleasant surprise of this patch is that it also seems to make the modesetting-ddx of server 1.20.0 work at depth 30 on nouveau, at least with unredirected "classic" X rendering, and with redirected desktop compositing under XRender accel, and with OpenGL compositing under GLX. Only X11 compositing via OpenGL + EGL still gives funky colors. modesetting-ddx + glamor are not yet ready to deal with nouveau's ABGR2101010 format, and treat it as ARGB2101010, also exposing X-visuals with ARGB2101010 style channel masks. Seems somehow this triggers the logic in this patch on modesetting-ddx + depth 30 + DRI3 buffer sharing and does the "wrong" channel swizzling that then cancels out the "wrong" swizzling of glamor and we end up with the proper pixel formatting in the scanout buffer :). This so far tested on a NVA5 Tesla card under KDE5 Plasma as shipping with Ubuntu 16.04.4 LTS. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Mario Kleiner	61a02729f7	egl/x11: Handle both depth 30 formats for eglCreateImage(). (v4) We need to distinguish if the backing storage of a pixmap is XRGB2101010 or XBGR2101010, as different gpu hw supports different formats. NVidia hw prefers XBGR, whereas AMD and Intel are happy with XRGB. Use the red channel mask of the first depth 30 visual of the x-screen to distinguish which hw format to choose. This fixes desktop composition of color depth 30 windows when the X11 compositor uses EGL. v2: Switch from using the visual of the root window to simply using the first depth 30 visual for the x-screen, as testing shows that each driver only exports either xrgb ordering or xbgr ordering for the channel masks of its depth 30 visuals, so this should be unambiguous and avoid trouble if X ever supports depth 30 pixmaps on screens with a non-depth 30 root window visual. This per Michels suggestion. v3: No change to v2, but spent some time testing this more on AMD hw, with my software hacked up to intentionally choose pixel formats/visual with the non-preferred xBGR2101010 ordering on the ati-ddx, also with a standard non-OpenGL X-Window with depth 30 visual, to make sure that things show up properly with the right colors on the screen when going through EGL+OpenGL based compositing on KDE-5. Iow. to confirm that my explanation to the v2 patch on the mailing list of why it should work and the actual practice agree (or possibly that i am good at fooling myself during testing ;). v4: Drop the local `red_mask` and just `return visual->red_mask`/ `return 0`, as suggested by Eric Engestrom. Rebased onto current master, to take the cleanup via the new function dri2_format_for_depth() into account. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Daniel Stone	753f603b52	gbm: Add support for 10bpp BGR formats Add support for XBGR2101010 and ABGR2101010 formats. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Daniel Stone	275b23ed0e	egl/wayland: Add 10bpc BGR configs Add support for XBGR2101010 and ABGR2101010. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-08-01 12:55:37 +01:00
Iago Toral Quiroga	471bce5689	intel/compiler: implement 8-bit constant load Fixes VK-GL-CTS CL#2567 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Iago Toral Quiroga	7e6c8b0cb7	intel/compiler: add setup_imm_(u)b helpers The hardware doesn't support byte immediates, so similar to setup_imm_df() for doubles, these helpers work by loading the constant value into a VGRF. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Rhys Perry	bd56e117ff	glsl: fix function inlining with opaque parameters Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	f903bce8a6	glsl, glsl_to_tgsi: fix sampler/image constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	ea2a3f52b4	glsl: allow ?: operator with images and samplers when bindless is enabled Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	42d4acb39d	glsl_to_tgsi: allow bound samplers and images to be used as l-values Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:00 -04:00
Rhys Perry	00589be6c4	gallium: add new SAMP2HND and IMG2HND opcodes This commit does not add support for the opcodes in gallivm or tgsi_to_nir.c Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:00 -04:00
Dave Airlie	1fb388cd20	docs/features: update virgl GLES 3.1/3.2 status virgl now exposes GLES3.1 and 3.2	2018-08-01 14:09:11 +10:00
Dave Airlie	e2c62170d5	docs/features: update virgl GL 4.3 support virgl with up to date host renderer now exposes GL 4.3.	2018-08-01 14:08:33 +10:00
Erik Faye-Lund	21e33f4a10	virgl: enable FBFETCH if virglrenderer supports it This fixes the following dEQP-GLES31 cases from NotSupported to Pass for me: - dEQP-GLES31.functional.blend_equation_advanced.state_query.* - dEQP-GLES31.functional.blend_equation_advanced.basic.* - dEQP-GLES31.functional.blend_equation_advanced.srgb.* - dEQP-GLES31.functional.blend_equation_advanced.msaa.* - dEQP-GLES31.functional.blend_equation_advanced.barrier.* - dEQP-GLES31.functional.draw_buffers_indexed.overwrite_advanced_blend_eq - dEQP-GLES31.functional.state_query.indexed.blend_equation_advanced_* - dEQP-GLES31.functional.debug.negative_coverage.*.advanced_blend.* Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-08-01 14:05:22 +10:00
Erik Faye-Lund	7ef86a03f0	virgl: add texture_barrier stub In gallium, supporting FBFETCH means supporting non-coherent fetches, but in virglrenderer, due to technical reasons this is backed by coherent fetches instead. This means we don't need to do anything for the barriers. However, if we don't have a texture_barrier implementation, we get crashes because the non-coherent extensions is exposed. So, let's leave this as a NOP for now. [airlied: I've got a more complete impl of this somewhere, once we land the host side]. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-08-01 14:03:51 +10:00
Dave Airlie	6f5d463a78	virgl: enable robustness if the host exposes it Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:38 +10:00
Dave Airlie	2df8b80c4c	virgl: Support ARB_framebuffer_no_attachments This uses new protocol to send the default sizes to the host. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:35 +10:00
Dave Airlie	f8a8ea6a2d	virgl: add initial ARB_compute_shader support This hooks up compute shader creation and launch grid support. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:31 +10:00
Marek Olšák	157c6e8195	util: don't use __builtin_clz unconditionally This fixes the build if __builtin_clz is unsupported. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-31 23:28:01 -04:00
Marek Olšák	c5c6e0187f	ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle a needle in the haystack? Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-31 22:56:40 -04:00
Eric Anholt	e6ff5ac446	v3d: use snprintf(..., "%s", ...) instead of strncpy Fixes a compiler warning about terminator NUL, based on `f836d799f9` ("intel/decoder: use snprintf(..., "%s", ...) instead of strncpy")	2018-07-31 16:42:11 -07:00
Eric Anholt	3471ce9985	v3d: Add support for the TMUWT instruction. This instruction is used to ensure that TMU stores have been processed before moving on. In particular, you need any TMU ops to be done by the time the shader ends.	2018-07-31 16:05:04 -07:00
Marek Olšák	7d36c866d2	radeonsi: report supported EQAA combinations from is_format_supported Framebuffer without attachments now supports 16 samples. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	20dd75a926	radeonsi: use storage_samples instead of color_samples in most places and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	966f155623	gallium: add storage_sample_count parameter into is_format_supported Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	8632626c81	gallium: add pipe_resource::nr_storage_samples, and set it same as nr_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	0caf74bbcd	gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	55d56dd859	docs: update radeonsi features and release notes	2018-07-31 18:12:37 -04:00
Marek Olšák	ed8b4ed6c4	st/mesa: implement ASTC 2D LDR fallback for all drivers Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	5fe52044ef	st/mesa: add ETC2 & ASTC fast path for GetTex(Sub)Image Not sure if GL/GLES can hit this path, but it's just decompression. Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	ebe03d3699	st/mesa: generalize fallback_copy_image for compressed textures in order to support ASTC Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	c3fafa127a	st/mesa: generalize code for the compressed texture map/unmap fallback in order to support ASTC Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	3d7e4311bf	st/mesa: use st_compressed_format_fallback more Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	912e0525be	st/mesa: generalize st_etc_fallback -> st_compressed_format_fallback for ASTC support later Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	38ab39f650	mesa: add ASTC 2D LDR decoder Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:09:57 -04:00
Dave Airlie	5be352b430	docs/features: mark virgl image features and GL4.2 as done	2018-08-01 08:06:41 +10:00
Gurchetan Singh	9c136e8a07	virgl: also mark sampler views as dirty When texture buffers are used as images in compute shaders, the guest never sees the modified data since the TBO is always marked as clean. Fixes most dEQP-GLES31.functional.image_load_store.buffer.* tests. Example test cases: dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui dEQP-GLES31.functional.image_load_store.buffer.qualifiers.coherent_r32f dEQP-GLES31.functional.image_load_store.buffer.format_reinterpret.rgba8_rgba8ui Note: virglrenderer side patch also needed to bind TBOs correctly Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-01 08:05:39 +10:00
Dave Airlie	a090df0d5d	virgl: add memory barrier support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:35 +10:00
Dave Airlie	6f75058359	virgl: add TXQS support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:32 +10:00
Dave Airlie	452eea140d	virgl: add initial images support (v2) v2: add max image samples support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:27 +10:00
Jon Turney	faa29c0e24	Make glXChooseFBConfig handle unspecified sRGB correctly Make glXChooseFBConfig properly handle the case where the only matching configs have the sRGB flag set, but no sRGB attribute is specified. Since `6e06e281`, the sRGBcapable flag is now actually compared, using MATCH_DONT_CARE. `7b0f912e` added defaulting of sRGBcapable to GL_FALSE in __glXInitializeVisualConfigFromTags(), to handle servers which don't report it, but this function is also used by glXChooseFBConfig(), so sRGBcapable is implicitly false when not explicitly specified. (This can cause e.g. glxinfo to fail to find anything matching the simple config it looks for if all the candidates have the sRGB flag set to true. I'm assuming this doesn't happen 'normally' as candidate configs with and without sRGB true are available) Move this defaulting to createConfigsFromProperties(), and set the default for glXChooseFBConfig() in init_fbconfig_for_chooser() to GLX_DONT_CARE. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 13:56:13 -04:00
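The defaulting problem described in the glXChooseFBConfig commit above can be modelled in a few lines. This is only a sketch, not Mesa's code: `DONT_CARE`, `matches`, `choose` and the dictionary-based configs are hypothetical stand-ins for GLX_DONT_CARE and the fbconfig comparison.

```python
# Hypothetical model of "don't care" attribute matching; not Mesa's implementation.
DONT_CARE = object()  # stands in for GLX_DONT_CARE


def matches(requested, candidate):
    """A candidate matches when every requested attribute is DONT_CARE or equal."""
    return all(
        want is DONT_CARE or candidate.get(name) == want
        for name, want in requested.items()
    )


def choose(requested, candidates):
    """Return the candidates that satisfy the request, in their original order."""
    return [c for c in candidates if matches(requested, c)]


# Every available config advertises sRGB, as in the failure case from the message.
configs = [{"id": 1, "srgb_capable": True}, {"id": 2, "srgb_capable": True}]

# Old behaviour: an unspecified sRGB attribute is implicitly False, so nothing matches.
old_result = choose({"srgb_capable": False}, configs)

# New behaviour: an unspecified sRGB attribute defaults to "don't care".
new_result = choose({"srgb_capable": DONT_CARE}, configs)
```

With the implicit `False` default the chooser finds nothing, which is the glxinfo symptom mentioned in the message; with the "don't care" default both configs remain candidates.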
Olivier Fourdan	03a61b977e	dri3: For 1.2, use root window instead of pixmap drawable get_supported_modifiers() and pixmap_from_buffers() requests both expect a window as drawable; passing a pixmap fails because the X server cannot match the given drawable to a window. That causes dri3_alloc_render_buffer() to return NULL and breaks rendering when using GLX_DOUBLEBUFFER on pixmaps. Query the root window of the pixmap on first init, and use the root window instead of the pixmap drawable for get_supported_modifiers() and pixmap_from_buffers(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107117 Fixes: `069fdd5` ("egl/x11: Support DRI3 v1.1") Signed-off-by: Olivier Fourdan <ofourdan@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 13:51:59 -04:00
Alejandro Piñeiro	16b5e15e91	i965: enable XFB and GeometryStreams for gen7+ Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	b7421cda86	i965: Link XFB varyings for SPIR-V shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	b9719b4b05	nir/linker: Add the start of a pure-NIR linker for XFB v2: ignore names on purpose, for consistency with other places where we are doing the same (Alejandro) v3: changes proposed by Timothy Arceri, implemented by Alejandro Piñeiro: * Remove redundant 'struct active_xfb_varying' * Update several comments, including spec quotes if needed * Rename struct 'active_xfb_varying_array' to 'active_xfb_varyings' * Rename variable 'array' to 'active_varyings' * Replace one if condition for an assert (<MAX_FEEDBACK_BUFFERS) * Remove BufferMode initialization (was already done) v4: simplify output pointer handling (Timothy) Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	9fbe5bd811	nir/types: Add a wrapper to access gl_type Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Alejandro Piñeiro	739bb9e3d4	arb_gl_spirv: add calls to several nir lowerings For now we are just adding nir lowerings that are needed/mandatory to get things working. After everything is settled, we would start to add good-to-have lowerings. This patch adds the following calls: * nir_split_var_copies and nir_split_per_member_structs: as vulkan drivers are doing now. See commit `b0c643d8f5` ("spirv: Use NIR per-member splitting") for more info. Without this commit, piglit tests like this one crash: spec/arb_gl_spirv/execution/varying/block And in general most of the shaders that include any kind of struct. * nir_copy_prop: after nir_deref_instr introduction, function calls need this. See commit "nir,spirv: Rework function calls" (`c11833ab24`) for more info. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Alejandro Piñeiro	d69027536c	compiler/spirv: add XFB and GeometryStreams capability check support Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:28 +02:00
Neil Roberts	1e3f61d1d5	nir/gather_info: Set info.gs.uses_streams Whenever a non-zero stream is written to it now sets uses_streams to true. This reflects the code in validate_geometry_shader_emissions for GLSL. v2: set uses_streams at gather_info instead that at spirv to nir (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-31 13:18:28 +02:00
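The rule stated in the nir/gather_info commit above amounts to a single predicate over the stream IDs a geometry shader writes to. A minimal sketch, assuming the stream IDs have already been collected into a list; `gather_uses_streams` is a hypothetical name, not a NIR function:

```python
def gather_uses_streams(emitted_stream_ids):
    """Return True as soon as any write targets a non-zero stream."""
    return any(stream != 0 for stream in emitted_stream_ids)


# A shader that only ever writes to stream 0 does not use streams.
only_default_stream = gather_uses_streams([0, 0, 0])

# A single write to stream 1 is enough to set the flag.
mixed_streams = gather_uses_streams([0, 1, 0])
```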
Neil Roberts	b0af66bb17	spirv/nir: Fix the stream ID when emitting a primitive or vertex It looks like it was previously taking the SPIR-V instruction number directly instead of looking up the constant value. v2: use vtn_constant_value helper (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-31 13:18:28 +02:00
Neil Roberts	13b8857fcf	spirv: Handle the SpvDecorationStream decoration From SPIR-V 1.0 spec, section 3.20, "Decoration": "Stream Apply to an object or a member of a structure type. Indicates the stream number to put an output on." Note the "or", so that means that it is allowed for both a full struct or a member of a struct (although the wording is not really ideal, and somewhat error-prone, imho). We found this with some Geometry Streams tests for ARB_gl_spirv, where the full gl_PerVertex is assigned Stream 0 (default value on OpenGL for gl_PerVertex). So this commit allows structs to have this Decoration, and sets the stream at the nir variable if needed. Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: squash two Decoration Stream patches (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	d480623bef	mesa/glspirv: Set last_vert_prog v2: simplify last_vert check (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	cd4a14be06	spirv: Handle XFB variable decorations These set the new explicit XFB members on nir_variable. This is needed to support ARB_gl_spirv, as Vulkan doesn't support transform feedback. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	a5ec8461f9	spirv: Handle SpvExecutionModeXfb This just sets has_transform_feedback_varyings on the shader. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	3fd5b4c7aa	nir: Add members for the explicit XFB properties to nir_variable These are copied from the corresponding values in ir_variable. The intention is to eventually use them in a pure-NIR linker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Christian Gmeiner	e1d4882d05	etnaviv: fix typo in query names Fixes: `d0bed0b494` ("etnaviv: support HI performance counters") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Chris Healy <cphealy@gmail.com>	2018-07-31 08:33:32 +02:00
Tapani Pälli	553af7a190	mesa: fix a typo (trivial) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 08:19:38 +03:00
Tapani Pälli	ce80abbb17	mesa: add glRenderbufferStorage support for EXT_texture_norm16 formats These bits were missing, found when extending the Piglit test. Fixes: `7f467d4f73` "mesa: GL_EXT_texture_norm16 extension plumbing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 08:19:10 +03:00
David Riley	f94681b6e2	egl/surfaceless: Allow DRMless fallback. Allow platform_surfaceless to use swrast even if DRM is not available. To be used to allow a fuzzer for virgl to be run on a jailed VM without hardware GL or DRM support. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Signed-off-by: David Riley <davidriley@chromium.org>	2018-07-30 19:40:45 -07:00
David Riley	b169b84be6	egl/surfaceless: Define DRI_SWRastLoader extension when using swrast. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> [chadv: Dropped spurious hunk] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-30 19:40:08 -07:00
Eric Anholt	d934492ff9	v3d: Dump the contents of all the buffers in CLIF mode. A V3D_DEBUG=clif file from a non-texturing .shader_test can now be successfully run through the CLIF runner in the simulator. Now I need to build an open source CLIF runner against the v3d DRM module.	2018-07-30 14:29:01 -07:00
Eric Anholt	99a5ac250b	v3d: Split walking the CLs to generate relocs from walking CLs to dump. We need to dump each buffer's contents in order for a CLIF file, so we need to collect all of the relocs into a buffer (such as the indirect CL full of both uniforms and GL shader states) before we start dumping.	2018-07-30 14:29:01 -07:00
Eric Anholt	2df6f1a3df	v3d: Include commands to run the BCL and RCL in CLIF dumps.	2018-07-30 14:29:01 -07:00
Eric Anholt	c6449e33e3	v3d: Use a short, underscored name for packets in CLIF/CL dumping. These will match the names that the CLIF parser expects to see. I may in the future decide to change more of the other names so that I match the names the HW/closed SW team uses for their packets, rather than the names in the spec (which only they and I can read anyway).	2018-07-30 14:29:01 -07:00
Eric Anholt	b56f8c475e	v3d: Rename "configuration" and "config" in the XML to "cfg" This matches what CLIF parsing expects, and makes TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more legible TILE_BINNING_MODE_CFG_COMMON.	2018-07-30 14:29:01 -07:00
Eric Anholt	300e609feb	v3d: s/colour/color in the XML. The CLIF format expects American English spelling, and so does the rest of Mesa. I was previously adhering to the spec's spelling, which is counterproductive.	2018-07-30 14:29:01 -07:00
Eric Anholt	3a8550ad06	v3d: Rename primitives to prims in the XML to match CLIF names. This makes us match up with the V3D HW team's names a bit more.	2018-07-30 14:29:01 -07:00
Eric Anholt	6237c64049	v3d: Print CLIF fixed-point values as just their decimal value. The parser doesn't handle float input, so we have to dump the raw value.	2018-07-30 14:29:01 -07:00
Eric Anholt	8da47b7648	v3d: When not doing terminal pretty-printing, comment struct field names. The struct field names aren't part of the CLIF ABI, just the order of fields within the struct. The comments are there for human readability.	2018-07-30 14:29:01 -07:00
Eric Anholt	103f21b13d	v3d: Add a separate flag for CLIF ABI output versus human-readable CLs. A few of the upcoming changes would make the V3D_DEBUG=cl output less readable, so let's make proper CLIF file production be under a separate V3D_DEBUG=clif flag.	2018-07-30 14:29:01 -07:00
Eric Anholt	89ac6fa403	v3d: Add pack header support for f187 values. V3D only has one of these (the top 16 bits of a float32) left in its CLs, but VC4 had many more. This gets us proper pretty-printing of the values instead of a large uint.	2018-07-30 14:29:01 -07:00
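The f187 format mentioned above is described as the top 16 bits of a float32, which leaves 1 sign bit, 8 exponent bits and 7 mantissa bits. A minimal sketch of the conversion, assuming plain truncation of the low 16 bits; `pack_f187` and `unpack_f187` are hypothetical helper names, not the generated pack-header functions:

```python
import struct


def pack_f187(value):
    """Keep the top 16 bits of the IEEE 754 float32 encoding of value."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return bits >> 16  # assumption: truncate, do not round


def unpack_f187(half):
    """Expand a 16-bit f187 value back to a float by zero-filling the low bits."""
    bits = (half & 0xFFFF) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


one = pack_f187(1.0)           # float32 1.0 is 0x3F800000
minus_two = pack_f187(-2.0)    # float32 -2.0 is 0xC0000000
round_trip = unpack_f187(one)
```

Decoding the field this way is what lets a dump show a readable float rather than a large unsigned integer.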
Eric Anholt	e146e3a795	v3d: Move depth offset packet setup to CSO creation time. This should be some simpler memcpying at draw time, and makes the next change easier.	2018-07-30 14:29:01 -07:00
Dave Airlie	9039cf70fa	r600: reduce num compute threads to 1024. I copied this value from radeonsi, but it was wrong, 1024 seems to be the correct answer from looking at gpuinfo. This should fix a few compute shader related hangs. (at least in CTS) Cc: <mesa-stable@lists.freedesktop.org> (airlied: pushed because it avoids hangs)	2018-07-31 04:55:38 +10:00
Rob Clark	0ea243dcd5	freedreno/a5xx: fix txf_ms Somehow this got lost from the initial MSAA patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-30 12:31:05 -04:00
Rhys Perry	f310e86a42	nvc0: serialize before updating some constant buffer bindings on Maxwell+ To avoid serializing, this has the user constant buffer always be 65536 bytes and enabled unless it's required that something else is used for constant buffer 0. Fixes artifacts with at least XCOM: Enemy Within, 0 A.D. and Unigine Valley, Heaven and Superposition. v2: changed uniform_buffer_bound to be bool instead of a uint32_t v3: remove magic constants v3: remove pointless code in nvc0_validate_driverconst Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100177 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-07-30 15:04:26 +01:00
Eric Anholt	0a3f653180	v3d: Block bin on render when doing vertex texturing. The kernel by default serializes the BCL on previous BCLs submitted on this FD, but not RCLs. For now this fix is conservative and blocks on last RCL if any vertex texturing is done, which fails to get bin/render overlap if there was an intermediate job that doesn't draw to the BCL's buffer. I've dropped a perf_debug() in here to note that as a potential future improvement. Fixes intermittent failures in KHR-GLES3.copy_tex_image_conversions.required.*	2018-07-29 19:25:39 -07:00
Eric Anholt	34cefa7fe0	v3d: Fix meson build without vc4.	2018-07-29 19:22:33 -07:00
Eric Anholt	27f1bfe471	vc4: Fix meson build when enabled without v3d. Reported-by: Rob Clark <robdclark@gmail.com> Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().")	2018-07-29 19:13:29 -07:00
Jason Ekstrand	05fb2f88ec	nir/instr_set: Fix nir_instrs_equal for derefs We weren't returning at the end of the nir_instr_type_deref case in nir_instrs_equal and it was falling through to the default of false. While we're at it, make the default unreachable because all statements in the switch now have their own returns. Had we done that before, we would have caught this bug a long time ago. Fixes: `19a4662a54` "nir: Add a deref instruction type" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-29 13:39:35 -07:00
Jason Ekstrand	9a4ab4c120	nir: Take if uses into account in ssa_def_components_read Fixes: `d800b7daa5` "nir: Add a helper for figuring out what..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-29 13:39:35 -07:00
Jason Ekstrand	5c1c6939ce	util/list: Make some helpers take const lists They're all just querying things about the list and not mutating anything. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-07-29 13:39:35 -07:00
Rob Clark	0ddae4acae	freedreno/a5xx: small cleanup We no longer have semi-custom clear pipe that uses 3d state. Normal clears happen via hw blitter, and everything else uses u_blitter these days. So we don't need this hack. TODO a3xx+a4xx could get same treatment. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 14:00:06 -04:00
Rob Clark	3932db0f7e	freedreno/a5xx: remove unused prototype Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 13:50:19 -04:00
Rob Clark	104a49f166	freedreno: fix caps harder Fixes: `868ca81c` and `f485e567` Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 13:48:22 -04:00
Karol Herbst	bc0e0c2818	nir/lower_int64: mark all metadata as dirty v2: use nir_metadata_preserve preserve metadata in case of !progress Fixes: `074f5ba0b5` "nir: Add a simple int64 lowering pass" Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-28 19:59:28 +02:00
Mauro Rossi	0ca153f869	android: radv: enable build of vulkan.radv HAL module src/amd/Android.mk requires to include src/amd/vulkan/Android.mk to enable the build of vulkan.radv module Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:40:14 +02:00
Mauro Rossi	212af3c9ea	android: radv: add Android.mk for vulkan.radv HAL module radv implements the Android Vulkan HAL interface, this patch adds Android.mk building rules by porting of radv automake rules. vendor HAL module is installed as /vendor/lib/hw/vulkan.radv.so Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:40:07 +02:00
Mauro Rossi	1eb65c51ad	radv: generate entrypoints for VK_ANDROID_native_buffer Patch changes radv entrypoints generator to not skip this extension even though it is set as disabled in the vk.xml Reference: `63525ba730` ("android: enable VK_ANDROID_native_buffer") Fixes: `69f447553c` ("vulkan: Drop vk_android_native_buffer.xml") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:39:57 +02:00
Mauro Rossi	c67b36c8a1	radv: move vk_format_table.c to generated sources Android build system will try to compile vk_format_table.c as a shipped source, but at compile time it will be missing, we move it to generated source, where it belongs Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:39:49 +02:00
Brian Paul	b4bda6e066	xlib: fix build break from _swrast_map_soft_renderbuffer() call We need to pass the new flip_y argument. Reviewed-by: Clayton Craft <clayton.a.craft@intel.com>	2018-07-27 21:21:24 -06:00
Brian Paul	90b189e5d2	swrast: fix crash in AA line code when there's no texture Fixes a crash running the Piglit polygon-mode-facing test (and probably others). Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	ce0f42dfe4	mesa: add switch case for GL 2.1 in _mesa_compute_version() The xlib/swrast driver only supports GL 2.1. This patch fixes a crash if the app calls glGetString(GL_SHADING_LANGUAGE_VERSION). Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	4f51e8880d	tgsi: whitespace fixes in tgsi_ureg.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	f02243541d	gallium/util: whitespace fixes in u_inlines.h Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	4216a1d0a8	svga: whitespace fixes in svga_tgsi_decl_sm30.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	2f1af8549d	mesa: replace tabs with spaces in mipmap.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	f39840f866	gallium/util: whitespace fixes in u_debug_memory.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	2261d6a403	mesa: whitespace clean-up in texstore.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	a67b629193	mesa: move var decls in texstore_rgba() Move them closer to where they're first used. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	5e2582b381	mesa: remove unneeded free() call in texstore_rgba() The pointer will always be NULL since that's what we just tested for. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-07-27 21:21:24 -06:00
Eric Anholt	942456f646	v3d: Skip printing sub-id or pad fields in CLIF dumping. The parser doesn't expect them, so our fields would end up mismatched. They're not really useful in console output, either.	2018-07-27 18:00:48 -07:00
Eric Anholt	3ee0ab599e	v3d: Emit commands to switch CLIF parser to CL/shader/attr input mode. By default after saying you are emitting a buffer, it'll expect a buffer size. Once you set a format, it'll keep parsing that format until you announce something else.	2018-07-27 18:00:46 -07:00
Eric Anholt	a57770aa37	v3d: Dump fields in CLIF output in increasing offset order. Previously, we emitted in XML order, which I happen to type in the decreasing offset order of the specifications. However, the CLIF parser wants increasing offsets.	2018-07-27 17:56:55 -07:00
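The ordering change described above is a sort on the field offset. A minimal sketch, assuming each field is a (name, start offset) pair; the field names and offsets below are made up for illustration and do not come from the V3D XML:

```python
def clif_field_order(fields):
    """Return fields sorted by increasing start offset, as the CLIF parser wants."""
    return sorted(fields, key=lambda field: field[1])


# Hypothetical packet fields, listed in decreasing offset order as in the XML.
fields_in_xml_order = [("field_c", 16), ("field_b", 8), ("field_a", 0)]

fields_for_clif = clif_field_order(fields_in_xml_order)
```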
Eric Anholt	95bafeeabf	v3d: Print addresses in CLIFs as references to buffers. With CLIFs, the parser will choose an address for the buffer being created, so we need to use effectively relocations to buffers instead of the addresses that the driver uses. This is also a whole lot more intelligible for console output than raw addresses!	2018-07-27 17:56:36 -07:00
Eric Anholt	3c02838d29	v3d: Stop doing pretty-printed colorful booleans in CLIF output. The parser wants to see a 1 or 0. We can put "true" and "false" in a comment to clarify that it's a boolean and the parser will skip it.	2018-07-27 17:55:57 -07:00
Eric Anholt	422910d2e7	v3d: Move clif dumping to a separate step from noting where the CLs are. Now all the printing happens from the same worklist processing.	2018-07-27 17:08:35 -07:00
Eric Anholt	01b4952773	v3d: Move clif dump BO lookup into the clif dumper. The clif dumper is going to need information about all of our BOs if we're going to dump them for replay purposes.	2018-07-27 17:08:35 -07:00
Eric Anholt	e92959c4e0	v3d: Pass the whole clif_dump structure to v3d_print_group(). To generate CLIF files that the v3dv3 simulator can parse, we're going to need to decode addresses, and for that we'll need the vaddr lookup function from the clif structure from within v3d_decoder.	2018-07-27 17:08:35 -07:00
Timothy Arceri	77207e5380	ac: pass write param to get_sampler_desc() from get_image_descriptor() Looks like a mistake from when the deref stuff landed. Fixes: `506a07e4e3` ("ac/nir: Add deref support to image intrinsics.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-28 08:57:03 +10:00
Marek Olšák	d89a123dfd	gallium/u_vbuf: split u_vbuf_get_minmax_index function (v2) This will be used by indirect multidraws. v2: clean up the function further, change return types to unsigned Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2018-07-27 17:50:40 -04:00
Alexander von Gluck IV	da8de6b757	gallium/auxiliary: Extern "c" fixes. Used by C++ code such as Haiku's renderer. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-27 16:19:12 -05:00
Marek Olšák	5fe943aaee	gallium/noop: implement invalidate_resource	2018-07-27 16:31:56 -04:00
Dave Airlie	5040319331	radv: fix cdw check vs tracing emit If we have tracing enabled we could do all the tracing emits and overflow the precalculated cdw_max. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-28 06:20:27 +10:00
Dave Airlie	b88468f15c	radv: return binary code_size not variant code size to cache The code sizes returned here get passed to the cache shader insert function, which then memcpy from the code ptr, and causes all sorts of valgrind errors like: ==6755== Invalid read of size 8 ==6755== at 0x4C32FEE: memcpy@GLIBC_2.2.5 (vg_replace_strmem.c:1021) ==6755== by 0x2305D4C7: radv_pipeline_cache_insert_shaders (radv_pipeline_cache.c:416) ==6755== by 0x2305791D: radv_create_shaders (radv_pipeline.c:2158) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) ==6755== by 0x230188AB: radv_device_init_meta_blit_color (radv_meta_blit.c:871) ==6755== by 0x2301D50E: radv_device_init_meta_blit_state (radv_meta_blit.c:1278) ==6755== by 0x23011893: radv_device_init_meta (radv_meta.c:352) ==6755== by 0x2300744B: radv_CreateDevice (radv_device.c:1576) ==6755== by 0x5187D0F: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x518F6A3: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x5192A42: vkCreateDevice (in /usr/lib64/libvulkan.so.1.1.77) ==6755== Address 0x22a58548 is 4 bytes after a block of size 116 alloc'd ==6755== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299) ==6755== by 0x23089DC4: ac_elf_read (ac_binary.c:144) ==6755== by 0x23090A60: ac_compile_module_to_binary (ac_llvm_helper.cpp:162) ==6755== by 0x23053F06: compile_to_memory_buffer (radv_llvm_helper.cpp:58) ==6755== by 0x23053F06: radv_compile_to_binary (radv_llvm_helper.cpp:98) ==6755== by 0x23052769: ac_llvm_compile (radv_nir_to_llvm.c:3394) ==6755== by 0x23052823: ac_compile_llvm_module (radv_nir_to_llvm.c:3418) ==6755== by 0x23053C05: radv_compile_nir_shader (radv_nir_to_llvm.c:3542) ==6755== by 0x23061B4E: shader_variant_create (radv_shader.c:580) ==6755== by 0x23061CFD: radv_shader_variant_create (radv_shader.c:634) ==6755== by 0x23057765: radv_create_shaders (radv_pipeline.c:2123) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) Since we are just inserting the code into the cache, we can avoid these bad reads and data in the cache by just using the binary code size here. Fixes: `939e5a382` (radv: add padding for the UMR disassembler) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-28 06:20:20 +10:00
Eric Anholt	22a1ba0403	v3d: Drop the use of the semaphores. The kernel's scheduler doesn't rely on our emitting them, and in fact we'd get in trouble if the kernel decided to schedule too many bins in a row before getting around to scheduling the corresponding render.	2018-07-27 12:56:36 -07:00
Eric Anholt	9bf9a6d6a1	v3d: Drop the VG support from the XML. This reflects a change on the HW/closed SW side to drop this unused HW. With it dropped on their side, the CLIF parser no longer expects to find VG fields.	2018-07-27 12:56:36 -07:00
Eric Anholt	5a1cc3861c	v3d: Use /* */ instead of () for enum names in CLIF output. This lets the comments be ignored by the CLIF parser.	2018-07-27 12:56:36 -07:00
Eric Anholt	95a0f99825	v3d: CLIF-dump the "Vec size" field as 0 == maximum value. That's what a user should want to see, and what the CLIF parser wants. This should maybe be generalized.	2018-07-27 12:56:36 -07:00
Eric Anholt	1c8e4632a7	v3d: Stop using spaces in the names of our buffers. For CLIF dumping, we need names to not have spaces. Rather than rewriting them after the fact, just change the two cases where I had put a space in.	2018-07-27 12:56:36 -07:00
Fritz Koenig	ab05dd183c	i965: implement GL_MESA_framebuffer_flip_y [v3] Instead of using _mesa_is_winsys_fbo or _mesa_is_user_fbo to infer if an fbo is flipped use the FlipY flag. v2: * additional window-system framebuffer checks [for jason] v3: * s/inverted_y/flip_y/g [for chadv] * s/InvertedY/FlipY/g [for chadv] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-27 12:33:32 -07:00
Fritz Koenig	318c265160	mesa: GL_MESA_framebuffer_flip_y extension [v4] Adds an extension to glFramebufferParameteri that will specify if the framebuffer is vertically flipped. Historically system framebuffers are vertically flipped and user framebuffers are not. Checking to see the state was done by looking at the name field. This adds an explicit field. v2: * updated spec language [for chadv] * correctly specifying ES 3.1 [for chadv] * refactor access to rb->Name [for jason] * handle GetFramebufferParameteriv [for chadv] v3: * correct _mesa_GetMultisamplefv [for kusmabite] v4: * update spec language [for chadv] * s/GLboolean/bool/g [for chadv] * s/InvertedY/FlipY/g [for chadv] * s/inverted_y/flip_y/g [for chadv] * assert changes [for chadv] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-27 12:32:25 -07:00
Chad Versace	7953399e59	gallium/auxiliary: Fix Autotools on Android (v2) Problem 1: u_debug_stack_android.cpp transitively included "pipe/p_compiler.h", but src/gallium/include was missing from the C++ include path. Problem 2: Add -std=c++11 to AM_CXXFLAGS. Android's libbacktrace headers require C++11, but the Android toolchain (at least in the Chrome OS SDK) does not enable C++11 by default. v2: Add -std=c++11. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Cc: Eric Engestrom <eric.engestrom@intel.com>	2018-07-27 11:35:56 -07:00
Topi Pohjolainen	a5889d70f2	i965/icl: Disable binding table prefetching Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests to disable prefetching of binding tables for ICLLP A0 and B0 steppings. It fixes multiple gpu hangs in ext_framebuffer_multisample* tests on ICLLP B0 h/w. Anuj: Add comments and commit message. Add gen 11 checks in the code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-27 11:05:04 -07:00
Caio Marcelo de Oliveira Filho	1d71981b27	glsl: use only copy_propagation_elements Now that the elements version handles both cases, remove the non-elements version. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-27 10:51:25 -07:00
Caio Marcelo de Oliveira Filho	134b5a7047	glsl: teach copy_propagation_elements to deal with whole variables Keep information in acp_entry whether the entry is full or not, and use the ACP in more nodes when visiting the instructions: - add_copy: write whole variables to the ACP state (regardless of the type). - visit(ir_dereference_variable ): perform the propagation here if we have a full candidate. Element-wise here doesn't apply because the mask isn't available at this point. - visit_leave(ir_assignment ): process beyond scalar and vector, as the full variables might have other types. Also import an improvement from opt_copy_propagation.cpp: if ir_call is an intrinsic, we know the variables affected, so keep going. v2: (all from Eric Anholt) Describe how acp_entry attributes are used. Don't do book-keeping to avoid adding repeated elements to the dsts in write_elements(). v3: Use _mesa_set_remove_key. (Thomas Helland) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-27 10:51:25 -07:00
vadym.shovkoplias	399228ecad	i965: Disable guardband clipping on SandyBridge for odd dimensions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104388 Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-27 10:07:44 -07:00
Dylan Baker	665fc9cf55	docs: Update release calendar, add news item, and add release notes for 18.1.5	2018-07-27 07:08:59 -07:00
Dylan Baker	2b7b5d3100	docs: Add sha-256 sums for 18.1.5	2018-07-27 07:06:55 -07:00
Dylan Baker	5cc4ee3e17	docs: add 18.1.5 release notes	2018-07-27 07:06:53 -07:00
Iago Toral Quiroga	615aaedb93	intel/compiler: fix lower conversions to account for predication The pass can create a temporary result for the instruction and then moves from it to the original destination, however, if the original instruction was predicated, the mov has to be predicated as well. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-27 14:48:29 +02:00
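The predication fix above can be pictured with a toy instruction model. This is a sketch under stated assumptions, not the i965 pass: instructions are plain dictionaries, `lower_conversion` is a hypothetical name, and the predicate label `f0.0` is illustrative. Whether the original instruction keeps its own predicate is not stated in the message, so the sketch leaves it untouched.

```python
def lower_conversion(inst, temp="tmp"):
    """Split inst into a write to a temporary plus a mov to the real destination.

    The mov inherits the predicate of the original instruction, so a predicated
    conversion never writes its destination unconditionally.
    """
    lowered = dict(inst, dst=temp)
    mov = {
        "op": "mov",
        "dst": inst["dst"],
        "src": temp,
        "predicate": inst.get("predicate"),
    }
    return [lowered, mov]


predicated = {"op": "convert", "dst": "r0", "src": "r1", "predicate": "f0.0"}
plain = {"op": "convert", "dst": "r2", "src": "r3", "predicate": None}

predicated_seq = lower_conversion(predicated)
plain_seq = lower_conversion(plain)
```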
Samuel Pitoiset	df679b1643	radv: allocate enough space in radv_cmd_buffer_after_draw() The driver might emit up to 4 dwords when RADV_TRACE_FILE is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:29 +02:00
Samuel Pitoiset	c08ae911d9	radv: check CS space in radv_emit_write_data_packet() This wasn't wrong but it looks better to me like this. It's only used for debugging purposes (ie. RADV_TRACE_FILE). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:27 +02:00
Samuel Pitoiset	434630f57c	radv: do not emit pipeline stats flushes on compute queue Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:26 +02:00
Samuel Pitoiset	c118c8938c	radv: reduce CB/DB meta flushes in radv_dst_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:24 +02:00
Kenneth Graunke	0c4e0471f5	radv: Fix build I renamed this pass and forgot to update radv. Fixes: `488972222c` ("i965: Combine both gl_PatchVerticesIn lowering passes.")	2018-07-26 23:57:13 -07:00
Kenneth Graunke	488972222c	i965: Combine both gl_PatchVerticesIn lowering passes. Until now, we had separate passes for lowering gl_PatchVerticesIn to a statically known constant (for TES inputs when linked against a TCS), and a uniform in the other cases. Annoyingly, one had to be run before nir_lower_system_values, and the other afterward. This simplified the passes, but made life painful for the callers. This patch combines both into a single pass. If you give it a non-zero static count, it uses that. If you give it Mesa state slots, it turns it back into a built-in uniform. Otherwise, it does nothing. This also moves the i965 uniform lowering out to shared code. v2: Make token arrays const. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-26 21:51:36 -07:00
Sagar Ghuge	29dd5dda9d	i965: Expose EXT_base_instance extension in OpenGLES 3.0 The extension requires at least OpenGL 3.0 and OpenGL ES 3.0. Fixes two ext_base_instance tests: arb_base_instance-baseinstance-doesnt-affect-gl-instance-id_gles3 arb_base_instance-drawarrays_gles3 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-26 17:25:35 -07:00
Bas Nieuwenhuizen	3665f66ef2	radv: Add support for ETC2 textures. Was surprised that is even supported by Vega. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-27 01:31:32 +02:00
Jan Vesely	1e8b8e0878	clover: Reduce wait_count in abort path. Trigger waiter condition variable. Passes 'events' CTS on carrizo and turks. v2: reduce to 0 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-26 15:38:22 -04:00
Jan Vesely	c2942141ae	clover: Don't extend illegal integer types. It's OK to pass them in memory, which is what kernel invocation needs. Fixes regressions since llvm r337535 ("Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"): scalar-arithmetic-char scalar-arithmetic-uchar scalar-arithmetic-short scalar-arithmetic-ushort scalar-comparison-char scalar-comparison-uchar scalar-comparison-short scalar-comparison-ushort Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-26 15:38:22 -04:00
Kenneth Graunke	8794fe3e30	intel/compiler: Delete dead VS intrinsic handling. These are lowered by brw_nir_lower_vs_inputs(). If they weren't, we would have already hit the unreachable() in emit_system_values_block(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-26 11:45:34 -07:00
Eric Anholt	deecc1ef86	v3d: Avoid the GFXH-1461 workaround if we have only Z or only S. This seems like a sensible precaution to avoid extra draws. It doesn't deal with the case of a Z24S8 buffer created by the window system for an application that happens to never use S.	2018-07-26 11:02:25 -07:00
Eric Anholt	301c32caf4	v3d: Rework the ordering of how we clear things. First, figure out if we can just sneak the clear into the TLB clear, even if drawing has already happened (since we have job->load and job->clear to tell us), taking into account GFXH-1461. For any pieces we can't TLB clear, fall back to drawing a quad without flushing the scene. Fixes extra scene flushes in glmark2 due to GFXH-1461.	2018-07-26 11:02:25 -07:00
Eric Anholt	ceecddfe77	v3d: Only store buffers that have been written to. I've seen cases where a color buffer is bound, but only Z is written, and we end up storing color.	2018-07-26 11:02:25 -07:00
Eric Anholt	d29435e7cb	v3d: Track the buffers being loaded separately. We were computing this at RCL generation time, but that means you can't unflag the store for an invalidate_resource, or not flag the store if writemasking is disabled.	2018-07-26 11:02:20 -07:00
Eric Anholt	47f5d158ae	v3d: Rename cleared/resolve to clear/store. These describe what the fields mean in RCL generation. "resolve" is left over from VC4, and sounds like MSAA resolves (which may or may not be involved in the store we generate).	2018-07-26 11:00:34 -07:00
Eric Anholt	d934d3206e	nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform. This is controlled by a new nir_shader_compiler_options flag, and fixes dEQP-GLES3.functional.shaders.builtin_variable.pointcoord on V3D. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-26 11:00:34 -07:00
Rhys Perry	b5a56a11da	docs: fix incorrect placement of the ARB_sample_locations release notes Seems something went wrong somehow when it was pushed. v2: combine into one list Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-26 11:49:23 +01:00
Eric Engestrom	2cc1849afb	anv: drop unused local vars Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-26 10:21:03 +01:00
Eric Engestrom	2a4191bb38	anv: remove incorrect `UNUSED` flag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-26 10:06:11 +01:00
Erik Faye-Lund	e68fe445f5	gallium: initialize ureg_dst::Invariant bit When this bit was added, it seems some initialization code was omitted by mistake. Since stack-variables have kinda random contents, and we don't zero initialize the whole struct in these code-paths, we end up getting random-ish values for this bit. Spotted by Coverity in the following CIDs: - 1438115 - 1438123 - 1438130 Fixes: `70425bcfe6` ("gallium: plumb invariant output attrib thru TGSI") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-26 09:01:33 +02:00
Samuel Pitoiset	ff0d553818	radv: fix adjusting vertex fetches since 16bit support Move the integer conversion after the fixup. This fixes some regressions with dEQP-VK.pipeline.vertex_input.single_attribute.mat4.as_a2r10g10b10* Fixes: `b722b29f10` ("radv: add support for 16bit input/output") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-26 08:57:43 +02:00
Samuel Pitoiset	6465bf0015	nir: remove wrong assertion in print_var_decl() This breaks printing input/output variables with more than 4 components like mat4. Fixes: `1beef89ad8` ("nir: prepare for bumping up max components to 16") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-26 08:57:38 +02:00
Marek Olšák	ce8e6b970b	ac: fix typo DSL_SEL -> DST_SEL	2018-07-26 01:45:47 -04:00
Marek Olšák	7039d9299e	radeonsi: update a comment about cache behavior	2018-07-26 01:45:47 -04:00
Kenneth Graunke	37c3efca29	intel: Make the decoder just store addresses for bases, not buffers. The various base addresses are simply addresses. There may or may not be a buffer located at those addresses. So, it doesn't make much sense to request one. Just save the raw address so we can add it later, when asking about BOs at the final <base + offset> address. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:54 -07:00
Kenneth Graunke	933223db3c	intel: Make the decoder handle STATE_BASE_ADDRESS not being a buffer. Normally, i965 programs STATE_BASE_ADDRESS every batch, and puts all state for a given base in a single buffer. I'm working on a prototype which emits STATE_BASE_ADDRESS only once at startup, where each base address is a fixed 4GB region of the PPGTT. State may live in many buffers in that 4GB region, even if there isn't a buffer located at the actual base address itself. To handle this, we need to save the STATE_BASE_ADDRESS values across multiple batches, rather than assuming we'll see the command each time. Then, each time we see a pointer, we need to ask the driver for the BO map for that data. (We can't just use the map for the base address, as state may be in multiple buffers, and there may not even be a buffer at the base address to map.) v2: Fix things caught in review by Lionel: - Drop bogus bind_bo.size check. - Drop "get the BOs again" code - we just get the BOs as needed - Add a message about interface descriptor data being unavailable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:47 -07:00
Eric Engestrom	aa59f9c8bc	anv: don't crash on vkDestroyDevice(NULL) CovID: 1438132 Fixes: `a99c9e63a0` "anv: finish the binding_table_pool on destroyDevice when use_softpin" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-25 21:04:30 +01:00
Eric Engestrom	270a44040c	vulkan/wsi: fix incorrect assignment in assert() CovID: 1438113, 1438118, 1438119, 1438121 Fixes: `dc1d10b396` "anv,radv: Add support for VK_KHR_get_display_properties2" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-25 20:55:35 +01:00
Eric Engestrom	bbf8316fcb	anv: fix python whitespace warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	e0347581f3	anv: cleanup python imports Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	ce7348507e	anv: remove unnecessary semicolons in python Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Kenneth Graunke	a2c63cae14	st/nir: Fix st_nir_opts() prototype. This wasn't updated for the new scalar ISA parameter. It worked anyway because all the function's callers live in the same file, so it found the correct function. Tim made this external for the new st prog_to_nir translator, which got reverted, but which I'd like to land eventually. So, fix the prototype. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-07-25 10:19:41 -07:00
Lionel Landwerlin	b21b38c46c	intel: tools: dump: only store device id on success We might fail on master node drm fd because we won't have the right permissions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-25 16:53:06 +01:00
Gert Wollny	82fc6bdebf	r600: Scale integer valued texture border colors to float (v2) It seems the hardware always expects floating point border color values [0,1] for unsigned, and [-1,1] for signed texture component, regardless of pixel type, but the border colors are passed according to texture component type. Hence, before submitting the border color, convert and scale it to these ranges accordingly. This doesn't seem to work for textures with 32 bit integer components though, here, it seems that the border color is always set to zero, regardless of the BORDER_COLOR_TYPE state set in Q_TEX_SAMPLER_WORD0_0. v2: Simplify logic as suggested by Roland Scheidegger Fixes: dEQP-GLES31.functional.texture.border_clamp.formats.compressed* dEQP-GLES31.functional.texture.border_clamp.formats.r* (non 32 bit integer) dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d* and a number of piglits out of piglit run gpu -t texture -t gather -t formats Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-25 08:58:33 +02:00
Jason Ekstrand	b3b170ade9	nir: Add a couple of iand/ior optimizations Spotted in a shader in Batman: Arkham City. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-24 20:39:43 -07:00
Jordan Justen	2b3064c073	i965, anv: Use INTEL_DEBUG for disk_cache driver flags Since various options within INTEL_DEBUG could impact code generation, we need to set the disk cache driver_flags parameter based on the INTEL_DEBUG flags in use. An example that will affect the program generated by i965 is the INTEL_DEBUG=nocompact option. The DEBUG_DISK_CACHE_MASK value is added to mask the settings of INTEL_DEBUG that can affect program generation. v2: * Use driver_flags (Tim) * Also update Anvil (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:28 -07:00
Jordan Justen	69a686b0ae	i965, anv: Add extra unused character in disk_cache renderer temp string This extra character should not be used by snprintf, but we make it available to verify that we printed the exact number we wanted, and didn't overflow. v2: * Also update Anvil Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:25 -07:00
Marek Olšák	7d2e6edd89	mesa: allow indirect draws with the default VAO and compatibility profile Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-24 16:00:09 -04:00
Danylo Piliaiev	49ed075615	mesa: Fix copy-paste error in ConservativeRasterDilateRange initialization Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `4580617509` ("mesa: add support for nvidia conservative rasterization extensions") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 20:44:34 +01:00
Jason Ekstrand	f214baf72f	nir/serialize: Alloc constants off the variable nir_sweep assumes that constants are always allocated off the variable to which they belong. Violating this assumption causes them to get freed early and leads to use-after-free bugs. Fixes: `120da00975` "nir: add serialization and deserialization" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-07-24 12:34:07 -07:00
Karol Herbst	7f95564a22	nir: rename f2f16_undef to f2f16 we need rounding modes on other conversions involving floats and it is easier to rename f2f16_undef than renaming all the other ones. v2: rebased on master Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Karol Herbst	2083cfb6eb	nir: add builtin builder also move some of the GLSL builtins over we will need for implementing some OpenCL builtins v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code fix up changes caused by swizzle rework Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Rob Clark	9e90708d5d	nir/spirv: import OpenCL.std.h Lightly edited to be valid 'C' code. Is there a bug open to fix this upstream? Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Marek Olšák	98ab24fdab	radeonsi: handle SI_FORCE_FAMILY early before LLVM target machines are created	2018-07-24 14:21:29 -04:00
Mathieu Bridon	9ebd8372b9	python: Use range() instead of xrange() Python 2 has a range() function which returns a list, and an xrange() one which returns an iterator. Python 3 lost the function returning a list, and renamed the function returning an iterator as range(). As a result, using range() makes the scripts compatible with both Python versions 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	022d2a381d	python: Better use iterators In Python 2, iterators had a .next() method. In Python 3, instead they have a .__next__() method, which is automatically called by the next() builtin. In addition, it is better to use the iter() builtin to create an iterator, rather than calling its __iter__() method. These were also introduced in Python 2.6, so using them makes the script compatible with Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	01da2feb0e	python: Better sort dictionary keys/values In Python 2, dict.keys() and dict.values() both return a list, which can be sorted in two ways: * l.sort() modifies the list in-place; * sorted(l) returns a new, sorted list; In Python 3, dict.keys() and dict.values() do not return lists any more, but iterators. Iterators do not have a .sort() method. This commit moves the build scripts to using sorted() on dict keys and values, which makes them compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	5530cb1296	python: Better iterate over dictionaries In Python 2, dictionaries have 2 sets of methods to iterate over their keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems(). The former return lists while the latter return iterators. Python 3 dropped the methods which return lists, and renamed the methods returning iterators to keys()/values()/items(). Using those names makes the scripts compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	fdf946ffbf	python: Stop using the string module Most functions in the builtin string module also exist as methods of string objects. Since the functions were removed from the string module in Python 3, using the instance methods directly makes the code compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	1d209275c2	python: Better check for keys in dicts Python 3 lost the dict.has_key() method. Instead it requires using the "in" operator. This is also compatible with Python 2. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Kenneth Graunke	9b34742495	intel: Make the disassembler take a const pointer to the assembly. Disassembling doesn't modify the assembly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 11:04:56 -07:00
Andres Gomez	3647b16675	travis: manually generate sys/syscall.h Until now, the needed bits were wrongly included in linux/memfd.h Since Travis' sys/syscall.h doesn't provide the SYS_memfd_create, we generate that header manually, including the needed bits to avoid compilation problems, as the ones observed after: `3228335b55` ("intel: aubinator: handle GGTT mappings") v2: replace fixes commit with the first direct user of syscall.h (Emil). Fixes: `3228335b55` ("intel: aubinator: handle GGTT mappings") Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-24 19:52:11 +03:00
Andres Gomez	7665a05a3a	docs: update calendar to match the 18.2 plan with the one announced Additionally, I've extended the 18.1 cycle by one more release, tentatively assigned to Dylan, due to the ~2 weeks delay for 18.2. Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:49:08 +03:00
Andres Gomez	1391892e73	docs: move releases from Fridays to Wednesdays As discussed at: https://lists.freedesktop.org/archives/mesa-dev/2018-March/188525.html Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Carl Worth <cworth@cworth.org> Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:48:01 +03:00
Andres Gomez	b0e49a9e7a	docs: correct typo in the submitting patches instructions Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:47:40 +03:00
Bas Nieuwenhuizen	28b8c18d84	radv: Still enable inmemory & API level caching if disk cache is not enabled. That we don't have a background disk cache does not mean we should prevent the app caching anything. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-24 18:06:41 +02:00
Jose Fonseca	04d77d53aa	gallium/tests: Don't ignore S3TC errors. Now we do full S3TC decompression they should no longer fail. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-24 15:58:14 +01:00
Harish Krupo	fd734608c3	egl: Fix missing clamping in eglSetDamageRegionKHR Clamp the x and y co-ordinates of the rectangles. v2: Clamp width/height after converting to co-ordinates (Ilia Mirkin) Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-24 14:46:21 +01:00
Erik Faye-Lund	c3eaf8fe57	forward precise-flag if supported New versions of virglrenderer support the precise-flag, so let's forward it from TGSI if that's the case. This fixes a few dEQP-GLES31 tests: - dEQP-GLES31.functional.tessellation.common_edge.quads_equal_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_even_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_odd_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_equal_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_even_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_odd_spacing_precise Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-24 10:27:27 +02:00
Marek Olšák	6853862a58	radeonsi: fix pk2h breakage	2018-07-23 22:29:59 -04:00
Marek Olšák	86b52d4236	radeonsi: reduce LDS stalls by 40% for tessellation 40% is the decrease in the LGKM counter (which includes SMEM too) for the GFX9 LSHS stage. This will make the LDS size slightly larger, but I wasn't able to increase the patch stride without corruption, so I'm increasing the vertex stride.	2018-07-23 20:23:52 -04:00
Tom Stellard	0866edede0	radeonsi: Add debug option to enable LLVM GlobalISel (v2) R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than SelectionDAG for instruction selection. v2: mareko: move the helper to src/amd/common Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <tstellar@redhat.com>	2018-07-23 20:23:48 -04:00
Jason Ekstrand	820d5e51b7	intel/compiler: Account for built-in uniforms in analyze_ubo_ranges The original pass only looked for load_uniform intrinsics but there are a number of other places that could end up loading a push constant. One obvious omission was images which always implicitly use a push constant. Legacy VS clip planes also get pushed into the shader. This fixes some new Vulkan CTS tests that test random combinations of bindings and, in particular, test lots of UBOs and images together. Cc: mesa-stable@lists.freedesktop.org Cc: Kenneth Graunke <kenneth@whitecape.org>	2018-07-23 15:28:17 -07:00
Daniel Schürmann	62024fa775	radv: enable VK_KHR_16bit_storage extension / 16bit storage features Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:26 +02:00
Daniel Schürmann	4d0b02bb5a	ac: add support for 16bit load_push_constant Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	b722b29f10	radv: add support for 16bit input/output Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	87989339a0	nir: add 16bit type information to glsl types Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	7e7ee82698	ac: add support for 16bit buffer loads v2: Fixed dvec3 loads (bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	a6a21e651d	ac: add support for 16bit UBO loads Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	3109c5257b	ac: add support for 16bit ssbo stores Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	f582367d49	ac: add 16bit conversion operations Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Dave Airlie	d73f1026b4	r600: enable tess_input_info for TES There might be a nicer way to do this, but this is at least correct. This fixes: KHR-GL44.tessellation_shader.single.max_patch_vertices KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2018-07-23 21:11:35 +01:00
Dave Airlie	760622c328	docs/features: fix virgl gles3.1 entries	2018-07-24 06:10:46 +10:00
Roland Scheidegger	09828feab0	draw: force draw pipeline if there's more than 65535 vertices The pt emit path can only handle 65535 - the number of vertices is truncated to a ushort, resulting in a too small buffer allocation, which will crash. Forcing the pipeline path looks suboptimal, then again this bug is probably there ever since GS is supported, so it seems it's not happening often. (Note that the vertex_id in the vertex header is 16 bit too, however this is only used by the draw pipeline, and it denotes the emit vertex nr, and that uses vbuf code, which will only emit smaller chunks, so should be fine I think.) Other solutions would be to simply allow 32bit counts for vertex allocation, however 65535 is already larger than this was intended for (the idea being it should be more cache friendly). Or could try to teach the pt emit path to split the emit in smaller chunks (only the non-index path can be affected, since gs output is always linear), but it's a bit tricky (we don't know the primitive boundaries up-front). Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=107295 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-07-23 22:07:07 +02:00
Dave Airlie	51f67eeb21	docs/features: note ARB_copy_image is working on virgl	2018-07-24 06:06:15 +10:00
Dave Airlie	83332618c1	Revert "virgl: remove unused stride-arguments" This reverts commit `dc938b8398`. This adds warnings in vtest, and possibly breaks it.	2018-07-24 06:03:20 +10:00
Dave Airlie	69c2cd0b14	docs/features: note ssbo and atomic counters done for virgl	2018-07-24 05:56:35 +10:00
Dave Airlie	958b57ac82	virgl: add initial shader_storage_buffer_object support. (v2) This adds the guest side support for ARB_shader_storage_buffer_object. Co-authors: Gurchetan Singh <gurchetansingh@chromium.org> v2: move to using separate maximums (fixup macros) Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-07-24 05:54:21 +10:00
Jason Ekstrand	e4d346c86d	nir: Add a couple trivial abs optimizations Spotted in a shader in Batman: Arkham City. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-23 10:48:21 -07:00
Caio Marcelo de Oliveira Filho	52d831ff83	glsl: remove delegating constructors to allow build with C++98 Delegating constructors is a C++11 feature, so this was breaking when compiling with C++98. Change the copy_propagation_state() calls that used the convenience constructor to use a static member function instead. Since copy_propagation_state is expected to be heap allocated, this change is a good fit. Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305	2018-07-23 10:34:43 -07:00
Eric Anholt	6b73a97f84	v3d: Implement a small immediates optimization, based on VC4's. We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions). total instructions in shared programs: 90768 -> 88220 (-2.81%) instructions in affected programs: 82711 -> 80163 (-3.08%)	2018-07-23 10:21:43 -07:00
Eric Anholt	79e0f042bc	v3d: Return an invalid src number if asked for a missing implicit uniform. Sometimes when iterating over sources, we might want to check if it's the implicit one. We wouldn't want to match on a non-implicit src using this function.	2018-07-23 10:21:43 -07:00
Eric Anholt	f2ea936f48	v3d: Skip emitting texture config parameter 2 if it's just the defaults. shader-db: total instructions in shared programs: 91275 -> 90768 (-0.56%) instructions in affected programs: 20702 -> 20195 (-2.45%)	2018-07-23 10:21:43 -07:00
Eric Anholt	421e99d777	v3d: Update an XXX comment for a path we handled in HW on V3D 4.x.	2018-07-23 10:21:43 -07:00
Eric Anholt	e7ae900341	v3d: Switch to using the new SFU instructions on V3D 4.x. These instructions let us write directly to the phys regfile, instead of just R4. That lets us avoid moving out of R4 to avoid conflicting with other SFU results, and to avoid conflicting with thread switches. There is still an extra instruction of latency, which is not represented in the scheduler at the moment. If you use the result before it's ready, the QPU will just stall, unlike the magic R4 mode where you'd read the previous value. That means that the following shader-db results aren't quite representative (since we now cause some stalls instead of emitting nops), but they're impressive enough that I'm happy with the change. total instructions in shared programs: 95669 -> 91275 (-4.59%) instructions in affected programs: 82590 -> 78196 (-5.32%)	2018-07-23 10:21:43 -07:00
Eric Anholt	58c1d3860f	v3d: Add QPU pack/unpack for the new SFU instructions. These instructions allow writing the result to any register, instead of a special writeback to r4.	2018-07-23 10:21:43 -07:00
Eric Anholt	cdfa99657d	v3d: Fix the name of the "flpop" operation. Noticed while trying to sort a new op into the appropriate place to match the documentation.	2018-07-23 10:21:43 -07:00
Eric Anholt	91e24e5718	v3d: Print the instruction we're testing in the QPU disasm/pack round-trip. If we fail initial disassembly, it's good to know what instruction it was that failed.	2018-07-23 10:21:42 -07:00
Eric Anholt	a1beb333d8	v3d: Drop unused vir_SAT() operation. We lower saturates in NIR.	2018-07-23 10:21:42 -07:00
Eric Anholt	8dfc6ee317	v3d: Rotate through registers to improve post-RA scheduling options. Similarly to VC4's implementation, by not picking r0 immediately upon freeing it, we give the scheduler more of a chance to fit later writes in earlier. I'm not clear on whether there's any real cost to picking phys over accumulators, so keep that behavior for now. shader-db: total instructions in shared programs: 96831 -> 95669 (-1.20%) instructions in affected programs: 77254 -> 76092 (-1.50%)	2018-07-23 10:21:42 -07:00
Eric Anholt	1fb31819ae	v3d: Allow reading from physical regs written in the previous instruction. This restriction existed in V3D 2.x, but lifting it was a major change in 3.x. shader-db results: total instructions in shared programs: 98117 -> 96831 (-1.31%) instructions in affected programs: 48520 -> 47234 (-2.65%)	2018-07-23 10:21:23 -07:00
Eric Engestrom	e6e22e4207	anv: remove unnecessary runtime copy of static string It's actually also a bit safer, since now the compiler will warn if the string is larger than the `.name` array. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-23 17:56:08 +01:00
Alex Smith	54f8f1545f	anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT According to the spec, these should apply to all read/write access types (so would be equivalent to specifying all other access types individually). Currently, they were doing nothing. v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-23 15:29:43 +01:00
Erik Faye-Lund	dc938b8398	virgl: remove unused stride-arguments The IOCTLs don't pass this along, so computing them in the first place is kinda pointless. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-23 11:21:09 +01:00
Samuel Pitoiset	6c58bc8d9c	radv: print a big warning when RADV_TRACE_FILE is set Users shouldn't use this debugging option except when we ask them to! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 11:34:42 +02:00
Samuel Pitoiset	6e32d9e7b0	radv: fix a memleak for merged shaders on GFX9 modules[i] can be NULL for merged shaders but we have to free the NIR code. radv_can_dump_shader_stats() already handles if modules[i] is NULL, no need to check it twice. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 11:34:39 +02:00
Jason Ekstrand	d0ee0a0a5d	intel/blorp: Fix blits to R8G8B8_UNORM_SRGB sRGB harder The first fix attempt contained a nasty typo which somehow didn't get caught in review. It also didn't work as intended because the sRGB conversion was happening but then throwing away all but the red channel because it didn't know it was RGB. Really, it's my fault for trying to fix a bug without first writing tests. I've now written tests and they pass with this change. :) Fixes: `11712b9ca1` "intel/blorp: Fix blits to R8G8B8_UNORM_SRGB" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-23 00:36:39 -07:00
Jason Ekstrand	abd629eb3d	anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV We've had several broadwell hangs that have come down to this bit just not working correctly. Most recently, we've had a pile of hangs reported with apps running under DXVK: https://github.com/doitsujin/dxvk/issues/469 Instead, use the bit that doesn't try to imply weird D3D coherency things and just force-enables the PS like we want. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-22 23:43:19 -07:00
Jason Ekstrand	b99493c628	anv: Properly handle GetImageSubresourceLayout on complex images We support mipmapped and arrayed linear images so we need to support vkGetImageSubresourceLayout on them. Fortunately, it's just a trivial call into ISL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-22 23:24:10 -07:00
Timothy Arceri	78f391d343	radeonsi/nir: make use of nir_lower_load_const_to_scalar() This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example some loops in Civilization Beyond Earth shaders are unrolled. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-23 09:48:51 +10:00
Ilia Mirkin	257128079c	anv/gen9: expose VK_EXT_post_depth_coverage Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver. Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted? Not sure, so I left it as-is. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-22 14:56:44 -07:00
Ilia Mirkin	768f143667	spirv: add support for SPV_KHR_post_depth_coverage Allow the capability to be exposed, and convert the new execution mode into fs state. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-22 14:56:36 -07:00
Mauro Rossi	6cbbd5b4f8	android: util/disk_cache: fix building errors in gallium drivers This patch applies the necessary changes in Android.common.mk as per automake rules, to avoid the following build error: external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8: error: implicit declaration of function 'disk_cache_get_function_timestamp' is invalid in C99 [-Werror,-Wimplicit-function-declaration] if (disk_cache_get_function_timestamp(nouveau_disk_cache_create, ^ 1 error generated. (v2) -DENABLE_SHADER_CACHE Android cflag is kept, to leave the AS-IS capability enabled Fixes: `cc10b34` ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-21 12:06:38 +02:00
Chih-Wei Huang	e7ffd3fb08	Android: fix a missing nir_intrinsics.h error The commit `76dfed8ae2` changed nir_intrinsics.h to be a generated header, but the corresponding dependency was not updated for Android. It causes the error: [ 0% 19/4336] target C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c ... In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25: In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28: In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140: In file included from external/mesa/src/amd/common/ac_llvm_build.h:30: external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `76dfed8ae2` ("nir: mako all the intrinsics") Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>	2018-07-21 08:50:23 +02:00
Bas Nieuwenhuizen	e1febbefe8	nir: Fix end of function without return warning/error. There always is a continue block, so let us just do unreachable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `8cacf38f52` "nir: Do not use continue block after removing it." CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312	2018-07-20 22:27:39 +02:00
Danylo Piliaiev	d24c35c3fb	st: Sweep NIR after linking phase to free held memory After optimization passes and many transformations most of the memory NIR holds is garbage which was being freed only after shader deletion. Freeing it at the end of linking will save memory which would be useful in case there are a lot of complex shaders being compiled. The common case for this issue is 32bit game running under Wine. The cost of the optimization is around ~3-5% of compilation speed with complex shaders. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-20 11:26:12 -07:00
Eric Anholt	945524ba0e	st/dri: Don't require a dri_format for image creation. Nothing in EGL_KHR_gl_image.txt seems to let us deny creation based on formats, and doing so causes many failures in dEQP-EGL.functional.image.api.* The NONE value we were protecting from only gets looked at in the __DRI_IMAGE_ATTRIB_FORMAT and __DRI_IMAGE_ATTRIB_FOURCC queries, which are used from wayland and gbm (which throw an error cleanly on unknown format) and DMABUF export. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 11:26:12 -07:00
Eric Anholt	f6750456c5	egl: Refuse EGL_MESA_image_dma_buf_export if we don't have a DRM fourcc. The EGL CTS expects that you can make images from all sorts of things, including things like z16 and s8, which we don't have DRM fourccs for. Just return an error when trying to export one of those. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 11:26:12 -07:00
Eric Anholt	a221f9709e	v3d: Fix incorrect handling of two fences created back-to-back. Recreating our context's syncobj with ALREADY_SIGNALED meant that if you created two fences in a row, then waiting on the second would succeed immediately. Instead, export a sync file in the gallium fence (since we don't have a syncobj clone ioctl), and just create a new syncobj to wait on whenever we need to. Noticed while debugging dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	fc28692a5a	v3d: Fix the timeout value passed to drmSyncobjWait(). The API wants an absolute time, so we need to go add gallium's argument to CLOCK_MONOTONIC.	2018-07-20 11:11:29 -07:00
Eric Anholt	4f04bd68cf	v3d: Fix drmSyncobjWait() return value checking even more. It tends to return >0 in the success case (I think the value is something like "how much of the timeout remained"). Fixes dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	2f90879a34	v3d: Use the list_first_entry/list_last_entry macros.	2018-07-20 11:11:29 -07:00
Eric Anholt	d0e53373e5	v3d: Move BO cache counting to dump time instead of cache management. This is one less way to get the dump stats wrong.	2018-07-20 11:11:29 -07:00
Eric Anholt	7d6aef6fa5	v3d: Reduce the stale BO reclamation spam with dump_stats set. This was obviously meant to be when we were actually freeing a BO, not just when there was at least one BO in the list.	2018-07-20 11:11:29 -07:00
Eric Anholt	5d11094db1	v3d: Respect a sampler view's first_layer field. Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture	2018-07-20 11:11:29 -07:00
Sonny Jiang	c6737756ad	radeonsi: emit_spi_map packets optimization v2: marek: remove an empty line before break; rename reg_val_seq -> spi_ps_input_cntl "type * x" -> "type *x" Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 13:50:26 -04:00
Gert Wollny	4d094993c3	virgl: Expose GL_ARB_copy_image if host supports it Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:15:12 +02:00
Gert Wollny	0bde9739c0	virgl: Allow RGB32* textures only as buffer objects When requesting a texture of the internal format GL_RGB32F Gallium will try to allocate a renderable texture and returns RGBA32F or RGBX32F, but when one requests GL_RGB32I or GL_RGB32UI the according 3-component texture will be returned. This leads to problems later, when one wants to use glCopyImageSubData to copy data between these textures that should be compatible, but given the way virgl and Gallium handle this the latter fails with an assertion, because the per-texel bit size is different. By allowing the GL_RGB32* only for texture buffers these problems are avoided without losing the ARB_tbo_rgb32 extension (thanks Ilia Mirkin). v2: Correct spelling (Gurchetan Singh) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:12:49 +02:00
Lionel Landwerlin	feb43ef674	intel: tools: dump: protect against multiple calls on destructor When running gdb, make sure to pass the LD_PRELOAD variable only to the executed program, not the debugger. Otherwise the debugger will run the preloaded constructor/destructor too and bad things will happen. Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:56 +01:00
Lionel Landwerlin	2a9069eb97	intel: tools: dump: make dump tool reliable under gdb The problem with passing the configuration of the dump lib through a file descriptor is that it can be read only once. But under gdb you might want to rerun your program multiple times. This change hands the configuration through a temporary file that is deleted once the command line passed to intel_dump_gpu has exited. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:37 +01:00
Samuel Pitoiset	1efc9094e0	radv: don't flush DB before subpass FS resolves That shouldn't be needed because the DB state is invalid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 17:30:13 +02:00
Gert Wollny	016807161b	r600: Correct evaluation of cube array index and face The array index needs to be corrected and it must be ensured that it is rounded and its value is non-negative before it is combined with the face id. v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) v6: Fix type (Roland Scheidegger) Fixes 182 from android/cts/master/gles31-master.txt: dEQP-GLES31.functional.texture.filtering.cube_array.formats.* dEQP-GLES31.functional.texture.filtering.cube_array.sizes.* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	01766c1db6	r600: correct texture offset for array index lookup Correct the array index for TEXTURE_1D_ARRAY, and TEXTURE_2D_ARRAY The standard says the array index is evaluated according to floor(z + 0.5) but RNDNE is sufficient also for the test cases where z is close to 1.5 and it is likely to hit 1.5, the corner case where RNDNE gives a result different from the above formula. v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) - update commit message Fixes 325 tests from android/cts/master/gles3-master.txt: dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray* dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2darray dEQP-GLES3.functional.texture.filtering.2d_array.formats.* dEQP-GLES3.functional.texture.filtering.2d_array.sizes.* dEQP-GLES3.functional.texture.filtering.2d_array.combinations.* dEQP-GLES3.functional.texture.shadow.2d_array.* dEQP-GLES3.functional.texture.vertex.2d_array.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	626bd455d4	r600: Delay emission of texture gradients and lookup offsets Gradients used in texture lookups and the offsets must reside in the same fetch clause (the first is imposed by the hardware and the second is expected by sb). In order to ensure that no ALU clause is inserted between emission and use of these, delay the emission of these instructions until the texture instruction using them is also emitted. This is needed in preparation for the correction of the texture array indices. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Bas Nieuwenhuizen	cc10b34e9e	util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache. radv always needs it, so just check the header instead. Also do not declare the function if the variable is not set, so we get a nice compile error instead of failing to open a device at runtime. Fixes: `b87ef9e606` "util: fix MSVC build issue in disk_cache.h" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-20 12:09:19 +02:00
Bas Nieuwenhuizen	8cacf38f52	nir: Do not use continue block after removing it. Reinserting code directly before a jump means the block gets split and merged, removing the original block and replacing it in the process. Hence keeping a pointer to the continue block over a reinsert causes issues. This code changes nir_opt_if to simply look for the new continue block. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275 CC: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-20 12:09:19 +02:00
Samuel Pitoiset	ce454d02cc	radv: simplify a condition in radv_src_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:17 +02:00
Samuel Pitoiset	1ff25c4e6b	radv: save current state just before resolving with FS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:15 +02:00
Samuel Pitoiset	c3d5f124c6	radv: don't check if a subpass has resolve attachments twice We already check that in radv_cmd_buffer_resolve_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:13 +02:00
Samuel Pitoiset	0a8127bbfb	radv: make use of radv_subpass_barrier() when resolving subpasses The goal is to use radv_barrier()/radv_subpass_barrier() as much as possible for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:11 +02:00
Rhys Perry	409a60df3b	nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding total instructions in shared programs : 5480808 -> 5472107 (-0.16%) total gprs used in shared programs : 647530 -> 647532 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58551648 -> 58459352 (-0.16%) local shared gpr inst bytes helped 0 0 73 2609 2609 hurt 0 0 71 34 34	2018-07-19 23:34:58 +02:00
Rhys Perry	2afef231db	nv50/ir: handle SHLADD in IndirectPropagation An alternative solution to the problem fixed in `0bd83d0` ("nv50/ir: move LateAlgebraicOpt to the very end"). total instructions in shared programs : 5481195 -> 5480808 (-0.01%) total gprs used in shared programs : 647535 -> 647530 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58555784 -> 58551648 (-0.01%) local shared gpr inst bytes helped 0 0 2 34 34 hurt 0 0 0 0 0	2018-07-19 23:34:58 +02:00
Rhys Perry	3b6edd0b59	gm107/ir: use CS2R for SV_CLOCK This instruction seems to be faster than S2R and requires no barrier, though the range of special registers it can read from is limited. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-19 23:34:58 +02:00
Lionel Landwerlin	94cf964586	intel: tools: dump: remove mentions of intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 20:12:53 +01:00
Lionel Landwerlin	0f9d8b754f	intel: tools: aubwrite: fix invalid frees on finish Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-07-19 20:11:56 +01:00
Samuel Pitoiset	3d41757788	ac/nir: add a workaround for bitfield_extract when count is 0 LLVM 7 returns incorrect results when count is 0, something has been broken since LLVM 6. Of course, the best solution is to fix LLVM but this workaround works as expected for now. Original workaround by Philippe Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-19 20:41:10 +02:00
Nanley Chery	e2e32b6afd	intel/isl/gen4: Make depth/stencil buffers Y-Tiled Rendering to a linear depth buffer on gen4 is causing a GPU hang in the CI system. Until a better explanation is found, assume that errata is applicable to all gen4 platforms. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Nanley Chery	44ab26d0c9	i965/misc: Use depth/stencil surf's tiling on gen4-5 Make the 3D engine aware of the depth/stencil surface's tiling before doing any render operations. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Caio Marcelo de Oliveira Filho	507a8037a7	glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch When handling 'if' in copy propagation elements, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. x = y; if (...) { z = x; // This would turn into z = y. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = y. } With the change, we let copy propagation happen independently in the two branches and only then apply the killed values for the subsequent code. One example in shader-db part of shaders/unity/8.shader_test: (assign (xyz) (var_ref col_1) (var_ref tmpvar_8) ) (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) ( (assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... ) ) ( (assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... ) )) The variable col_1 was replaced by tmpvar_8 in the then-part but not in the else-part. NIR deals well with copy propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:59 -07:00
Caio Marcelo de Oliveira Filho	e4f32dec23	glsl: change opt_copy_propagation_elements data structures Instead of keeping multiple acp_entries in lists, have a single acp_entry per variable. With this, the implementation of clone is more convenient and now fully implemented. In the previous code, clone was only partial. Before this patch, each acp_entry struct represented a write to a variable including LHS, RHS and a mask of what channels were written to. There were two main hash tables, the first (lhs_ht) stored a list of acp_entries per LHS variable, with the values available to copy for that variable; the second (rhs_ht) was a "reverse index" for the first hash table, so stored acp_entries per RHS variable. After the patch, there's a single acp_entry struct per LHS variable, it contains an array with references to the RHS variables per channel. There now is a single hash table, from LHS variable to the corresponding entry. The "reverse index" is stored in the ACP entry, in the form of a set of variables that copy from the LHS. To make the clone operation cheaper, the ACP entries are created on demand. This should not change the result of copy propagation, a later patch will take advantage of the clone operation. v2: Add note clarifying how the hashtable is destroyed. v3: (all from Eric Anholt) Add remove_unused_var_from_dsts() function for reuse. Remove from dsts as we go instead of clearing at the end. Add clarifying comment to erase(). Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho	7b0d395250	glsl: separate copy propagation state Separate higher level logic of visiting instructions and choosing when to store and use new copy data from the data structure holding the copy propagation information. This will also make easier later patches that change the structure. v2: Remove empty destructor and clarify how hash tables are destroyed. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Lionel Landwerlin	49e86f09fe	intel: tools: dump: trace memory writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 16:48:42 +01:00
Lionel Landwerlin	5ba3e5c358	intel: tools: dump: remove command execution feature In commit `86cb05a6d3` ("intel: aubinator: remove standard input processing option") we removed the ability to process aub as an input stream because we now rely on mmapping the aub file to back the buffers aubinator is parsing. intel_aubdump was the provider of the standard input data and since we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa, we don't need that code anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-19 10:11:54 +01:00
Danylo Piliaiev	494a206229	radv: Fix incorrect assumption about ternary operator precedence Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 10:04:27 +02:00
Marek Olšák	dcbcc83003	mesa: fix make check for AMD_performance_monitor	2018-07-19 01:17:01 -04:00
Marek Olšák	f097f0c55c	mesa: remove dead code from api_loopback This should only contain functions not set in vtxfmt.c. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 01:10:32 -04:00
Marek Olšák	987c2ece03	mesa: expose ARB_indirect_parameters in the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) v2: fix dispatch_sanity	2018-07-19 01:10:18 -04:00
Marek Olšák	d40188800e	vbo: fix ARB_multi_draw_indirect for the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	6c4652ea8a	mesa: expose ARB_shader_viewport_layer_array in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	da528898bc	mesa: expose ARB_ES3_1_compatibility in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	565dacc3d6	winsys/amdgpu: remove RADEON_SURF_FMASK leftover RADEON_SURF_FMASK is never set.	2018-07-19 00:58:51 -04:00
Marek Olšák	9b82d128c9	ac: run LLVM optimization passes only on the final function after inlining	2018-07-19 00:58:49 -04:00
Bas Nieuwenhuizen	17b5a59b4e	radv: Enable binning and dfsm by default on Raven. Seems like it increases performance by 2-3% for some demos and games. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:21 +02:00
Bas Nieuwenhuizen	978570769d	radv: Always set disable zpass increment bit when possible. When no occlusion queries are active even if out of order is enabled. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen	82664af6cf	radv: Select correct entries for binning. Overshot it by one every time. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:01 +02:00
Bas Nieuwenhuizen	760211b77c	radv: Fix number of samples used for binning. Used the wrong register ... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:54 +02:00
Bas Nieuwenhuizen	c0144e915a	radv: Disable disabled color buffers in rbplus opts. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:47 +02:00
Marek Olšák	fb049742d6	r600: silence the signed overflow warning like radeonsi r600_gpu_load.c: In function ‘r600_gpu_load_thread’: ../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow] if (start <= end)	2018-07-18 17:48:48 -04:00
Andres Rodriguez	d3d9513556	radv: fix wmaybe-uninitialized in radv_meta_fast_clear.c Assignment and usage of this variable both happen inside an if(rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and assumes that both function calls could have different return values. But in this case we should be safe. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-18 15:32:51 -04:00
Sonny Jiang	4bf7234061	radeonsi: emit_guardband packets optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Sonny Jiang	80ade05b8d	radeonsi: Save CLEAR_STATE initial values for optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Jan Vesely	9baacf3fa7	radeonsi: Refuse to accept code with unhandled relocations They might lead to unrecoverable GPU hang. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 13:56:56 -04:00
Eric Anholt	70534dbe29	Allow AMD_perfmon on GLES contexts v2: whitespace alignment fix Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:39:21 -07:00
Eric Anholt	4ba478d7cd	egl: Use the canonical drm-uapi fourcc header to avoid local defines. We should only use a #define locally once it's been upstreamed, and at that point you should just update our drm_fourcc.h. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 10:37:54 -07:00
Eric Anholt	2c6279d58b	v3d: Fix tiling modifier support to use the new UIF define. You can't use T tiled buffers on V3D 3.x and newer, it's been replaced with a newer layout shared with other hardware blocks.	2018-07-18 10:37:49 -07:00
Eric Anholt	6c0482e176	drm-uapi: Update drm_fourcc.h for new format modifiers. This brings in the Broadcom VC4 SAND and V3D 3.x+ UIF modifiers, from drm-next commit 4da1d4c751c9b1b713c13043bad7c4d27cd1418c.	2018-07-18 10:37:49 -07:00
Marek Olšák	201ebf51d1	st/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-18 13:33:30 -04:00
Timothy Pearson	e1621fda84	radeonsi: Use signed char for color_interp_vgpr_index color_interp_vgpr_index was declared as a generic char value. Because signed values are used in this variable, the result was not safe across architectures and crashed on ppc64[el] and arm. Declare color_interp_vgpr_index as a signed type. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 13:31:29 -04:00
Jason Ekstrand	aaa6fac8f6	intel/blorp: Take an explicit filter parameter in blorp_blit This lets us move the glBlitFramebuffer nonsense into the GL driver and make the usage of BLORP much more explicit and obvious as to what it's doing. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Jason Ekstrand	9fbe2a2007	intel/blorp: Add a blorp_filter enum for use in blorp_blit At the moment, this is entirely internal but we'll expose it to clients of the BLORP API in the next commit. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Caio Marcelo de Oliveira Filho	ea556471a1	intel/tools: add missing include for stdarg.h Fixes build in GCC 8.1.1: FAILED: src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o gcc -Isrc/intel/tools/src@intel@tools@@intel_dump_gpu@sha -Isrc/intel/tools -I../../src/intel/tools -Isrc/../include -I../../src/../include -Isrc -I../../src -Isrc/mapi -I../../src/mapi -Isrc/mesa -I../../src/mesa -I../../src/gallium/include -I../../src/gallium/auxiliary -Isrc/intel -I../../src/intel -I../../include/drm-uapi -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DVERSION="18.2.0-devel"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB 
-DHAVE_PTHREAD -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -fPIC -fvisibility=hidden -Wno-override-init -MD -MQ 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -MF 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o.d' -o 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -c ../../src/intel/tools/aub_write.c ../../src/intel/tools/aub_write.c: In function ‘fail_if’: ../../src/intel/tools/aub_write.c:243:4: error: implicit declaration of function ‘va_start’; did you mean ‘assert’? [-Werror=implicit-function-declaration] va_start(args, format); ^~~~~~~~ assert ../../src/intel/tools/aub_write.c:245:4: error: implicit declaration of function ‘va_end’; did you mean ‘rand’? [-Werror=implicit-function-declaration] va_end(args); ^~~~~~ rand cc1: some warnings being treated as errors Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:19:22 -07:00
Jason Ekstrand	2be30a1a39	intel/tools: Rename error2aub to intel_error2aub Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 09:03:05 -07:00
Danylo Piliaiev	d219521379	i965: Sweep NIR after linking phase to free held memory After optimization passes and many transformations, most of the memory NIR holds is garbage, which was being freed only after shader deletion. Freeing it at the end of linking saves memory, which is useful when a lot of complex shaders are being compiled. The common case for this issue is a 32-bit game running under Wine. The cost of the optimization is around 3-5% of compilation speed with complex shaders. V2: by Jason Ekstrand - Move nir_sweep up, right after the last change of NIR Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 09:00:18 -07:00
Marek Olšák	51d6b163da	winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2) Dependencies between rings are inserted correctly if a buffer is represented by only one unique amdgpu_winsys_bo instance. Use a hash table keyed by amdgpu_bo_handle to have exactly one amdgpu_winsys_bo per amdgpu_bo_handle. v2: return offset and stride properly Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	e06b8ec106	winsys/amdgpu: use a better hash_pointer function Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	53684e9163	winsys/amdgpu: clean up error handling in amdgpu_bo_from_handle Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	a73e3d5e00	winsys/amdgpu: shorten bo->ws in amdgpu_bo_destroy Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Jason Ekstrand	6a60beba40	intel/tools: Add an error state to aub translator Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:53 -07:00
Jason Ekstrand	d6ad32600e	intel/tools: Break aub file writing into a helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:50 -07:00
Jason Ekstrand	0a457d987e	intel/tools: Refactor aub dumping to remove singletons Instead of having quite so many singletons, we use a struct aub_file to organize the bits we need for writing an aub file. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:46 -07:00
Jason Ekstrand	6953d7f5d2	intel/dump_gpu: Fix corner cases in PPGTT range calculations For large buffers which span an entire l1 page table, we got the range calculations wrong. In this case, we end up with an l1_start which is the first byte represented by the given l1 table and an l1_end which is the first byte after the range represented by the l1 table. Then l2_start_index == L2_index(l2_end) due to roll-over. Instead, compute lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the range represented by the Nth level page table. When we do this, we don't need the conditional expression anymore. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:38 -07:00
Caio Marcelo de Oliveira Filho	322fa3e5be	intel/blorp: fix uninitialized variable warning Compiler doesn't pick up that level and start_layer will be defined, so do as was done for num_layers in `4d8b476fa9` "intel/blorp: Fix compiler warning about num_layers." and always set it. Fixes warning ../../src/mesa/drivers/dri/i965/brw_blorp.c: In function ‘brw_blorp_clear_depth_stencil’: ../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘start_layer’ may be used uninitialized in this function [-Wmaybe-uninitialized] blorp_clear_depth_stencil(&batch, &depth_surf, &stencil_surf, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ level, start_layer, num_layers, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ x0, y0, x1, y1, ~~~~~~~~~~~~~~~ (mask & BUFFER_BIT_DEPTH), ctx->Depth.Clear, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ stencil_mask, ctx->Stencil.Clear); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘level’ may be used uninitialized in this function [-Wmaybe-uninitialized] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	3bf19bfdc6	util/string_buffer: fix warning in tests And also specify the maximum size when writing to static buffers. The warning below refers to the case where "str5" could be larger than "str5 - str4", then the strcat would have overlapping dst and src. Compiler doesn't pick up the bound from the snprintf above, so we make clear the bounds of str5 by using strncat() instead of strcat(). ../../src/util/tests/string_buffer/string_buffer_test.cpp: In member function ‘virtual void string_buffer_string_buffer_tests_Test::TestBody()’: ../../src/util/tests/string_buffer/string_buffer_test.cpp:106:10: warning: ‘char* strcat(char, const char)’ accessing 81 or more bytes at offsets 48 and 128 may overlap 1 byte at offset 128 [-Wrestrict] strcat(str4, str5); ~~~~~~^~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	577c8d7288	i965/miptree: avoid uninitialized variable warnings GCC 8.1.1 is having a hard time identifying that the values are properly initialized when used. In the 'memset_value' case, we pass the uninitialized value to another function (which will use it only if the conditions match the initialization). Just give the compiler enough of a hint to figure things out. Fixes the warnings ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function ‘intel_miptree_alloc_aux’: ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1839:18: warning: ‘memset_value’ may be used uninitialized in this function [-Wmaybe-uninitialized] mt->aux_buf = intel_alloc_aux_buffer(brw, &aux_surf, needs_memset, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ memset_value); ~~~~~~~~~~~~~ ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1698:10: warning: ‘initial_state’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (wants_memset) ^ ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1772:23: note: ‘initial_state’ was declared here enum isl_aux_state initial_state; ^~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	8ec40824ae	intel/batch-decoder: fix uninitialized values warnings Code assumes that all the necessary fields will exist, but compiler doesn't know about this. Provide zero as default values, like in other decoding functions. Fixes warnings ../../src/intel/common/gen_batch_decoder.c: In function ‘handle_media_interface_descriptor_load’: ../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_entry_count’ may be used uninitialized in this function [-Wmaybe-uninitialized] dump_binding_table(ctx, binding_table_offset, binding_entry_count); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_table_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] ../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_count’ may be used uninitialized in this function [-Wmaybe-uninitialized] dump_samplers(ctx, sampler_offset, sampler_count); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] ../../src/intel/common/gen_batch_decoder.c:343:7: warning: ‘ksp’ may be used uninitialized in this function [-Wmaybe-uninitialized] ctx_disassemble_program(ctx, ksp, "compute shader"); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c: In function ‘decode_dynamic_state_pointers’: ../../src/intel/common/gen_batch_decoder.c:663:54: warning: ‘state_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] const uint32_t *state_map = ctx->dynamic_base.map + state_offset; ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c: In function ‘gen_print_batch’: ../../src/intel/common/gen_batch_decoder.c:856:13: warning: ‘next_batch.map’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (next_batch.map == 
NULL) { ^ ../../src/intel/common/gen_batch_decoder.c:860:13: warning: ‘next_batch.addr’ may be used uninitialized in this function [-Wmaybe-uninitialized] gen_print_batch(ctx, next_batch.map, next_batch.size, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ next_batch.addr); ~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	f836d799f9	intel/decoder: use snprintf(..., "%s", ...) instead of strncpy strncpy() doesn't guarantee a terminating NUL, so we would need to set it ourselves. Just use snprintf() instead. Fixes the warnings ../../src/intel/common/gen_decoder.c: In function ‘iter_decode_field’: ../../src/intel/common/gen_decoder.c:897:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation] strncpy(iter->name, iter->field->name, sizeof(iter->name)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function ‘iter_advance_field’, inlined from ‘gen_field_iterator_next’ at ../../src/intel/common/gen_decoder.c:1015:9: ../../src/intel/common/gen_decoder.c:844:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation] strncpy(iter->name, iter->field->name, sizeof(iter->name)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	20fcd152a2	anv: give more room to debug report The error buffer is limited to 256, but the report contains the filename and possibly other data. So give it more space. Avoids the warnings ../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’: ../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=] snprintf(report, sizeof(report), "%s: %s", file, buffer); ^~ ~~~~~~ ../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256 snprintf(report, sizeof(report), "%s: %s", file, buffer); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’: ../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=] snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer, ^~ ~~~~~~ ../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256 snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error_str); ~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	01d02e8906	anv: avoid warning when switching in VkStructureType When one of the cases is not part of the enum, the compiler complains: ../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’: ../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch] case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA: ^~~~ Given the switch has a "default:" case, we don't lose anything by switching on the unsigned value to avoid the warning. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	df8f1637fa	glsl: remove unnecessary parentheses from macro The "__inst" will contain the name used for the variable of type "__type ". Parentheses are not necessary as the name itself shouldn't be an expression. Fixes warning: In file included from ../../src/mesa/main/mtypes.h:49, from ../../src/intel/compiler/brw_compiler.h:30, from ../../src/intel/compiler/brw_shader.h:29, from ../../src/intel/compiler/brw_fs.h:31, from ../../src/intel/compiler/brw_fs_cse.cpp:24: ../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t)’: ../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses] __type *(__inst); \ ^ ../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’ foreach_in_list_use_after(aeb_entry, entry, &aeb) { ^~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	4a29ee1861	intel/compiler: fix -Wsign-compare warning Explicitly convert to signed integer. The conversion is valid since it is the same one (implicitly) used to initialize the loop. Avoids the warning: ../../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::lower_simd_width()’: ../../src/intel/compiler/brw_fs.cpp:5761:45: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare] split_inst.eot = inst->eot && i == n - 1; ~~^~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	7df5f62768	intel/compiler: silence -Wclass-memaccess warnings Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	ff8abce361	spirv: initialize is_vertex_input Fixes warning: ../../src/compiler/spirv/vtn_variables.c: In function ‘var_decoration_cb’: ../../src/compiler/spirv/vtn_variables.c:1400:12: warning: ‘is_vertex_input’ may be used uninitialized in this function [-Wmaybe-uninitialized] bool is_vertex_input; ^~~~~~~~~~~~~~~ The code used to set is_vertex_input in all possible codepaths, but after `23edc5b1ef` "spirv: translate default-block uniforms" the compiler isn't sure all codepaths will initialize the variable. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Rob Clark	cbad8f3cc0	freedreno/a5xx: performance counters AMD_performance_monitor support Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	33af91dc07	freedreno: batch query support (perfcounters) Core infrastructure for performance counters, using gallium's batch query interface (to support AMD_performance_monitor). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	9e30e7490d	freedreno: batch query prep-work For batch queries we have N different query_type's for one query, so mapping a single query_type to a sample_provider doesn't really work out. Instead add a new constructor to construct a query directly from a sample_provider. Also, the sample buffer size needs to be determined at runtime, as it depends on the number of query_types. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	37b724ff72	freedreno: rework accumulated query result vfunc Take the query object, rather than the ctx. The ctx ptr isn't hugely useful, but for batch queries we will need the query object to properly get the results. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	1f464d5301	freedreno/ir3: output ir3 and nir asm for frameretrace See: `298dc8195b` Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	e4c225ab6f	freedreno/ir3: redirectable ir3 disasm output For now it still goes to stdout, this will make it easier to support output on stderr like what frameretrace expects. (If we eventually have a proper GL extension for this, implementation probably looks like dumping shader disasm to a tmp file and then dumping that out over whatever mechanism is used.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	4c58db8064	freedreno/ir3: resync ir3 disassembler Pull in latest updates from cffdump in envytools tree, so we can output to other than just stdout. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	97a9283f5d	freedreno: register usage queries Avg number of (half) regs per draw, so we can correlate fps dips to shader register usage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	8dfc9e22c1	nir: add lowering for gl_HelperInvocation v2: reword comment about lower_helper_invocations to be more clear that it might not work on all hardware v3: add special variant of load_sample_id which does not imply per-sample shading Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	09f240eb5f	mesa: don't double incr/decr ActiveCounters Frameretrace ends up w/ excess calls to SelectPerfMonitorCountersAMD() which ends up re-enabling already enabled counters. Which causes ActiveCounters[group] to be double incremented for the same counter. This causes BeginPerfMonitorAMD() to fail. The AMD_performance_monitor spec doesn't say that an error should be generated in this case. So I think the safe thing to do is just safeguard against excess increments/decrements. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	426f1c60bc	mesa: fix error msg typo Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-18 10:10:44 -04:00
Rob Clark	640b8eb5b1	nir: fixup intrinsic comment Now the deref is the first src. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-18 10:10:44 -04:00
Tomeu Vizoso	3f7c2148b0	mesa: handle a bunch of formats in IMPLEMENTATION_COLOR_READ_* Virgl could save a lot of work converting buffers in the host side between formats if Mesa supported a bunch of other formats when reading pixels. This commit adds cases to handle specific formats so that the values reported by the two calls match more closely the underlying native formats. In GLES it is important that IMPLEMENTATION_COLOR_READ_* return the native format and data type because the spec only allows reading with those, besides GL_RGBA or GL_RGBA_INTEGER. Additionally, because virgl currently doesn't implement such conversions, this commit fixes several tests in dEQP-GLES3.functional.fbo.color.clear., when using virgl in the guest side. The logic is based on knowledge that is shared with _mesa_format_matches_format_and_type() but we cannot assert that the results match as we don't have all the starting information at both points. So leave the assert out and hope CI comes soon to save us all. v2: Let R10G10B10A2_UINT fall back to GL_RGBA_INTEGER (Eric Anholt) * Assert with _mesa_format_matches_format_and_type (Eric Anholt) v3: * Remove the assert, as it won't be reliable (Eric Anholt) v4: * Use _mesa_is_format_integer in the fallback (Eric Anholt) v5: * Remove superfluous call to _mesa_uncompressed_format_to_type_and_comps (Eric Anholt) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-07-18 14:52:35 +01:00
Samuel Pitoiset	e45ba51ea4	radv: add support for VK_EXT_conditional_rendering Inherited command buffers are not supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:09 +02:00
Samuel Pitoiset	946cf3f39f	radv: add support for non-inverted conditional rendering By default, our internal rendering commands are discarded only if the predicate is non-zero (ie. DRAW_VISIBLE). But VK_EXT_conditional_rendering also allows to discard commands when the predicate is zero, which means we have to use a different flag. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:06 +02:00
Samuel Pitoiset	4d99caf590	radv: set the predicate for indirect/indexed draw commands VK_EXT_conditional_rendering allows to discard draw commands (not only normal draws). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:04 +02:00
Samuel Pitoiset	1e83f65673	radv: set the predicate for dispatch commands VK_EXT_conditional_rendering allows to discard dispatch commands. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:01 +02:00
Lionel Landwerlin	83427acc87	i965: batchbuffer: write correct canonical offset with softpin Addresses in the command streams should be in canonical form (i.e. bit[63:48] == bit[47]). If the [bo->gtt_offset, bo->gtt_offset + target_offset] range contains the address 0x800000000000, the current code will fail that criterion. v2: Fix missing include (Lionel) Fixes: `1c9053d076` ("i965: Prepare batchbuffer module for softpin support.") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-18 11:29:16 +01:00
Samuel Pitoiset	1376f2824f	radv: remove unused variable in radv_CreateRenderPass2KHR() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-18 10:54:42 +02:00
Samuel Pitoiset	d9526384bd	radv: optimize radv_stage_flush() for pre fragment shader stages We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH is enough for these stages. Note that PS_PARTIAL_FLUSH also synchronizes all vertex stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 10:09:05 +02:00
Samuel Iglesias Gonsálvez	0f29006256	anv: fix assert in anv_CmdBindDescriptorSets() The assert is checking that we are not binding more descriptor sets than are supported by the driver. When binding the descriptor set number MAX_SETS-1, it was breaking the assert because descriptorSetCount = 1. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:54:23 +02:00
Jan Vesely	154fbd03cc	clover: Report error when pipe driver fails to create compute state CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-17 21:04:15 -04:00
Jan Vesely	866b25fd01	clover: Catch errors from executing event action Abort all dependent events. v2: Abort the current event as well. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-17 21:04:15 -04:00
Timothy Arceri	e105b0ca30	nir: add a couple of ior opts to nir_opt_algebraic One of these was seen in a Deus Ex shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:53:27 +10:00
Timothy Arceri	c4188a9b9f	nir: allow opt_peephole_select to handle nir_instr_type_deref Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:53:22 +10:00
Marek Olšák	bb5449cfee	r600: fix warnings when unref'ing pool->bo	2018-07-17 14:51:45 -04:00
Konstantin Kharlamov	3f8fa7716d	r600g: some -Wsign-compare fixes Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	b674a1d3b9	st/glx: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	1379d9759f	st/nine: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	77ca550224	r600g: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	9b379591c9	r600g: do not use "fast-clear" for small textures (v3) Ported from radeonsi. Improves windowed glxgears run as vblank_mode=0 glxgears -info -geometry 0+0+512+512 from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS. v2: it turned out glxgears ignores the option above; the correct form would be "512x512+0+0". With that it can be seen that 512x512 actually loses 30 FPS. 300×300, however, wins around a hundred FPS, and to leave some room in case results differ for other cards I don't want to nitpick in search of an optimum but simply leave 300×300 in the code. v3: remove redundant braces, and try harder for the mail to stick to the rest of the series. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Rob Clark	4cf8f329ed	freedreno: re-work fd_batch_reference() locking Annoyingly we still have to briefly drop the lock to unref resources.. but push the lock down into __fd_batch_destroy() so we can invalidate the batch and reset resources before dropping the lock. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	4b847b38ae	freedreno: make fd_batch a one-shot thing Re-allocate rather than re-use. Originally we had an unnecessarily complex design to avoid re-allocating cmdstream buffers. But now that support for "growable" cmdstream buffers has been in place for a couple years, I guess we can care a bit less about the extra overhead on older kernels. But making the batches one-shot removes a class of potential race conditions vs the flush_queue. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	f129971e71	freedreno: flush immediately when reading a pending batch Instead of the reading batch setting a dependency on the writing batch, simply flush the writing batch immediately. This avoids situations where we have to flush the context's current batch later. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	20f677f6bc	freedreno: get rid of noop render This was basically to avoid a zero-dword IB (indirect-branch), but instead just don't emit the IB packet in that case. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	15f6c0509a	freedreno: fix samples=0 vs samples=1 confusion pipe_framebuffer_state can have samples=0 in various cases, which is actually the same thing as samples=1. So use the _get_num_samples() helper to populate the key, to avoid this looking like two distinct fb states to the cache. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	d77fcdeb59	freedreno: comment for _invalidate_batch() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	f2570409f9	freedreno: hold batch references when flushing It is possible for a batch to be freed under our feet when flushing, so it is best to hold a reference to all of them up-front. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Karol Herbst	71add09e79	nir/spirv: print id for unsupported alu opcode Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-17 13:24:09 +02:00
Karol Herbst	1beef89ad8	nir: prepare for bumping up max components to 16 OpenCL knows vector of size 8 and 16. v2: rebased on master (nir_swizzle rework) rework more declarations with nir_component_mask_t adjust print_var_decl Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-17 13:24:09 +02:00
Samuel Pitoiset	f65bee7e85	radv/winsys: use alloca() for semaphore dependencies Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-17 10:53:45 +02:00
Samuel Pitoiset	88e56804a7	radv: reduce number of CB/DB meta flushes for VK_ACCESS_TRANSFER_WRITE_BIT If we know that the given image doesn't have any metadata, we don't need to flush. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 09:34:20 +02:00
Samuel Pitoiset	b213947510	radv: fix implementation of VK_KHR_create_renderpass2 for multiviews The Vulkan 1.1.80 spec says: "viewMask has the same effect for the described subpass as VkRenderPassMultiviewCreateInfo::pViewMasks has on each corresponding subpass." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-17 09:04:35 +02:00
Erik Faye-Lund	591b700944	virgl: respect max_vertex_attrib_stride cap This is required for OpenGL 4.4 and OpenGL ES 3.1 support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 15:45:37 +10:00
Lepton Wu	04e278f793	virgl: Fix flush in virgl_encoder_inline_write. The current code is buggy: if there are only 12 dwords left in cbuf, we emit a zero data length command which will be rejected by virglrenderer. Fix it by calling flush in this case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 14:56:25 +10:00
Erik Faye-Lund	b5db3aa6e8	virgl: implement set_min_samples This allows us to implement glMinSampleShading correctly, which up until now just got ignored. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 13:59:47 +10:00
Caio Marcelo de Oliveira Filho	ba1b41b504	glsl: do second pass of const propagation in loops When handling loops in constant propagation, implement the "FINISHME" comment like copy propagation: perform a first pass to find values that can't be propagated, then perform a second pass with the ACP containing still valid values. Certain values are killed because the loop may run more than one iteration, so we can't copy propagate them as they would be invalid in the later iterations. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 16:33:39 -07:00
Caio Marcelo de Oliveira Filho	d7849fd1da	glsl: don't let an 'if' then-branch kill const propagation for else-branch When handling 'if' in constant propagation, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. This is similar to the change done for copy propagation code. x = 1; if (...) { z = x; // This would turn into z = 1. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = 1. } With the change, we let constant propagation happen independently in the two branches and only then apply the killed values for the subsequent code. The new code uses a single hash table for keeping the kills of both branches (the branches only write to it), and it gets deleted after use instead of waiting for mem_ctx to collect it. NIR deals well with constant propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 16:33:39 -07:00
Eric Anholt	229836fb37	v3d: Disable shader-db cycle estimates until we sort out TMU estimates. I keep having to ignore these shader-db changes since I don't trust them, so just disable the reports entirely.	2018-07-16 14:39:59 -07:00
Eric Anholt	2baab6bf2a	v3d: Emit the lowered uniform just before its first use in a block. total instructions in shared programs: 98578 -> 98119 (-0.47%) instructions in affected programs: 27571 -> 27112 (-1.66%) and it also eliminates most spills/fills on the CTS's randomized uniform usage testcases.	2018-07-16 14:39:59 -07:00
Eric Anholt	26f830d9fc	v3d: Add an assert that we don't provide an invalid texture return words. The docs had an update noting this restriction, so reflect it in the code.	2018-07-16 14:39:59 -07:00
Eric Anholt	d661d78464	v3d: Apply GFXH-1625 restriction on TMUWT in the end of the shader. This doesn't affect us yet since we're not doing TMUWTs, but I think we will for GLES 3.1.	2018-07-16 14:39:59 -07:00
Sergii Romantsov	cec540fbc6	intel/batch_decoder: decoding of 3DSTATE_CONSTANT_BODY. SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, which is why we got a segmentation fault when using INTEL_DEBUG=bat. Fixed by adding 3DSTATE_CONSTANT_BODY into the 3DSTATE_CONSTANT of the VS, GS and PS structures. v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml Fixes: `169d8e011a` (intel: Fix 3DSTATE_CONSTANT buffer decoding.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-16 12:18:36 -07:00
Marek Olšák	4054133dcc	r600: fix build after the removal of RADEON_PRIO_* flags	2018-07-16 14:33:31 -04:00
Roland Scheidegger	b3474645d4	nir: fix msvc build Empty initializer braces aren't valid c (it's a gnu extension, and it's valid in c++). Hopefully fixes appveyor / msvc build... Fixes `a3150c1d06`	2018-07-16 20:07:53 +02:00
Jason Ekstrand	f378fa94b2	nir/worklist: Rework the foreach macro This makes the arguments match the (thing, container) pattern used in other nir_foreach macros and also renames it to make that a bit more clear. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 11:02:10 -07:00
Eric Anholt	360714bfa5	intel: tools: Fix uninitialized variable warnings in intel_dump_gpu. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-16 10:58:40 -07:00
Jason Ekstrand	5e030deaf2	spirv: Fix a couple of image atomic load/store bugs For one thing, the NIR opcodes for image load/store always take and return a vec4 value regardless of the image type. We need to fix up both the source and destination to handle it. For another thing, we weren't actually setting up a destination in the OpAtomicLoad case. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org	2018-07-16 10:54:50 -07:00
Marek Olšák	f8aa116c3c	winsys/amdgpu: clean up error handling in amdgpu_cs_submit_ib Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	6b1e0e51e6	radeonsi: rework RADEON_PRIO flags to be <= 31 This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	54ad9b444c	radeonsi: merge DCC/CMASK/HTILE priority flags For a later simplification. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	3e6888e5d7	radeonsi: remove non-GFX BO priority flags For a later simplification. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	342fff6cbc	winsys/amdgpu: use alloca when using global_bo_list Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	6ec44b7055	winsys/amdgpu: remove label bo_list_error Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	7346e5296e	winsys/amdgpu: always update gfx_bo_list_counter Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	caf41fb96d	winsys/amdgpu: make amdgpu_cs_context::flags & handles local Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Gert Wollny	78887e99e3	mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation Converting from a switch statement that would not allow intermediate sample counts to use an if-else chain went a bit wrong, so that in some cases the range that should be inclusive was exclusive and the line for 16 samples was copied wrongly. v2: elaborate commit message. Fixes: `91f48cdfe5` virgl: Add support for glGetMultisample Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)	2018-07-16 12:51:39 +02:00
Karol Herbst	4d0d911875	nouveau: fix 3D blitter for unsigned to signed integer conversions fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run. v2: simplify the changes a bit Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 19:28:37 +02:00
Karol Herbst	87c8af2836	nir: fix printing of vec16 type Fixes: `2f181c8c18` "glsl_types: vec8/vec16 support" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 19:28:37 +02:00
Rob Clark	427a3dbdb1	nir/spirv: implement BuiltInWorkDim Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 07:51:13 +02:00
Karol Herbst	39180d3931	nir/spirv: print id for unsupported builtins Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 07:51:13 +02:00
Jason Ekstrand	daa78f30b6	intel/blorp: Handle 3-component formats in clears This fixes a nasty hang in Batman: Arkham City which apparently calls vkCmdClearColorImage on a linear RGB image. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-13 20:57:46 -07:00
Jason Ekstrand	11712b9ca1	intel/blorp: Fix blits to R8G8B8_UNORM_SRGB In this case, the surface faking will give us a R8_UNORM surface and we need to do an sRGB conversion in the shader. Found by inspection. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-13 20:57:46 -07:00
Caio Marcelo de Oliveira Filho	4ec8b39fcd	util/hash_table: add helper to remove entry by key And the corresponding test case. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:20:49 -07:00
Jason Ekstrand	a3150c1d06	nir/lower_tex: Use nir_format_srgb_to_linear A while ago, we added a bunch of format conversion helpers; we should use them instead of hand-rolling sRGB conversions. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:02:18 -07:00
Jason Ekstrand	b52d79514c	vc4: Tell NIR to lower fdiv instructions This should allow us to use them in nir_lower_tex Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:02:18 -07:00
Dylan Baker	53aca66874	docs: Update news, calendar, and relnotes for 18.1.4	2018-07-13 13:54:46 -07:00
Dylan Baker	97870f2cd0	docs: Add sha256 sums for 18.1.4 tarballs	2018-07-13 13:53:03 -07:00
Dylan Baker	e8df2f12d6	docs: Add release notes for 18.1.4	2018-07-13 13:53:01 -07:00
Eric Anholt	d009463a65	vc4: Switch to using u_transfer_helper for MSAA maps. No requirement, just reduces code duplication.	2018-07-13 13:29:29 -07:00
Eric Anholt	afcc714c98	v3d: Work around GFXH-1461 bug losing our Z/S clears. If you load S and clear Z or vice versa, the clear may get lost. Just fall back to drawing a quad. Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8	2018-07-13 13:29:29 -07:00
Eric Anholt	162fcdad6a	meson: Move xvmc test tools from unit tests to installed tools. These are not unit tests, as they rely on the host's XVMC and some user configuration. Switch them over to being general installed tools, to fix unit testing. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-13 13:29:29 -07:00
Gert Wollny	695a4cb0f6	r600: Add spill output to group only if register or target index changes The current spill code checks in each instruction of an instruction group whether spilling is needed. If so, it adds spilling for each component as a separate instruction and allocates a new temporary for each component. Since it takes the write mask from the TGSI representation, all components might be written each time, and as a result already written components might be overwritten with garbage like: ... y: MOV R9.y, [0x42140000 37].x t: MOV R8.x, [0x42040000 33].y ... MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3 MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3 ... To resolve this issue, accumulate spills to the same memory location so that only one memory write instruction is emitted for an instruction group that writes up to all four components. This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/): spec/glsl-1.30/execution fs-large-local-array-vec2.shader_test fs-large-local-array-vec3.shader_test fs-large-local-array-vec4.shader_test v2: fix some typos and add comment about piglits (Roland Scheidegger) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)	2018-07-13 21:11:34 +02:00
Nanley Chery	3b4279f772	i965/miptree: Allocate MS texture BOs as BUSY These buffer objects are never accessed with the CPU. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:26 -07:00
Nanley Chery	7784a9ceac	i965/miptree: Inline make_separate_stencil Note that the separate stencil miptree now has the same alloc_flag as the depth component. Only stencil renderbuffers (as opposed to textures) have BO_ALLOC_BUSY. v2: Add note about BO_ALLOC_BUSY in message (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:26 -07:00
Nanley Chery	74cf188985	i965/miptree: Init r8stencil_needs_update to false The current behavior masked two bugs where the flag was not set to true after modifying the stencil texture. One case was a regression introduced with commit `bdbb527a65` and another was a bug in the depthstencil mapping code. These have since been fixed. To prevent such bugs from being masked in the future, initialize r8stencil_needs_update to false. v2: Keep the delayed allocation. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:19 -07:00
Nanley Chery	ffac81fa5c	i965/miptree: Refactor miptree_create Enable a future patch to create the r8stencil_mt in this function. v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	03cbaae03e	i965/miptree: Add and use mt_surf_usage v2: Make mt_fmt const (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	32b22592a8	i965/miptree: Share alloc_flags in miptree_create Note that this maintains BO_ALLOC_BUSY for depth renderbuffers, but not depth textures. v2: Add note about BO_ALLOC_BUSY in message (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	2321e85759	i965/miptree: Share the miptree format in miptree_create Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	fbe01625f6	i965/miptree: Share tiling_flags in miptree_create Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	6c9947c3ef	i965/miptree: Delete MIPTREE_CREATE_LINEAR This enum constant was introduced to enable blit maps with intel_miptree_create `da2880bea0`. Now that such maps use the more direct make_surface function which allows you to specify the tiling directly, the constant is no longer being used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	684fa59eb6	i965/miptree: Use make_surface in map_blit Do this so that we don't have to special case linearly-tiled depth buffers in miptree_create. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	63d428dc17	i965/draw: Fix adding the stencil bo to the depth cache Fix the case where stencil writes are enabled on a depth stencil texture. Found by inspection. v2: Fix message to allow for depth stencil writes (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	be07cc43a2	i965/draw: Set the r8stencil flag after drawing Fixes the regression introduced with commit `bdbb527a65` "i965: Use ISL for emitting depth/stencil/hiz state on gen6+" Found by inspection. Prevents regressing the piglit test, fbo-depth-array stencil-draw, later on in this series. Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	0eafe44ba7	i965/miptree: Set the r8stencil flag in map_depthstencil Found by initializing the r8stencil_needs_update to false in make_separate_stencil_surface. Prevents regressing the piglit test arb_stencil_texturing-draw, later on in the series. Cc: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	cef7ce07fa	i965: Set the r8stencil flag in miptree_finish_write This seems to be the most appropriate place. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Karol Herbst	cb65246ed2	nir: cleanup oversized arrays in nir_swizzle calls There are no fixed sized array arguments in C, those are simply pointers to unsized arrays and as the size is passed in anyway, just rely on that. where possible calls are replaced by nir_channel and nir_channels. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-13 15:46:57 +02:00
Nanley Chery	0288fe8d04	i965/miptree: Use the correct BLT pitch Retile miptrees to a linear tiling less often. Retiling can cause issues with imported BOs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738 Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Nanley Chery	3df201e3e8	i965/miptree: Drop an if case from retile_as_linear Drop an if statement whose predicate never evaluates to true. row_pitch belongs to a surface with non-linear tiling. According to isl_calc_tiled_min_row_pitch, the pitch is a multiple of the tile width. By looking at isl_tiling_get_info, we see that non-linear tilings have widths greater than or equal to 128B. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Nanley Chery	0ab2541943	i965: Make blt_pitch public We'd like to reuse this helper. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Caio Marcelo de Oliveira Filho	1f6ce1973a	nir: delete not needed for reinserted nir_cf_list It wasn't causing problems since there's nothing to delete, but better be consistent with the rest of existing codebase. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	13cfd6cc96	glsl: remove struct kill_entry in constant propagation The only value in kill_entry is the writemask, which can be stored in the data pointer of the hash table entry. Suggested by Eric Anholt. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	d6e869afe9	glsl: slim the kill_entry struct used in const propagation Since `4654439fdd` "glsl: Use hash tables for opt_constant_propagation() kill sets." uses a hash_table for storing kill_entries, the structs can be simplified. Remove the exec_node from kill_entry since it is not used in an exec_list anymore. Remove the 'var' from kill_entry since it is now redundant with the key of the hash table. Suggested by Eric Anholt. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	094225d69d	i965: fix typo (wrong gen number) in comment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	fa0c19d17b	util/set: helper to remove entry by key v2: Add unit test. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	b034facfbc	util/set: add a clone function v2: Add unit test. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	8af0a45b47	util/set: add a basic unit test Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Marek Olšák	2e0b00ab7d	radeonsi: add support for Vega20 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-07-12 16:48:12 -04:00
Eric Anholt	e8dc3c0c36	u_blitter: Add an option to draw the triangles using an index buffer. For V3D, the HW will interpolate slightly differently along the shared edge of the trifan. The conformance tests manage to catch this in the nearest_consistency_* group. To get interpolation to match, we need the last vertex of the triangle to be shared. I first tried implementing draw_rectangle to do triangles instead, but that was quite a bit (147 lines) of code duplication from u_blitter, and this seems much simpler and less likely to break as u_blitter changes. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:49:22 -07:00
Eric Anholt	c17dac0534	u_draw: Add some indices to the util_draw_elements() helpers. These helpers have been unused, and were definitely not useful since `330d0607ed` ("gallium: remove pipe_index_buffer and set_index_buffer") made it so that they never had an index buffer passed in. For an upcoming u_blitter change to use these helpers, I have just 6 bytes of index data, so pass it as user data until a more interesting caller comes along. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:49:20 -07:00
Eric Anholt	50a3a283d0	vc4: Don't automatically reallocate a PERSISTENT-mapped buffer. I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always. Fixes: `a2014c2eb9` ("vc4: Simplify the DISCARD_RANGE handling")	2018-07-12 11:31:08 -07:00
Eric Anholt	7714896256	v3d: Don't automatically reallocate a PERSISTENT-mapped buffer. I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always. Fixes piglit bufferstorage-persistent read	2018-07-12 11:31:08 -07:00
Eric Anholt	e48c615292	v3d: Fix stride of 1D_ARRAY mappings. All of our other texture arrays will be tiled, but 1D is an array of raster mappings and we had the wrong value plugged in here. Fixes piglit getteximage-targets 1D_ARRAY	2018-07-12 11:31:08 -07:00
Eric Anholt	97ddeed949	v3d: Fix MRT blending with independent blending disabled. We were only emitting the RT blend state for RT 0 and only enabling it for RT 0, when the gallium API for !independent_blend is for rt0's state to apply to all of them. Fixes piglit fbo-drawbuffers-blend-add.	2018-07-12 11:31:08 -07:00
Eric Anholt	e0dbbf9987	gallium/u_transfer_helper: Initialize the stride of MSAA maps. We just never set the value that was returned for MSAA mappings (directly reading back an MSAA framebuffer). Since we're handing back ss_map, it should be ss_map's stride from our nested transfer. Fixes piglit /home/anholt/src/piglit/bin/fbo-depthstencil -samples=4 cases. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-12 11:31:06 -07:00
Eric Anholt	589bb5bd65	gallium/u_transfer_helper: Fix MSAA mappings with nonzero x/y. We created a temporary with box->{width,height} and then tried to map width,height from a nonzero offset when we meant to just map the whole temporary. Fixes segfaults in V3D in dEQP-GLES3.functional.prerequisite.read_pixels with --deqp-egl-config-name=rgba8888d24s8ms4 and also piglit's read-front clear-front-first -samples=4 Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-12 11:31:00 -07:00
Jason Ekstrand	ccb8309516	util/rb_tree: Fix a compiler warning Gcc 8 warns "cast to pointer from integer of different size" in 32-bit builds. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-12 10:25:46 -07:00
Jose Maria Casanova Crespo	62f37ee53d	i965/fs: unspills shouldn't use grf127 as dest since Gen8+ At `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" we didn't take into account the case of SEND instructions that are not send_from_grf. But since Gen7+, although the backend still uses MRFs internally for sends, they are finally assigned to GRFs. In the case of unspills the backend directly assigns the destination as the source because it is supposed to be available. So we always have a source-destination overlap. If the reg_allocator assigns registers that include the grf127 we fail the validation rule that affects Gen8+ "r127 must not be used for return address when there is a src and dest overlap in send instruction." So this patch activates the grf127_send_hack_node for Gen8+ and if we have any register spilled we add interferences to the destination of the unspill operations. We also need to make sure that the opt_bank_conflicts() optimization, which runs after the register allocation, doesn't move things around, causing the grf127 to be used in the condition we were avoiding. Fixes piglit test tests/spec/arb_compute_shader/linker/bug-93840.shader_test and some shader-db crashes caused by the grf127 validation rule. v2: make sure that opt_bank_conflicts() optimization doesn't change the use of grf127. (Caio) Found by Caio Marcelo de Oliveira Filho Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107193 Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" Cc: 18.1 <mesa-stable@lists.freedesktop.org> Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-12 18:02:26 +02:00
Michel Dänzer	34e89e4d38	gallium: Check pipe_screen::resource_changed before dereferencing it It's optional, only implemented by the etnaviv driver so far. Fixes: `501d0edeca` "st/mesa: call resource_changed when binding a EGLImage to a texture" Fixes: `a37cf630b4` "gallium: add pipe_screen::resource_changed callback wrappers" Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-07-12 17:39:12 +02:00
Jason Ekstrand	c2587ac4e5	docs/features: Add the missing KHR extensions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Jason Ekstrand	55b68c4833	docs/features: Move the Vulkan 1.1 extensions to the 1.1 section While we're at it, add some extensions we missed along the way like the VK_KHR_maintenanceN extensions. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Jason Ekstrand	bc15d74529	docs/features: Mark some Vulkan extensions as done Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Karol Herbst	686e140ce0	nir/spirv: handle OpConstantComposites with OpUndef members Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	154ef32e46	nir/spirv: implement BuiltInGlobalSize Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	31cbcbdb87	nir: move lowering of SYSTEM_VALUE_LOCAL_GROUP_SIZE into a function we already have this code duplicated and we will need it for the global group size as well Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	529aa9e646	compiler: add missing entries to gl_system_value_name also reorder to match the gl_system_value enum. It is weird that the STATIC_ASSERT doesn't trigger though. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Rob Clark	d4280561f5	nir/spirv: print extension name in fail msg Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Rob Clark	9ce0360f76	nir/spirv: Use imov where we might have 8 bit types Otherwise nir_validate may complain about 8 bit floats, which do not exist. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Samuel Pitoiset	f1b3f7bfac	radv: simplify the logic in radv_set_descriptor_set() Now that 'set' can't be NULL because the meta operations no longer bind a NULL descriptor, the logic can be simplified a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:49 +02:00
Samuel Pitoiset	826b3a8773	radv: remove one useless check in radv_bind_descriptor_set() 'set' shouldn't be NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:47 +02:00
Samuel Pitoiset	6bfbc7b38b	radv/meta: do not restore a NULL descriptor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:45 +02:00
Samuel Pitoiset	5b32926f7e	radv: remove unnecessary verification code around ring_offsets_idx I don't want to waste CPU cycles for nothing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:42 +02:00
Samuel Pitoiset	6248fbe5e4	radv: get rid of buffer object priorities We mostly use the same priority for all buffer objects, so I don't think that matters much. This should reduce CPU overhead a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:40 +02:00
Lucas Stach	501d0edeca	st/mesa: call resource_changed when binding a EGLImage to a texture When a EGLImage is newly bound to a texture, we need to make sure the driver is informed that the resource might have changed. Fixes stale texture content on Etnaviv when binding an existing EGLImage to an existing texture object. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:02:04 +02:00
Samuel Pitoiset	1f616a840e	radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9 A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion counters) must immediately precede every timestamp event to prevent a GPU hang on GFX9. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:22:36 +02:00
Samuel Pitoiset	3a16c722cf	radv: add support for VK_KHR_create_renderpass2 VkCreateRenderPass2KHR() is quite similar to VkCreateRenderPass() but refactoring the code is a bit painful. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:20:10 +02:00
Samuel Pitoiset	fe28978f2a	radv: introduce radv_subpass_attachment data structure Needed for VK_KHR_create_renderpass2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:20:06 +02:00
Kenneth Graunke	c0874947f1	st/mesa: Only enable depth writes if the function isn't EQUAL. If the depth function is EQUAL, then we'll only write the depth value when it already matches what's in the buffer, which is pointless. Skipping these writes can save bandwidth. The state tracker can easily take care of this, so all drivers benefit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-11 11:23:20 -07:00
Chad Versace	be5fc0d7f1	anv/android: Fix type error in call to vk_errorf() In a single call to vk_errorf() in the Android code, the arguments were swapped. The bug has existed since day one. Chrome OS used to forgive the warning, but it is now a compilation error. CC: <mesa-stable@lists.freedesktop.org> Fixes: `053d4c32` "anv: Implement VK_ANDROID_native_buffer (v9)" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-11 11:09:19 -07:00
Chad Versace	8e403bc959	anv/android: Fix Autotools build for VK_ANDROID_native_buffer Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints in anv_entrypoints.h. Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR. CC: <mesa-stable@lists.freedesktop.org> See-Also: `63525ba7` "android: enable VK_ANDROID_native_buffer" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-11 11:09:16 -07:00
Samuel Pitoiset	4a67ce886a	radv: make sure to wait for CP DMA when needed This might fix some synchronization issues. I don't know if that will affect performance but it's required for correctness. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-11 12:11:56 +02:00
Rafael Antognolli	688d757e15	intel/tools/dump_gpu: Add option to print ppgtt mappings. Using -vv will increase the verbosity, by printing the ppgtt mappings as they get written into the aub file. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-10 09:05:44 -07:00
Neil Roberts	45106a1c93	spirv: Fix InterpolateAt* instructions for vecs with dynamic index If the glsl is something like this: in vec4 some_input; interpolateAtCentroid(some_input[idx]) then it now gets generated as if it were: interpolateAtCentroid(some_input)[idx] This is necessary because the index will get generated as a series of nir_bcsel instructions so it would no longer be an input variable. It is similar to what is done for GLSL in `ca63a5ed3e`. Although I can’t find anything explicit in the Vulkan specs to say this should be allowed, the SPIR-V spec just says “the operand interpolant must be a pointer to the Input Storage Class”, which I guess doesn’t rule out any type of pointer to an input. This was found using the spec/glsl-4.40/execution/fs-interpolateAt* Piglit tests with the ARB_gl_spirv branch. Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: update after nir_deref_instr land on master. Implemented by Alejandro Piñeiro. Special thanks to Jason Ekstrand for guidance at the new nir_deref_instr world. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 11:43:40 +02:00
Francisco Jerez	18c086a9e6	intel/ir: Uncomment definition of several unused hardware opcodes. There are a number of opcode_desc table entries for many of these unused opcodes. A symbolic opcode enum will be required in a future commit in order to keep them in the opcode description tables. The alternative would be to remove the unused opcodes from the opcode description tables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	48d6fc5eb6	intel/fs: Initialize mlen for gen7 varying pull constant load messages. This makes the message length available at the IR level, which should save some guesswork in a future commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6643143f6e	intel/eu: Assert that the instruction is send-like in brw_set_desc_ex(). Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6f81e2b994	intel/eu: Get rid of the return value of brw_send_indirect_message(). The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b3cce4c130	intel/eu: Get rid of the return value of brw_send_indirect_surface_message(). All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, this isn't necessary anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	95b5367149	intel/eu: Use descriptor constructors for dataport typed surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	94166cef40	intel/eu: Use descriptor constructors for dataport scattered byte surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	2a9605d610	intel/eu: Use descriptor constructors for dataport untyped surface messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8e707fc2af	intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message(). Instead of the current message_len, response_len and header_present arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b10b4e7c45	intel/eu: Use descriptor constructors for pixel interpolator messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8fa4bc4676	intel/eu: Use descriptor constructors for dataport write messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	2bac890bf5	intel/eu: Use descriptor constructors for dataport read messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	27c211e30f	intel/eu: Use descriptor constructors for sampler messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	1c90ae5acc	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	b382bdde1d	TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	c3793d49e4	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see `d2eecf0b0b` fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set__message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw__desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	20b962232b	intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD. Allows specifying a bitfield based on its upper and lower bounds instead of a symbolic field definition, kind of what the current GET_BITS macro is to GET_FIELD. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	d0f589a55b	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	f55884cad3	intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor. This introduces helpers that can be used to specify or extract the whole descriptor of a SEND message instruction at once. Because the instruction encoding of these is rather awkward on some generations, using the generic brw_inst.h macros doesn't seem like an option. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Jordan Justen	1c8a045bfb	i965: Support saving the gen program with glGetProgramBinary Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	eb5b4b0fd1	i965: Add flag_state param to brw_search_cache This allows brw_search_cache to be used to find programs without causing extra state to be emitted in the case where the program isn't being made active. (For example, to find the program to save out with the ARB_get_program_binary interface.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	48ce7745dc	mesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob This might be required because some stages might generate different programs depending on the other stages in the program. For example, the i965 driver's tessellation control stage depends on the tessellation evaluation shader. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	36dd15f8b3	i965: Add brw_populate_default_key We will need to populate the default key for ARB_get_program_binary to allow us to retrieve the default gen program to store in the program binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	65f2014740	i965: Replace brw_setup_tex_for_precompile brw with devinfo Trying to make sure the setup of the default program key is not dependent on the GL state. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	e426286e21	i965: Regenerate blob without gen program for shader cache Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	3a133223b3	compiler/blob: Add blob_skip_bytes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	8e7ee7433e	i965: Add support for driver cache blob containing the gen program Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	05bb4b4849	i965: Use brw_prog_key_set_id in disk cache load/store code Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	170d76de9f	i965: Add brw_prog_key_set_id helper to set the program id on any stage For saving programs (shader cache; get program binary) it is useful to set the id to 0, with the stage being a parameter. For restoring programs it is useful to set the id to the id allocated to the program at creation time. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	1c1a7d11c8	i965: Add brw_stage_cache_id to map gl stages to brw cache_ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	b9f9b35431	i965: Add brw_(read\|write)_blob_program_data functions We will want to use these for both the disk shader cache, and for the ARB_get_program_binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	1777c23abf	i965: Add brw_program_deserialize_driver_blob brw_program_deserialize_driver_blob will be a more generic form of brw_program_deserialize_nir. In addition to nir, it will also be able to extract gen binaries and upload them to the program cache. In this commit, it continues to only support nir. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	f4c154afc1	i965: Move brw_program_*serialize_nir to brw_program_binary.c This will allow get_program_binary to add the gen program into its serialization in addition to just the nir program. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	cce3994dee	mesa: Always call ProgramBinarySerializeDriverBlob The driver may prefer to have a different blob for ARB_get_program_binary compared to the version saved out for the disk shader cache. Since they both use the driver_cache_blob field, we need to always give the driver the opportunity to fill in the driver_cache_blob when saving the program binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	6497be42b7	i965: Use ShaderCacheSerializeDriverBlob driver function This function is called just before the gl_program::driver_cache_blob is saved out as part of the gl_program serialization. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	450f00e39d	st/mesa: Use ShaderCacheSerializeDriverBlob driver function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	c510dd22a9	st/mesa: Skip serializing driver_cache_blob if it exists Previously the mesa core code would not call to serialize the driver_cache_blob if it existed. We will update it to always call to serialize the driver_cache_blob meaning we should avoid re-serializing it under mesa/state_tracker. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	2a55553be3	mesa: Add disk shader cache driver blob callback Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:28 -07:00
Iago Toral Quiroga	213491600a	intel/compiler: emit actual barriers for working-group level barriers Until now we have assumed that we could skip emitting these barriers in the general case based on empirical testing and a few assumptions detailed in a comment in the driver code. However, recent CTS tests have shown that we actually need them to produce correct behavior. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 07:46:34 +02:00
Dave Airlie	0cab6e51e3	radv: add some cxxflags for new c++ file Looks like I broke intel CI compiles. Fixes: `6f3aee40f9` (radv: using tls to store llvm related info and speed up compiles (v10)) Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-07-10 10:48:03 +10:00
Jason Ekstrand	dc1d10b396	anv,radv: Add support for VK_KHR_get_display_properties2 Reviewed-by: Keith Packard <keithp@keithp.com>	2018-07-09 17:09:41 -07:00
Jason Ekstrand	c0a27c5946	intel/aubinator_error_decode: Allow for more sections Error states coming from actual Vulkan applications tend to have fairly long command buffers and lots of chained batches. 30 total BOs isn't nearly enough. This commit bumps it to 256, makes some things use the actual number of sections instead of the #define, and adds asserts if we ever go over 256 sections. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 16:40:54 -07:00
Jason Ekstrand	5009e73bb1	intel/batch_decoder: Recurse for all 2nd level batches Our attempt to restart the loop with the second level batch worked at one point but got broken at some point. It was too fragile anyway and we're not likely to have enough secondaries to actually overflow the stack so we may as well recurse in both cases. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 16:40:54 -07:00
Dave Airlie	45e25adfe8	virgl/vtest: add support to vtest for new cap getting. The vtest protocol is pretty simple but also pretty dumb, and the v1 caps query was fixed size, with no nice way to expand it. However, the server also ignores any command it doesn't understand. So we can query v2 caps by sending a v2 followed by a v1: if the v2 is ignored we know it's an old vtest server, and if we get a v2 answer then we can just read the v1 answer and discard it. Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)	2018-07-10 09:07:37 +10:00
Anuj Phogat	2badf0e85b	i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS CACHE_MODE_SS is not listed in gfxspecs table for user mode non-privileged registers. So, making any changes from Mesa will do nothing. Kernel is already setting this bit in CACHE_MODE_SS register which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 15:38:42 -07:00
Anuj Phogat	c1d8300117	anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS CACHE_MODE_SS is not listed in gfxspecs table for user mode non-privileged registers. So, making any changes from Mesa will do nothing. Kernel is already setting this bit in CACHE_MODE_SS register which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 15:38:42 -07:00
Jason Ekstrand	227dabc266	anv: Implement VK_EXT_vertex_attribute_divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	2caf6c0392	anv/pipeline: Add a per-VB instance divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	32f4feb5a0	anv/pipeline: Use a per-VB struct instead of separate arrays Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jose Maria Casanova Crespo	6db20229ab	anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+ using the VK_KHR_get_physical_device_properties2 functionality to expose if the extension is supported or not. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	0c01bf70e0	spirv/nir: Add support for SPV_KHR_8bit_storage Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	f29c19cd5c	spirv: Include headers and grammar for SPV_KHR_8bit_storage Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	cd0afab99b	i965/fs: Enable store_ssbo for 8-bit types. v2: Update comment according to this patch. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	11c904d0d3	intel/compiler: relax brw_eu_validate for byte raw movs When the destination is a BYTE type allow raw movs even if the stride is not exact multiple of destination type and exec type, execution type is Word and its size is 2. This restriction was only allowing stride==2 destinations for 8-bit types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	87fc9af3fc	i965/fs: Enable conversions to 8-bit integers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	030472c1f0	i965: Support for 8-bit base types in helper functions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	232ed89802	i965/fs: Register allocator shoudn't use grf127 for sends dest Since Gen8+ Intel PRM states that "r127 must not be used for return address when there is a src and dest overlap in send instruction." This patch implements this restriction creating new grf127_send_hack_node at the register allocator. This node has a fixed assignation to grf127. For vgrf that are used as destination of send messages we create node interferences with the grf127_send_hack_node. So the register allocator will never assign to these vgrf a register that involves grf127. If dispatch_width > 8 we don't create these interferences because all instructions have node interferences between sources and destination. That is enough to avoid the r127 restriction. This fixes CTS tests that raised this issue as they were executed as SIMD8: dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom Shader-db results on Skylake: total instructions in shared programs: 7686798 -> 7686797 (<.01%) instructions in affected programs: 301 -> 300 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 337092322 -> 337091919 (<.01%) cycles in affected programs: 22420415 -> 22420012 (<.01%) helped: 712 HURT: 588 Shader-db results on Broadwell: total instructions in shared programs: 7658574 -> 7658625 (<.01%) instructions in affected programs: 19610 -> 19661 (0.26%) helped: 3 HURT: 4 total cycles in shared programs: 340694553 -> 340676378 (<.01%) cycles in affected programs: 24724915 -> 24706740 (-0.07%) helped: 998 HURT: 916 total spills in shared programs: 4300 -> 4311 (0.26%) spills in affected programs: 333 -> 344 (3.30%) helped: 1 HURT: 3 total fills in shared programs: 5370 -> 5378 (0.15%) fills in affected programs: 274 -> 282 (2.92%) helped: 1 HURT: 3 v2: Avoid duplicating register classes without grf127. Let's use a node with a fixed assignation to grf127 and create interferences to send message vgrf destinations. (Eric Anholt) v3: Update reference to CTS VK_KHR_8bit_storage failing tests. (Jose Maria Casanova) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	0e47ecb29a	intel/compiler: grf127 can not be dest when src and dest overlap in send Implement at brw_eu_validate the restriction from Intel Broadwell PRM, vol 07, section "Instruction Set Reference", subsection "EUISA Instructions", Send Message (page 990): "r127 must not be used for return address when there is a src and dest overlap in send instruction." v2: Style fixes (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Dave Airlie	6f3aee40f9	radv: using tls to store llvm related info and speed up compiles (v10) This uses the common compiler passes abstraction to help radv avoid fixed cost compiler overheads. This uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine. This should remove all the fixed overheads setup costs of creating the pass manager each time. This takes a demo app time to compile the radv meta shaders on nocache and exit from 1.7s to 1s. It also has been reported to take the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex) v2: fix llvm6 build, inline emit function, handle multiple targets in one thread v3: rebase and port onto new structure v4: rename some vars (Bas) v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable v6: use a bit more C++ in the wrapper v7: logic bugs fixed so it actually runs again. v8: rebase on top of radeonsi changes. v9: drop some C++ headers, cleanup list entry v10: use pop_back (didn't have enough caffeine) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-10 07:58:03 +10:00
Adam Jackson	c1ec582059	swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2) Fixes 14 piglits, mostly in egl_khr_create_context. v2: Also short-circuit the same-context-no-drawables case (Eric Anholt) Fixes: https://github.com/anholt/libepoxy/issues/177 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-07-09 16:09:58 -04:00
Lionel Landwerlin	7205bdf41f	intel: tools: dump_gpu: fix ppgtt mapping We were not properly writing page tables when the virtual address range spans multiple subtrees of the tables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-09 21:08:08 +01:00
Eric Anholt	beeb94402f	v3d: Implement noperspective varyings on V3D 4.x. Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.	2018-07-09 11:48:32 -07:00
Eric Anholt	4b4795be9d	v3d: Refactor flat shade/centroid flag emission. The logic was duplicated in a pretty gross way, when what we really need is just a helper function for stuffing the values in the packet. This will make implementing noperspective easier.	2018-07-09 11:48:32 -07:00
Eric Anholt	93f437d128	v3d: Fix typo in dither mode offset. We weren't using the field yet, so it didn't affect anything. Fixes: `c0476d964a` ("v3d: Express dithering mode in the same way that the CLIF parser does.")	2018-07-09 11:48:32 -07:00
zhaowei yuan	73ec437627	glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2 "sampler2DRect" and "sampler2DRectShadow" are specified as reserved from GLSL 1.1 and GLSL ES 1.0 Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906 Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: `34f7e761bc` ("glsl/parser: Track built-in types using the glsl_type directly")	2018-07-09 11:37:08 -07:00
Charmaine Lee	097952abaa	st/wgl: check for NULL piAttribList in wglCreatePbufferARB() Java2d opengl pipeline passes NULL piAttribList to wglCreatePbufferARB(). So skip parsing the attribute list if it is NULL. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-06 17:32:49 -07:00
Jason Ekstrand	a695de5845	anv: Add support for VK_KHR_create_renderpass2 The implementation of CreateRenderPass2 uses the helpers we broke out in previous commits. The implementations of the new vkCmd functions just call the old versions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	208be8eafa	anv: Make subpass::depth_stencil_attachment a pointer This makes certain checks a bit easier and means that we don't have the attachment information duplicated in the attachment list and in depth_stencil_attachment. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	75e308fc44	anv/pass: Move implicit dependency setup to anv_render_pass_compile Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	144626946e	anv/pass: Move some dependency setup into a helper This new helper takes a VkSubpassDependency2KHR for future-proofing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	6f9485d21f	anv/pass: Move a bunch of analysis into a separate "compile" stage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	55285b8404	anv/pass: Use a designated initailizer for attachments Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	6c746e8fea	anv: Bump the advertised patch version to 80 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Adam Jackson	d257ec0136	glx: Don't allow glXMakeContextCurrent() with only one valid drawable Drawable and readable need to either both be None or both be non-None. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-09 12:03:18 -04:00
Erik Faye-Lund	af6b7bf236	mesa: verify MaxVertexAttribStride for GLES 3.1 The OpenGL ES 3.1 specification, Table 20.41 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL ES 3.1 on implementations where this isn't the case. Let's add a check for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Erik Faye-Lund	2e64a2f2d1	mesa: verify MaxVertexAttribStride for GL 4.4 The OpenGL 4.4 specification, Table 23.55 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL 4.4 on implementations where this isn't the case. Let's add a check for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Erik Faye-Lund	747cf468ff	r600: report incorrect max-vertex-attrib for GL 4.4 OpenGL 4.4 requires a max vertex attrib stride of 2048 or higher, but r600 only supports 2047. Technically, this makes it a GL 4.3 GPU, but it's currently exposing GL 4.4. To avoid regressing the GL version supported in the following patches, let's just lie and pretend that we support 2048. Any applications using 2048 are already broken anyway. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Jose Maria Casanova Crespo	6706b421f0	intel/fs: use uint type for per_slot_offset at GS This helps us to compact original instruction: mul(8) g3<1>D g6<8,8,1>UD 0x00000006UD { align1 1Q }; So now we emit: mul(8) g3<1>UD g6<8,8,1>UD 0x00000006UD { align1 1Q compacted }; Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-09 15:28:48 +02:00
Samuel Pitoiset	e8f82b33fb	radv: add the trace BO to the list when starting a new cmdbuf That might reduce CPU overhead a little bit when using RADV_TRACE_FILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-09 13:57:01 +02:00
Samuel Pitoiset	5e5a28d52a	radv: reduce CPU overhead in radv_flush_descriptors() The number of enabled descriptors for a given pipeline stage can be computed at compile time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-09 13:56:58 +02:00
Iago Toral Quiroga	81ca08e030	intel/compiler: remove unused function Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 13:21:48 +02:00
Iago Toral Quiroga	449c22004c	anv/pipeline: honor the pipeline_cache_enabled run-time flag v2: merge both conditions to reduce the diff (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 08:40:26 +02:00
Roland Scheidegger	817efd8968	r600/sb: fix crash in fold_alu_op3 fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was needed to show the issue). An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced and return always TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference). https://bugs.freedesktop.org/show_bug.cgi?id=106928 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-09 07:17:29 +01:00
Jason Ekstrand	7c92c7d151	vulkan: Update the XML and headers to 1.1.80 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-08 21:39:18 -07:00
Lionel Landwerlin	420bf14e12	i965: fix clear color bo address relocation Fixes: `7987d041fd` ("i965/surface_state: Emit the clear color address instead of value.") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-07 20:54:55 +01:00
Mauro Rossi	1a1f2b134c	radv: winsys/amdgpu: include missing pthread.h header pthread types are used in some files without explicitly including pthread.h. This leads to compile errors on Android 7.x nougat-x86 e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31: In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32: external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t' pthread_mutex_t global_bo_list_lock; ^ 1 error generated. Including pthread.h explicitly fixes the build error Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-07 20:53:59 +02:00
Karol Herbst	de13978733	nv50/ir: fix Instruction::isActionEqual for PHI instructions phi instructions don't have the same results by simply having the same sources. They need to be inside the same BasicBlock or share an equal condition resulting in a path through the shader selecting equal sources as well. short example: cond = ...; const0 = 0; const1 = 1; if (cond) { ssa_1 = const0; } else { ssa_2 = const1; } ssa_3 = phi ssa_1 ssa_2; if (!cond) { ssa_4 = const0; } else { ssa_5 = const1; } ssa_6 = phi ssa_4 ssa_5; although both phis actually have sources with equal results, merging them would be wrong due to having a different condition selecting which source to take. For now we also stick an assert into GlobalCSE, because it should never end up having to merge phi instructions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-07-07 20:32:33 +02:00
Rhys Perry	f2cc694d8e	nvc0/ir: use the combined tid special register total instructions in shared programs : 5804448 -> 5804690 (0.00%) total gprs used in shared programs : 670065 -> 670065 (0.00%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21068 (0.00%) local shared gpr inst bytes helped 0 0 0 5 5 hurt 0 0 0 191 191 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-07 20:31:56 +02:00
Jason Ekstrand	6e88561156	nir/print: Print texture and sampler indices Commit 5fb69daa6076e56b deleted support from nir_print for printing the texture and sampler indices on texture instructions. This commit just brings it back as best as we can. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-07 09:32:33 -07:00
Ian Romanick	f8e54d02f7	intel/compiler: Relax mixed type restriction for saturating immediates At the time of commit `7bc6e455e2` (i965: Add support for saturating immediates.) we thought mixed type saturates would be impossible. We were only thinking about type converting moves from D to F, for example. However, type converting moves w/saturate from F to DF are definitely possible. This change minimally relaxes the restriction to allow cases that I have been able to trigger via piglit tests. Fixes new piglit tests: - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:10 -07:00
Ian Romanick	9626ea497d	i965/vec4: Properly handle sign(-abs(x)) This is achieved by copying the sign(abs(x)) optimization from the FS backend. On Gen7 and earlier platforms, this fixes new piglit tests: - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:07 -07:00
Ian Romanick	88bd37c010	i965/fs: Properly handle sign(-abs(x)) Fixes new piglit tests: - glsl-1.10/execution/fs-sign-neg-abs.shader_test - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:04 -07:00
Lionel Landwerlin	c05c8d65ba	vulkan: utils: handle hexadecimal values in registry Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-06 22:12:00 +01:00
Marek Olšák	0eaf069679	st/dri: fix a crash in server_wait_sync Ported from i965 including the comment. This fixes: dEQP-EGL.functional.reusable_sync.valid.wait_server Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-07-06 16:23:37 -04:00
Mathieu Bridon	b39bdb0716	python: Stop using the Python 2 exception syntax We could have made this compatible with Python 3 by using: except Exception as e: But since none of this code actually uses the exception objects, let's just drop them entirely. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-06 10:18:43 -07:00
Mathieu Bridon	e5a8d51e54	python: Use spaces, not tabs Python 3 doesn't allow mixing spaces and tabs in a script, unlike Python 2. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-06 10:04:55 -07:00
Mathieu Bridon	0f7b18fa0d	python: Use the print function In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-06 10:04:22 -07:00
Jon Turney	b3a42fa066	vma/tests: Fix compilation if limits.h defines PAGE_SIZE (v2) per POSIX, limits.h may define PAGE_SIZE when the value is not indeterminate v2: just change the variable name, since there's no intended correlation here between this value and the machine's actual page size. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-07-06 14:01:08 +01:00
Samuel Pitoiset	85865dbe0d	radv: fix emitting the view index on GFX9 For merged shaders, VS as HS for example. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-06 10:22:53 +02:00
Ian Romanick	965a06dbd7	i965/vec4: Make the vec4_visitor::nir_emit_instr default case unreachable The bug fixed by the previous commit went undetected because extra stderr messages are not flagged by the CI. Copy the solution from fs_visitor::nir_emit_instr and mark the default case unreachable. An alternate solution is to delete the default case so that the compiler will issue a warning. That may require more work since there are other (impossible) cases that exist. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:32 -07:00
Ian Romanick	a4d4787327	intel/compiler: More DCE after lowering Some of the lowering passes, nir_lower_locals_to_regs for example, can cause some previously live code to be dead. This pass in particular leaves a bunch of nir_instr_type_deref instructions floating around. This causes shader-db runs on Gen5 through Haswell to spew tons of messages like: VS instruction not yet implemented by NIR->vec4 UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that generates these messages. Cleaning up the dead code fixes that. To verify, I did a shader-db before and after. Even though all the messages are gone, the results make my brain hurt. :( Haswell total cycles in shared programs: 411890163 -> 411891145 (<.01%) cycles in affected programs: 57016 -> 57998 (1.72%) helped: 3 HURT: 11 helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134 helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96% HURT stats (abs) min: 18 max: 686 x̄: 115.64 x̃: 20 HURT stats (rel) min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93% 95% mean confidence interval for cycles value: -51.39 191.67 95% mean confidence interval for cycles %-change: -0.14% 2.46% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 259114802 -> 259115032 (<.01%) cycles in affected programs: 24034 -> 24264 (0.96%) helped: 1 HURT: 9 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% HURT stats (abs) min: 18 max: 48 x̄: 25.78 x̃: 20 HURT stats (rel) min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80% 95% mean confidence interval for cycles value: 12.42 33.58 95% mean confidence interval for cycles %-change: 0.54% 1.38% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `5a02ffb733` nir: Rework lower_locals_to_regs to use deref instructions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:21 -07:00
Eric Anholt	9d0406c52f	v3d: Fix leak of the default attributes BOs. The GLES3 CTS makes a lot more progress on a run now.	2018-07-05 15:50:54 -07:00
Eric Anholt	6b11131373	v3d: Fix leak of the spill BO on context destruction.	2018-07-05 15:50:52 -07:00
Eric Anholt	4b2ba18ff3	nir: Apply fragment color clamping to gl_FragData[] as well. From the ARB_color_buffer_float spec: 35. Should the clamping of fragment shader output gl_FragData[n] be controlled by the fragment color clamp. RESOLVED: Since the destination of the FragData is a color buffer, the fragment color clamp control should apply. Fixes arb_color_buffer_float-mrt mixed on v3d. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-05 12:39:36 -07:00
Eric Anholt	03f6d26b62	v3d: Skip emitting per-RT blend state for RTs with blend disabled. Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on more complicated cases by noticing if multiple RTs have the same blend state and emitting them in a single packet.	2018-07-05 12:39:36 -07:00
Eric Anholt	572f6ab489	v3d: Add proper support for GL_EXT_draw_buffers2's blending enables. I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.	2018-07-05 12:39:36 -07:00
Eric Anholt	5601ab3981	v3d: Add support for GL_SAMPLE_ALPHA_TO_ONE. Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one	2018-07-05 12:39:36 -07:00
Eric Anholt	7b63371420	v3d: Respect swap_color_rb for the f32_color_rb case. We don't actually set the two flags together, but I want to use the r/g/b/a reordered fields in the next commit.	2018-07-05 12:39:36 -07:00
Eric Anholt	dbd52585fa	st/nir: Disable varying packing when doing transform feedback. The varying packing would result in st_nir_assign_var_locations() picking new driver_locations, despite the pipe_stream_output already being set up for the old driver location. This left the gallium driver with no way to work back to what varying was referenced by pipe_stream_output. Fixes these tests on V3D: dEQP-GLES3.functional.transform_feedback.random.separate.points.3 dEQP-GLES3.functional.transform_feedback.random.separate.points.7 dEQP-GLES3.functional.transform_feedback.random.separate.points.9 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.3 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.8 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-05 12:38:27 -07:00
Jon Turney	ab7aa0f10c	meson: Set with_dri from with_gallium when DRI glx is explicitly configured Set with_dri from with_gallium when DRI GLX is explicitly configured, as well as when DRI GLX is chosen automatically. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-05 17:48:35 +01:00
Samuel Pitoiset	72fd93370f	radv/winsys: make use of radeon_emit() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:23:25 +02:00
Samuel Pitoiset	f2a310849e	radv: only flush CB meta in pipeline image barriers when needed If the given image doesn't enable CMASK, FMASK or DCC, it's useless to flush CB metadata. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:20:16 +02:00
Samuel Pitoiset	17bb4c2cf5	radv: only flush DB meta in pipeline image barriers when needed If the given image doesn't have HTILE, it's useless to flush DB metadata. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:20:12 +02:00
Samuel Pitoiset	2a3e9c89ff	radv: fix "error: initializer element is not constant" build error GCC 4.8 fails to compile with "static const", while GCC 8.1 fails to compile with only "static". Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 17:12:02 +02:00
Lionel Landwerlin	78d5c1c82a	util: u_queue: fix android build error mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] Fixes: `b238e33bc9` "kutil/queue: add a process name into a thread name" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 15:42:26 +01:00
Benedikt Schemmer	93a5c9bc99	Util: fix msvc build The MSVC preprocessor doesn't understand #warning Fixes: `2e1e6511f7` ("util: extract get_process_name from xmlconfig.c") Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-05 14:24:08 +01:00
Mathieu Bridon	f9b6dfd919	python: Specify the JSON separators On Python 2, the default JSON separators are ', ' for items and ': ' for dicts. On Python 3, the default is the same when no indent is specified, but if one is (and we do specify one) then the default items separator becomes ',' (the dict separator remains unchanged). This change explicitly specifies the Python 3 default, which helps ensuring that the output is identical, whether it was generated by Python 2 or 3. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 12:52:38 +01:00
Mathieu Bridon	fe8a153648	python: Stabilize some script outputs In Python, dictionaries and sets are unordered, and as a result there is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 12:52:12 +01:00
Lionel Landwerlin	d337713ec4	intel: tools: remove drm-uapi defines We already embed the headers, no need to redefine defines/structs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	87915baa23	intel: intel_dump_gpu: use simulator id in captures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	aab21cedc6	intel: devinfo: add simulator id Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	0f53948c59	intel: tools: dump-gpu: dump 48-bit addresses For gen8+, write out PPGTT tables in aub files so that full 48-bit addresses can be serialized. v2: Fix handling of `end` index in map_ppgtt v3: Correctly mark GGTT entry as present (Rafael) Signed-off-by: Scott D Phillips <scott.d.phillips@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	6e37b949d5	intel: tools: import intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	fa00b9c1c9	intel: tools: update intel_aub.h Scott added new stuff in IGT. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	5ffa35b64d	intel: batch-decoder: add missing return line Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	28476c9d81	intel: batch-decoder: don't ask for constant BO until decoding With PPGTT mappings, our aubinator implementation can be quite slow if we request a buffer that doesn't exist. Instead of doing a PPGTT walk for invalid addresses (0 lengths), wait until we're sure we want to decode the data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	c262ec19d0	intel/batch-decoder: handle non-contiguous binding table / surface state Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	3ebee627cb	intel/tools/aubinator: aubinate ppgtt aubs v2: by Lionel Fix memfd_create compilation issue Fix pml4 address stored on 32 instead of 64bits Return no buffer if first ppgtt page is not mapped v3: Drop additional memfd_create() (Rafael) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	3228335b55	intel: aubinator: handle GGTT mappings We use memfd to store physical pages as they get read/written to and the GGTT entries translating virtual address to physical pages. Based on a commit by Scott Phillips. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Jason Ekstrand	2602ea89d5	util: rb-tree: A simple, invasive, red-black tree This is a simple, invasive, liberally licensed red-black tree implementation. It's an invasive data structure similar to the Linux kernel linked-list where the intention is that you embed a rb_node struct in the data structure you intend to put into the tree. The implementation is mostly based on the one in "Introduction to Algorithms", third edition, by Cormen, Leiserson, Rivest, and Stein. There were a few other key design points: * It's an invasive data structure similar to the [Linux kernel linked list]. * It uses NULL for leaves instead of a sentinel. This means a few algorithms differ a small bit from the ones in "Introduction to Algorithms". * All search operations are inlined so that the compiler can optimize away the function pointer call. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	144b40db54	intel: aubinator: drop the 1Tb GTT mapping Now that we're softpinning the address of our BOs in anv & i965, the addresses selected start at the top of the addressing space. This is a problem for the current implementation of aubinator which uses only a 40bit mmapped address space. This change keeps track of all the memory writes from the aub file and fetches them on request by the batch decoder. As a result we can get rid of the 1<<40 mmapped address space and only rely on the mmap aub file \o/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	9d08ef6335	intel: aubinator: rework register writes handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	86cb05a6d3	intel: aubinator: remove standard input processing option On a follow up commit in this series, we stop copying the data from the mmap'ed file into our big gtt mmap, and start referencing data in it directly. So reallocating the read buffer and adding more data from stdin wouldn't work. For that reason, let's stop supporting stdin processing. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	08d85a8301	intel: aubinator: remove unused variables These memory offsets are stored in the gen_batch_decode_ctx. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Mathieu Bridon	3153bcc73e	gallium/auxiliary: Fix string matching Commit `f69bc797e1` did the following: - if format.layout in ('bptc', 'astc'): + if format.layout in ('astc'): The intention was to go from matching either 'bptc' or 'astc' to matching only 'astc'. But the new code doesn't respect this intention any more, because in Python `('astc')` is not a tuple containing a string, it is just the string. (the parentheses are simply ignored) That means we now match any substring of 'astc', for example 'a'. This commit fixes the test to respect the original intention. Fixes: `f69bc797e1` "gallium/auxiliary: Add helper support for bptc format compress/decompress" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 11:48:47 +01:00
Samuel Pitoiset	8339ba827b	radv: optimize vkCmd{Set,Reset}Event() a little bit Always emitting a bottom-of-pipe event is quite dumb. Instead, start to optimize these functions by syncing PFP for the top-of-pipe and syncing ME for the post-index-fetch event. This can still be improved by emitting EOS events for syncing PS and CS stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 11:31:06 +02:00
Samuel Pitoiset	f635109140	radv: optimize radv_CmdWaitEvents() This introduces radv_barrier() (same as the draw/dispatch codepath). This helper is used for merging the code from CmdWaitEvents() and CmdPipelineBarrier because it's quite similar. We do ignore the source stage mask for CmdWaitEvents because it's irrelevant when event objects are used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 11:31:03 +02:00
Roland Scheidegger	620626a371	nir/linker: fix msvc build Empty initializer braces aren't valid C (it's a GNU extension, and it's valid in C++). Hopefully fixes appveyor / msvc build... Fixes `6677e131b8` Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-05 09:27:05 +02:00
Gert Wollny	806a42fc47	r600: compare structure elements instead of doing a memcmp Structures might be padded by the compiler and these padding bytes remain un-initialized which in turn makes memcmp return a difference where from the logical point of view there is none. Fixes valgrind: Conditional jump or move depends on uninitialised value(s) at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099) by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573) by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129) by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153) by 0xB3B92CB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3948FD: st_update_array (st_atom_array.c:388) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:59:07 +02:00
Gert Wollny	9c1ae6a1a1	r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats Below tests would fail with an error message "Vertex format (R4G4B4A4|R5G5B5A1) not supported." Add the formats to the translation routine to enable these formats. Fixes: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:57:28 +02:00
Gert Wollny	5278436d67	r600: force LOD range to be only one value when mip.min filter is NONE For a texture that has only one LOD defined, but for which GL_TEXTURE_MAX_LEVEL is the default (1000) and GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD the reading from the texture does not properly resolve the LOD level and texture lookup might fail. Hence, when no mipmap filter is given (indicating that no mip-mapping takes place), force the LOD range to contain only one value. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texture.(i|u)sampler2d dEQP-GLES3.functional.texture.format.sized.cube.rgb* out of VK_GL_CTS/android/cts/master/gles3-master.txt Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:57:28 +02:00
Gert Wollny	e7dd1a84a0	mesa/st: draw_vbo: initialize restart_index too restart_index is later always used in a comparison, so it should be initialized properly. Fixes valgrind warning: Conditional jump or move depends on uninitialised value(s) at 0xB8D682F: r600_draw_vbo (r600_state_common.c:2153) by 0xB71F743: u_vbuf_draw_vbo (u_vbuf.c:1156) by 0xB3B92DB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3B90B0: st_draw_vbo (st_draw.c:143) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-05 07:57:16 +02:00
Timothy Arceri	0cb6537dee	mesa: enable ARB_direct_state_access in OpenGL 4.5 compat profile It's unlikely anyone will add proper ARB_direct_state_access compat support before we branch 18.2. Enabling the extension in 4.5 at least allows users to make use of MESA_GL_VERSION_OVERRIDE=4.5COMPAT for games like No Mans Sky. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-05 13:15:34 +10:00
Timothy Arceri	39063334d3	util/drirc: turn on force_glsl_extensions_warn for No Mans Sky The game forgets to enable multiple extensions in its shaders; one of those extensions is EXT_texture_array. But enabling this config entry fixes at least one other rendering issue that enabling EXT_texture_array on its own doesn't fix. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-05 13:05:47 +10:00
Marek Olšák	9b4c4fe334	util/queue: remove leftover debug code	2018-07-04 22:19:47 -04:00
Marek Olšák	7fab8a4b37	Shorten u_queue names There is a 15-character limit for thread names shared by the queue name and process name. Shorten the thread name to make space for the process name. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-04 22:03:35 -04:00
Marek Olšák	b238e33bc9	kutil/queue: add a process name into a thread name v2: simplifications Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)	2018-07-04 21:54:39 -04:00
Marek Olšák	7149bffe66	gallium/os: use util_get_process_name when possible Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-04 21:16:57 -04:00
Marek Olšák	2e1e6511f7	util: extract get_process_name from xmlconfig.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-04 21:16:03 -04:00
Marek Olšák	4695984dbc	ac: fold LLVMContext creation into ac_llvm_context_init Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	f5cb4194c9	radeonsi: reorder code in si_llvm_context_init Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	ff330055e9	radeonsi: use ac_compile_module_to_binary to reduce compile times Compile times of simple shaders are reduced by ~20%. Compile times of prologs and epilogs are reduced by up to 40%. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	0075e5fed8	ac: add reusable helpers for direct LLVM compilation This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked. struct ac_compiler_passes (opaque type) contains the main pass manager. ac_create_llvm_passes -- the result can go to thread local storage ac_destroy_llvm_passes -- can be called by a destructor in TLS ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary The motivation is to do the expensive call addPassesToEmitFile once per context or thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Rhys Perry	c2ae9b4052	nvc0: implement multisampled images on Maxwell+ Changes in v2: - make loadSuInfo32() protected without making the rest protected - move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating NVC0_SU_INFO_MS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-04 16:04:23 +02:00
Neil Roberts	2d5ddbe960	i965: Fix output register sizes when variable ranges are interleaved In `6f5abf3146` this code was fixed to calculate the maximum size of an attribute in a separate pass and then allocate the registers to that size. However this wasn’t taking into account ranges that overlap but don’t have the same starting location. For example: layout(location = 0, component = 0) out float a[4]; layout(location = 2, component = 1) out float b[4]; Previously, if ‘a’ was processed first then it would allocate a register of size 4 for location 0 and it wouldn’t allocate another register for location 2 because it would already be covered by the range of 0. Then if something tries to write to b[2] it would try to write past the end of the register allocated for ‘a’ and it would hit an assert. This patch changes it to scan for any overlapping ranges that start within each range to calculate the maximum extent and allocate that instead. Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `6f5abf3146` "i965: Fix output register sizes when multiple variables share a slot."	2018-07-04 10:57:51 +02:00
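The overlap scan described in `2d5ddbe960` can be illustrated with a small sketch. This is not the i965 code: `struct slot_range` and `max_extent` are hypothetical names, and each array element is assumed to occupy one location slot, as in the `a[4]` / `b[4]` example from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of an output variable: it occupies the
 * location slots [start, start + length). */
struct slot_range {
    unsigned start;
    unsigned length;
};

/* Return how many slots must be allocated at ranges[i].start so that
 * every range starting inside the (growing) extent also fits. */
static unsigned max_extent(const struct slot_range *ranges, size_t count,
                           size_t i)
{
    assert(i < count);

    unsigned start = ranges[i].start;
    unsigned end = start + ranges[i].length;
    int grew;

    /* Repeat until no overlapping range pushes the end further out. */
    do {
        grew = 0;
        for (size_t j = 0; j < count; j++) {
            unsigned s = ranges[j].start;
            unsigned e = s + ranges[j].length;

            if (s >= start && s < end && e > end) {
                end = e;
                grew = 1;
            }
        }
    } while (grew);

    return end - start;
}
```

With `a` at location 0 and `b` at location 2, both of length 4, the extent computed from location 0 is 6 slots rather than 4, so a write to `b[2]` stays inside the allocation.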
Dave Airlie	8c51caab24	r600/sb: cleanup if_conversion iterator to be legal C++ The current code causes: /usr/include/c++/8/debug/safe_iterator.h:207: Error: attempt to copy from a singular iterator. This is due to the iterators getting invalidated, fix the reverse iterator to use the return value from erase, and cast it properly. (used Mathias suggestion) Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-07-04 07:42:22 +01:00
Marek Olšák	45f9d58668	radeonsi: fix compiler breakage Broken by `d853d3a59b`.	2018-07-04 00:13:38 -04:00
Dave Airlie	5b32b246cf	ac: make some fns static Some of the compiler functions are no longer called outside the util file. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:26 +10:00
Dave Airlie	7398913a62	ac/radv: move llvm compiler info to struct and init in one place This ports radv to the shared code, however due to a bug in LLVM version prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:16 +10:00
Dave Airlie	d853d3a59b	ac/radeonsi: port compiler init/destroy out of radeonsi. We want to share this code with radv in the future, so port it out of radeonsi. Add a return value as radv will want that to know if this succeeds Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:03 +10:00
Dave Airlie	35c82af539	radv/radeonsi: add a check ir tm options This doesn't do much yet, but it makes it easier to move the code to a common shared code base. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:32:35 +10:00
Dave Airlie	0eb65b4944	radeonsi: rename si_compiler -> ac_llvm_compiler As precursor to moving init to common code, just rename the struct and move it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:32 +10:00
Dave Airlie	887ba45c93	ac: add target library info helpers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:29 +10:00
Dave Airlie	e1387eaf12	radv: create/destroy passmgr at the higher level. This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:05 +10:00
Dave Airlie	97d9b88447	radv: port to use common passmgr code. This adds an inline always pass, but otherwise should work the same. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-04 05:30:34 +10:00
Dave Airlie	584ad1eda9	ac/radeonsi: refactor out pass manager init to common code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:18:01 +10:00
Dave Airlie	f2b3e96e75	radv: drop copy of ac_create_target_machine. Once we split the init once stuff out, this can be shared again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:35 +10:00
Dave Airlie	473be16c74	ac/radv: split the non-common init_once code from the common target code. (v2) This just splits out the non-shared code and reuses ac_get_llvm_target in radv. v2: rebase on Marek's patch - fixup brace position/whitespace Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:23 +10:00
Neil Roberts	590cc7c8f6	i965: Use the new nir atomic counter linker for SPIR-V shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c13f8ea8ac	i965: enable AtomicStorage capability for gen7+ That is the same gen requirement for ARB_shader_atomic_counters. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	7600678216	mesa/glspirv: lower workgroup access to offsets This will perform the CS shared lowering. See `8761a04d0d` Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	fbcebfc5bf	nir: Fix OpAtomicCounterIDecrement for uniform atomic counters From the SPIR-V 1.0 specification, section 3.32.18, "Atomic Instructions": "OpAtomicIDecrement: <skip> The instruction's result is the Original Value." However, we were implementing it, for uniform atomic counters, as a pre-decrement operation, as was the one available from GLSL. Renamed the former nir intrinsic 'atomic_counter_dec' to 'atomic_counter_pre_dec' for clarification purposes, as it implements a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec, section 8.10, "Atomic Counter Functions": "uint atomicCounterDecrement (atomic_uint c) Atomically 1. decrements the counter for c, and 2. returns the value resulting from the decrement operation. These two steps are done atomically with respect to the atomic counter functions in this table." Added a new nir intrinsic 'atomic_counter_post_dec' which implements a post-decrement operation as required by SPIR-V. v2: (Timothy Arceri) Add extra spec quotes on commit message * Use "post" instead "pos" to avoid confusion with "position" Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
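The difference between the two decrement flavours named in `fbcebfc5bf` can be sketched with C11 atomics. This is an illustration of the semantics quoted from the GLSL and SPIR-V specs, not NIR code; the function names `counter_pre_dec` and `counter_post_dec` are made up to mirror the intrinsic names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* GLSL atomicCounterDecrement(): decrement the counter and return the
 * value resulting from the decrement (pre-decrement semantics). */
static uint32_t counter_pre_dec(_Atomic uint32_t *c)
{
    assert(c != NULL);
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(c, 1) - 1;
}

/* SPIR-V OpAtomicIDecrement: decrement the counter but return the
 * original value (post-decrement semantics). */
static uint32_t counter_post_dec(_Atomic uint32_t *c)
{
    assert(c != NULL);
    return atomic_fetch_sub(c, 1);
}
```

Starting from a counter holding 5, both functions leave 4 in the counter; the first returns 4 and the second returns 5.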
Neil Roberts	6677e131b8	nir/linker: Add a pure NIR implementation of the atomic counter linker This is mostly just a straight-forward conversion of link_assign_atomic_counter_resources to C directly using nir variables instead of GLSL IR variables. It is based on the version of link_assign_atomic_counter_resources in `6b8909f2d1`. I’m noting this here to make it easier to track changes and keep the NIR version up-to-date. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Neil Roberts	1fb9984d7e	nir/types: Add wrappers for a couple of atomic counter methods Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	54d7fca077	spirv/nir: add capability check for SpvCapabilityAtomicStorage Capability that informs if atomic counters are supported. From SPIR-V 1.0 spec, section 3.7, "Storage Class", item 10 from table: (Column "Storage Class"): "AtomicCounter For holding atomic counters. Visible across all functions of the current invocation. Atomic counter-specific memory." (Column "Required Capability"): "AtomicStorage" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	12301766de	spirv/nir: add atomic counter support on vtn_handle_ssbo_or_shared_atomic So renamed to a more general vtn_handle_atomics Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c3eb0ba0ff	spirv/nir: initialize offset on the nir var at vtn_create_variable This is convenient when dealing with atomic counter uniforms. The alternative would be doing that at vtn_handle_atomics. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	4110bc4c17	nir/spirv: Fix atomic counter (multidimensional-)arrays When constructing NIR if we have a SPIR-V uint variable and the storage class is SpvStorageClassAtomicCounter, we store as NIR's glsl_type an atomic_uint to reflect the fact that the variable is an atomic counter. However, we were tweaking the type only for atomic_uint scalars, we have to do it as well for atomic_uint arrays and atomic_uint arrays of arrays of any depth. Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: update after deref patches got pushed (Alejandro Piñeiro) v3: simplify repair_atomic_type (suggested by Timothy Arceri, included on the patch by Alejandro) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	480d2c56b3	spirv/nir: tweak nir type when storage class is SpvStorageClassAtomicCounter GLSL types differentiate uint from atomic uint. On SPIR-V the type is uint, and the variable has a specific storage class. So we need to tweak the type based on the storage class. Ideally we would like to get the proper type at vtn_handle_type, but we don't have the storage class at that moment. We tweak only the nir type, as it is the one that really requires it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	88d3325a44	nir_types: add glsl_atomic_uint_type() helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c6230b9358	spirv/nir: add offset at vtn_variable Also initialize it on var_decoration_cb This is equivalent to nir_variable.offset, used to store the location an atomic counter is stored at. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	768c275deb	spirv/nir: SpvStorageClassAtomicCounter support on vtn_storage_class_to_mode Atomic Counters are uniforms per spec. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	a9e6298727	nir/linker: handle uniforms without explicit location ARB_gl_spirv points out that uniforms in general need an explicit location. But there are still some cases of uniforms without a location, for example uniform atomic counters. Those don't have a location from the OpenGL point of view (they are identified with a binding and offset), but Mesa internally assigns them a location. Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> v2: squash with another patch, minor variable name tweak (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	b0712df6cf	compiler/glsl: refactor empty_uniform_block utilities to linker_util This includes: * Move the definition of empty_uniform_block to linker_util.h * Move find_empty_block (with a rename) to linker_util.h * Refactor some code at linker.cpp to a new method at linker_util.h (link_util_update_empty_uniform_locations) So all that code could be used by the GLSL linker and the NIR linker used for ARB_gl_spirv. v2: include just "ir_uniform.h" (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Ian Romanick	995d993710	i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible Otherwise we can incorrectly cmod propagate in situations like add(8) g10<1>.xD g2<0>.xD -16D ... cmp.ge.f0(8) null<1>D g2<0>.xD 16D ... (+f0) sel(8) g21<1>.xyUD g14<4>.xyyyUD g18<4>.xyyyUD Sadly, this change hurts quite a few shaders. v2: Refactor writemask compatibility check into a separate function. Suggested by Caio. Ivy Bridge and Haswell had similar results. (Haswell shown) total instructions in shared programs: 12968489 -> 12968738 (<.01%) instructions in affected programs: 60679 -> 60928 (0.41%) helped: 0 HURT: 249 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.44% 0.48% Instructions are HURT. total cycles in shared programs: 409171965 -> 409172317 (<.01%) cycles in affected programs: 260056 -> 260408 (0.14%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.16% 0.18% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10423577 -> 10423753 (<.01%) instructions in affected programs: 40667 -> 40843 (0.43%) helped: 0 HURT: 176 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.46% 0.51% Instructions are HURT. total cycles in shared programs: 146097503 -> 146097855 (<.01%) cycles in affected programs: 503990 -> 504342 (0.07%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.11% 0.13% Cycles are HURT. No changes on any other platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `cd635d149b` i965/vec4: Propagate conditional modifiers from compares to adds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 19:19:16 -07:00
Ian Romanick	fb6dc8e894	intel/compiler: Silence unused parameter warnings brw_nir.c src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’: src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ [-Wunused-parameter] bool is_scalar) ^~~~~~~~~ src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’: src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ [-Wunused-parameter] lower_bit_size_callback(const nir_alu_instr *alu, void *data) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 16:17:19 -07:00
Kenneth Graunke	8e38947f6c	i965: Fix BRW_NEW_NUM_SAMPLES to be in .brw, not .mesa This is the wrong kind of dirty bit. Caught by GCC warnings, due to 64-bit values being truncated to 32 bits. Fixes: `b95b0e2918` (intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-02 15:30:21 -07:00
Jason Ekstrand	afa8f58921	anv: Add support for the on-disk shader cache The Vulkan API provides a mechanism for applications to cache their own shaders and manage on-disk pipeline caching themselves. Generally, this is what I would recommend to application developers and I've resisted implementing driver-side transparent caching in the Vulkan driver for a long time. However, not all applications do this and, for some use-cases, it's just not practical. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 14:52:05 -07:00
Jason Ekstrand	e0f7a3aa5b	anv/pipeline_cache: Add a _locked suffix to a function Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	f5c38f4a30	anv: Add device-level helpers for searching for and uploading kernels Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	eae192bf5f	anv/pipeline: Stop optimizing for not having a cache Before, we were only hashing the shader if we had a shader cache to cache things in. This means that if we ever get it wrong, we could end up trying to cache a shader with an undefined hash. Since not having a shader cache is an extremely uncommon case, let's optimize for code clarity and obvious correctness over avoiding a hash operation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	76fdc8a85c	anv: Use a default pipeline cache if none is specified If a client is dumb enough to not specify a pipeline cache, give it a default. We have to create one anyway for blorp so we may as well let the client cache shaders in it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	d1c778b362	anv: Be more careful about hashing pipeline layouts Previously, we just hashed the entire descriptor set layout verbatim. This meant that a bunch of extra stuff such as pointers and reference counts made its way into the cache. It also meant that we weren't properly hashing in the Y'CbCr conversion information from bound immutable samplers. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	06412bfc98	anv,intel: Enable nir_opt_large_constants for Vulkan According to RenderDoc, this shaves 99.6% of the run time off of the ambient occlusion pass in Skyrim Special Edition when running under DXVK and shaves 92% off the runtime for a reasonably representative frame. When running the actual game, Skyrim goes from being a slide-show to a very stable and playable framerate on my SKL GT4e machine. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:50 -07:00
Jason Ekstrand	70ce880434	anv: Add state setup support for shader constants Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:49 -07:00
Jason Ekstrand	3a5ed18c51	anv: Add support for shader constant data to the pipeline cache Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:47 -07:00
Jason Ekstrand	1235850522	nir: Add a large constants optimization pass This pass searches for reasonably large local variables which can be statically proven to be constant and moves them into shader constant data. This is especially useful when large tables are baked into the shader source code because they can be moved into a UBO by the driver to reduce register pressure and make indirect access cheaper. v2 (Jason Ekstrand): - Use a size/align function to ensure we get the right alignments - Use the newly added deref offset helpers Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:45 -07:00
Jason Ekstrand	c90f221e0a	nir: Add a concept of constant data associated with a shader This commit adds a concept to NIR of having a blob of constant data associated with a shader. Instead of being a UBO or uniform that can be manipulated by the client, this constant data considered part of the shader and remains constant across all invocations of the given shader until the end of time. To access this constant data from the shader, we add a new load_constant intrinsic. The intention is that drivers will eventually lower load_constant intrinsics to load_ubo, load_uniform, or something similar. Constant data will be used by the optimization pass in the next commit but this concept may also be useful for OpenCL. v2 (Jason Ekstrand): - Rename num_constants to constant_data_size (anholt) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:42 -07:00
Jason Ekstrand	e8e159e9df	nir/deref: Add helpers for getting offsets These are very similar to the related function in nir_lower_io except that they don't handle per-vertex or packed things (that could be added, in theory) and they take a more detailed size/align function pointer. One day, we should consider switching nir_lower_io over to using the more detailed size/align functions and then we could make it use these helpers instead of having its own. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:41 -07:00
Jason Ekstrand	2bf8be99b0	nir/types: Add a natural size and alignment helper The size and alignment are "natural" in the sense that everything is aligned to a scalar. This is a bit tighter than std430 where vec3s are required to be aligned to a vec4. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:39 -07:00
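The contrast drawn in `2bf8be99b0` between "natural" and std430 alignment can be sketched for plain vectors as follows. Both helpers are hypothetical illustrations, not the NIR helper added by the commit, and they only cover vectors of 1 to 4 components.

```c
#include <assert.h>

/* "Natural" layout: a vector is as large as its components and is only
 * aligned to a single scalar component. */
static void natural_size_align(unsigned components, unsigned bytes_per_comp,
                               unsigned *size, unsigned *align)
{
    assert(components >= 1 && components <= 4);
    *size = components * bytes_per_comp;
    *align = bytes_per_comp;
}

/* std430 vector alignment: a vec3 is aligned like a vec4, other vectors
 * are aligned to their own size. */
static unsigned std430_vector_align(unsigned components,
                                    unsigned bytes_per_comp)
{
    assert(components >= 1 && components <= 4);
    return (components == 3 ? 4 : components) * bytes_per_comp;
}
```

For a 32-bit vec3 the natural layout gives a size of 12 bytes with 4-byte alignment, while std430 requires 16-byte alignment.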
Jason Ekstrand	893fc2d07d	nir: Add a deref_instr_has_indirect helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:37 -07:00
Jason Ekstrand	70b16963fc	util/macros: Import ALIGN_POT from ralloc.c v2 (Jason Ekstrand): - Rename y to pot_align (Brian) - Also use ALIGN_POT in build_id.c and slab.c (Brian) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:14 -07:00
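A power-of-two alignment macro such as the `ALIGN_POT` mentioned in `70b16963fc` is usually written with a mask. The definition below is a sketch of that common idiom; the exact text in Mesa's `util/macros.h` is not shown in this log, and `align_pot_checked` is a hypothetical wrapper added only to make the power-of-two precondition explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of pot_align.
 * Only valid when pot_align is a power of two. */
#define ALIGN_POT(x, pot_align) \
    (((x) + (pot_align) - 1) & ~((pot_align) - 1))

/* Hypothetical checked wrapper: asserts the power-of-two requirement
 * before applying the macro. */
static inline uint32_t align_pot_checked(uint32_t x, uint32_t pot_align)
{
    assert(pot_align != 0 && (pot_align & (pot_align - 1)) == 0);
    return ALIGN_POT(x, pot_align);
}
```

For example, aligning 13 to 8 yields 16, and a value that is already aligned is returned unchanged.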
Eric Anholt	4819da2301	v3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS. Fixes warning at screen creation. We store our outputs in normal temps and just emit them to shader I/O at the end, due to our I/O ordering requirements, so reading "outputs" in NIR is fine.	2018-07-02 11:35:41 -07:00
Marek Olšák	32e413ca59	ac: move all LLVM module initialization into ac_create_module This removes some ugly code around module initialization. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 14:34:39 -04:00
Eric Anholt	49f7631c9f	v3d: Emit a TF flush after each draw using TF. This fixes GPU hangs on 7278 in transform feedback tests such as GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic	2018-07-02 10:05:14 -07:00
Karol Herbst	c7726fbfa5	nv50/ir: handle clipvertex for geom and tess shaders as well this will be needed for compatibility profiles v2: handle tess shaders Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-02 16:21:31 +02:00
Erik Faye-Lund	4c87705705	gallium/u_vbuf: drop min/max-scanning for empty indirect draws When building with asserts enabled, we'll end up triggering an assert in pipe_buffer_map_range down this code-path, due to trying to map an empty range. Even if we avoid that, we'll trigger another assert a bit later, because u_vbuf_get_minmax_index returns a min-index of -1 here, which gets promoted to an unsigned value, and gives us an out-of-bounds buffer-mapping offset. Since we can't really have a well-defined min/max range here when the range is empty anyway, we should just drop this dance in the first place. After all, no rendering is going to be produced. This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0 on VirGL for me. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 10:51:29 +02:00
Samuel Pitoiset	02db2363f0	radv: reset the image's predicate after a color decompression pass After performing a fast-clear eliminate, a FMASK decompress, or a DCC decompress, we can reset the predicate to FALSE. With that, the GPU should be able to skip unnecessary color decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 10:43:33 +02:00
Samuel Pitoiset	ff7daadca1	radv: enable/disable predication for the DCC decompression pass Performing a DCC decompression pass is currently pretty rare, but using predication allows the GPU to skip unnecessary passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 10:43:17 +02:00
Samuel Pitoiset	939e5a3823	radv: add padding for the UMR disassembler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-02 10:42:17 +02:00
Gert Wollny	91f48cdfe5	virgl: Add support for glGetMultisample Use caps to obtain the multisample sample positions for up to 16 positions and implement the according Gallium interface. This implementation (plus its counterpart in virglrenderer) assumes that the fixed sample positions are always the same for a given number of samples over the whole lifetime of a qemu session. It also assumes that sample series are only given for 2, 4, 8, and 16 samples, and for intermediate numbers N of samples the next higher supported set from the above list is picked and the sample positions for the first N samples are returned accordingly. Fixes (when run on GL host): dEQP-GLES31.functional.texture.multisample.samples_1.sample_position dEQP-GLES31.functional.texture.multisample.samples_2.sample_position dEQP-GLES31.functional.texture.multisample.samples_3.sample_position dEQP-GLES31.functional.texture.multisample.samples_4.sample_position dEQP-GLES31.functional.texture.multisample.samples_8.sample_position dEQP-GLES31.functional.texture.multisample.samples_10.sample_position dEQP-GLES31.functional.texture.multisample.samples_12.sample_position dEQP-GLES31.functional.texture.multisample.samples_13.sample_position dEQP-GLES31.functional.texture.multisample.samples_16.sample_position v2: remove unrelated chunk (thanks Ilia Mirkin) v3: - also return positions for intermediate sample counts - fix unused variable warning - update description v4: explain better what this patch assumes and how it handles sample numbers that are not directly advertised (thanks go to Erik Faye-Lund for making me aware that this should be documented) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-07-02 09:33:55 +02:00
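The rule in `91f48cdfe5` for sample counts that are not directly advertised can be sketched as a lookup of the next higher supported set. `pick_sample_set` is a hypothetical helper, not the virgl code, and it deliberately leaves the single-sample case out because the commit message does not spell out how that case is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Return the smallest supported sample set (2, 4, 8 or 16) that holds at
 * least n samples. The caller would then report the positions of the
 * first n samples of that set. */
static unsigned pick_sample_set(unsigned n)
{
    static const unsigned supported[] = { 2, 4, 8, 16 };

    assert(n >= 2 && n <= 16);

    for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
        if (supported[i] >= n)
            return supported[i];
    }

    return 16; /* not reached for n <= 16 */
}
```

A request for 3 samples would use the 4-sample set, and requests for 10, 12 or 13 samples would use the 16-sample set.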
Tomeu Vizoso	ba78e78cd5	st/mesa: Also check for PIPE_FORMAT_A8R8G8B8_SRGB for texture_sRGB and PIPE_FORMAT_R8G8B8A8_SRGB, as well. The reason for this is that when Virgl runs with GLES on the host, it cannot directly upload textures in BGRA. So to avoid a conversion step, consider the RGB sRGB formats as well for this extension. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:48 +02:00
Tomeu Vizoso	71867a0a61	st/mesa: Fall back to R8G8B8A8_SRGB for ETC2 If the driver doesn't support PIPE_FORMAT_B8G8R8A8_SRGB, fall back to PIPE_FORMAT_R8G8B8A8_SRGB. Drivers such as Virgl will have a hard time supporting PIPE_FORMAT_B8G8R8A8_SRGB when the host runs GLES, as GL_BGRA isn't as well supported there. So go with PIPE_FORMAT_R8G8B8A8_SRGB so these drivers can avoid a conversion copy. v2: Fix typo in commit message Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:41 +02:00
Tomeu Vizoso	e5604ef78b	st/mesa/i965: Allow decompressing ETC2 to GL_RGBA When Mesa itself implements ETC2 decompression, it currently decompresses to formats in the GL_BGRA component order. That can be problematic for drivers which cannot upload the texture data as GL_BGRA, such as Virgl when it's backed by GLES on the host. So this commit adds a flag to _mesa_unpack_etc2_format so callers can specify the optimal component order. In Gallium's case, it will be requested if the format isn't in PIPE_FORMAT_B8G8R8A8_SRGB format. For i965, it will remain GL_BGRA, as before. v2: * Remove unnecessary include (Emil Velikov) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:33 +02:00
Iago Toral Quiroga	1b54824687	anv/cmd_buffer: make descriptors dirty when emitting base state address Every time we emit a new state base address we will need to re-emit our binding tables, since they might have been emitted with a different base state address. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:31:20 +02:00
Iago Toral Quiroga	6a1d8350c9	anv/cmd_buffer: clean dirty push constants flag after emitting push constants Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:31:02 +02:00
Iago Toral Quiroga	198a72220b	anv/cmd_buffer: never shrink the push constant buffer size If we have to re-emit push constant data, we need to re-emit all of it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:30:40 +02:00
Denis Pauk	2854c0f795	gallium/llvmpipe: Enable support bptc format. v2: none v3: none Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> CC: Matt Turner <mattst88@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	530130e74f	gallium/softpipe: Enable support bptc format. v2: none v3: none Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	f69bc797e1	gallium/auxiliary: Add helper support for bptc format compress/decompress Reuse code shared with mesa/main/texcompress_bptc. v2: Use block decompress function v3: Include static bptc code from texcompress_bptc_tmp.h Suggested-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Nicolai Hähnle <nicolai.haehnle@amd.com> CC: Marek Olšák <maraeo@gmail.com> CC: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	bf4871f9e8	mesa: add header for share bptc decompress functions Move shared bptc functions to texcompress_bptc_tmp.h: * fetch_rgba_unorm_from_block * fetch_rgb_float_from_block * compress_rgba_unorm * compress_rgb_float Create decompress functions: * decompress_rgba_unorm * decompress_rgb_float Functions will be reused in gallium/auxiliary code. v2: Add block decompress function v3: Move all shared code to header Suggested-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:36 -04:00
Marek Olšák	99c6cae227	glsl/cache: save and restore ExternalSamplersUsed Shaders that need special code for external samplers were broken if they were loaded from the cache. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-30 01:04:16 -04:00
Timothy Arceri	463f849097	nir: fix selection of loop terminator when two or more have the same limit We need to add loop terminators to the list in the order we come across them, otherwise if two or more have the same exit condition we will select the last one rather than the first one even though it's unreachable. This fix is for simple unrolls where we only have a single exit point. When unrolling these types of loops the unreachable terminators and their unreachable branch are removed prior to unrolling. Because of the logic change we also switch some list access in the complex unrolling logic to avoid breakage. Fixes: `6772a17acc` ("nir: Add a loop analysis pass") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-30 10:13:03 +10:00
Timothy Arceri	18293be622	radeonsi: enable OpenGL 4.4 compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	ddb351f7fe	mesa: enable ARB_vertex_attrib_64bit in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	c283b413c1	mesa: add outstanding ARB_vertex_attrib_64bit dlist support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Dave Airlie	98d02104a7	vbo_save: add support for doubles to display list code Required for ARB_vertex_attrib_64bit compat profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d2caa37741	mesa: add compat profile support for ARB_multi_draw_indirect v2: add missing ARB_base_instance support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	103b8f11d6	mesa: make valid_draw_indirect_multi() accessible externally We will use this to add compat support to ARB_multi_draw_indirect in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	5f90fb4007	mesa: add ARB_draw_indirect support to compat profile v2: add missing ARB_base_instance support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	9b32c80357	mesa: generate GL_INVALID_OPERATION using draw indirect in dlist The spec doesn't explicitly say to generate an error but since DrawArraysInstanced* and DrawElementsInstanced* do, it makes sense to do it for these functions also. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	03f1a2e8df	mesa: add missing display list support for ARB_compute_shader The extension is enabled for compat profile but there is currently no display list support. I filed a spec bug and it has been agreed that glDispatchComputeIndirect should generate an INVALID_OPERATION error when called during display list compilation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	87d6093583	mesa: expose some ARB_viewport_array dependent extensions in compat Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d87913e72a	mesa: enable ARB_viewport_array in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d332986589	mesa: add ARB_viewport_array display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	df5e22cb7d	mesa: enable ARB_shader_subroutine in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	05f3589e67	mesa: add glUniformSubroutinesuiv() display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	52e3ef2400	mesa: stop hiding remaining query parameters from OpenGL compat I managed to miss these two in my last pass at this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	9f77a9729e	mesa: enable ARB_gpu_shader_fp64 in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	a138fbc955	mesa: add ProgramUniform*d display list support This is required for fp64 to be enabled in compat profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:32 +10:00
Timothy Arceri	145f517cbd	mesa: add Uniform*d support to display lists This is required so we can enable fp64 support in compat profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:32 +10:00
Karol Herbst	04b443104d	st/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS This is required for drivers which don't allow reading from outputs. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 23:43:26 +02:00
Eric Anholt	a77cb724da	v3d: Move GL shader state dumping out of per-version compilation. It doesn't depend on V3D_VER, since it's just calling v3d_print_group.	2018-06-29 13:36:28 -07:00
Eric Anholt	c2901ff80f	v3d: Add missing Stream field to transform feedback specs on V3D 4.1. Noticed when trying to CLIF parse a transform feedback job that hangs on HW.	2018-06-29 13:36:28 -07:00
Eric Anholt	69efc1e025	v3d: Add missing "tri strip or fan" flag in Primitive List Format.	2018-06-29 13:36:28 -07:00
Eric Anholt	b341b39db3	v3d: Fix the shader code address field widths on V3D 4.1+ We were overlapping it with the threadable/nan flags, resulting in incorrect relocations (threadable/nan included in the offset) and wrong ordering in the CLIF files.	2018-06-29 13:36:28 -07:00
Eric Anholt	6c3c11ba19	v3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state. It looks like we don't need this flag for anything (not that I'm clear on what it does), but it makes our struct dumping line up with CLIF parsing.	2018-06-29 13:36:28 -07:00
Eric Anholt	c0476d964a	v3d: Express dithering mode in the same way that the CLIF parser does.	2018-06-29 13:36:28 -07:00
Eric Anholt	24d2f1347d	v3d: Add missing "number of bin tile lists" field. Noticed when trying to feed our dumps through the CLIF parser. Since this is a "minus one" field, we were already filling in the value we wanted (0).	2018-06-29 13:36:28 -07:00
Eric Anholt	b65b61cefe	v3d: Rewrite the color write masks to match CLIF format. The render_target_* fields gave us pretty(ish) printing, but meant we were incompatible with CLIF, and had much more verbose code generating them.	2018-06-29 13:36:28 -07:00
Eric Anholt	38172dcba9	v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3's XML. The XML ends up noisier if you're only looking at one version, but from the diffstat there are obvious wins in terms of deduplication. This will get even more significant if we ever support 3.2 or 4.0.	2018-06-29 13:36:28 -07:00
Eric Anholt	725561c0b6	v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields. The XML zipper wants one XML per version for filling out its tables, but we want to do more than one GPU version per XML now. Assume that the "gen" field will be the same as min_ver and look up our XML text assuming that they're listed in increasing min_ver.	2018-06-29 13:36:28 -07:00
Eric Anholt	f8af5c58c3	v3d: Create XML fields for min_ver and max_ver of a packet/struct/enum. This will be used to merge together the V3D 3.3-4.1 XML with the variants disabled based on the version.	2018-06-29 13:36:28 -07:00
Eric Anholt	6f7ad7ed11	v3d: Pass the version being generated to the pack generator script. It turns out that most V3D versions change very few packets, so keeping separate copies of the XML per version makes changing the XML a pain as you have to replicate your changes to each one. This is the start of changing it so that one XML can generate headers for multiple versions.	2018-06-29 13:36:28 -07:00
Jose Maria Casanova Crespo	a99c9e63a0	anv: finish the binding_table_pool on destroyDevice when use_softpin Running VK-CTS in batch execution mode was raising the VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the same failing tests were run in isolation they always passed. createDevice and destroyDevice were called before and after every test. Because the binding_table_pool was never closed, we reached the maximum number of open file descriptors (ulimit -n) and when that happened every call to createDevice implied a VK_ERROR_INITIALIZATION_FAILED error. Fixes: `c7db0ed4e9` ("anv: Use a separate pool for binding tables when soft pinning") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-29 21:49:31 +02:00
Marek Olšák	ea8b55b49f	gallium/util: remove dummy function util_format_is_supported Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2018-06-29 15:31:49 -04:00
Dylan Baker	82bf8a6a82	docs: update calendar, add news and link release notes to 18.1.3	2018-06-29 11:04:22 -07:00
Dylan Baker	9dfcf044f7	docs: Add SHA256 sums to notes for 18.1.3	2018-06-29 11:02:41 -07:00
Dylan Baker	2fa6c3821f	docs: Add release notes for 18.1.3	2018-06-29 11:02:39 -07:00
Rhys Perry	ffba56cc3c	nv50/ir: improve maintainability of Target*::initOpInfo() This is mainly useful for when one needs to add new opcodes in a painless and reliable way. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 16:47:27 +02:00
Rhys Perry	d885303a38	nv50/ir: fix image stores with indirect handles Having this if statement here prevented the next if statement from being reached in the case of image stores, which is needed for instructions with indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...". Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 16:07:59 +02:00
Ross Burton	d7c4ce1d1d	egl: fix build race in automake There is a parallel make build issue in src/egl/drivers/dri2/ for wayland builds. Can be reproduced with: $ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo $ make -C src/egl/ drivers/dri2/platform_wayland.lo ../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory This patch adds the missing dependency. Fixes: `02cc359372` "egl/wayland: Use linux-dmabuf interface for buffers" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> [Eric: fixed up the commit title] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-29 12:49:51 +01:00
Marek Olšák	5a6414f135	radeonsi: implement vertex color clamping for tess and GS	2018-06-28 22:41:12 -04:00
Marek Olšák	034b385fc2	radeonsi: move VS_STATE_SGPR before draw SGPRs for vertex color clamping.	2018-06-28 22:27:25 -04:00
Marek Olšák	0c554bc5d5	radeonsi: don't use malloc in si_generate_gs_copy_shader	2018-06-28 22:27:25 -04:00
Marek Olšák	7bac3b589c	radeonsi: disable DCC statistics gathering on everything but Stoney I think we don't need it on other chips.	2018-06-28 22:27:25 -04:00
Marek Olšák	0da94fa19c	radeonsi: don't enable DCC statistics gathering for small surfaces	2018-06-28 22:27:25 -04:00
Marek Olšák	f8b0c54e3f	radeonsi: simplify logic around vi_separate_dcc_try_enable	2018-06-28 22:27:25 -04:00
Marek Olšák	41f80373b4	radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2 Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-28 22:27:25 -04:00
Marek Olšák	fb28bf23db	radeonsi: remove references to Evergreen	2018-06-28 22:27:25 -04:00
Marek Olšák	1542169a4a	radeonsi: enable shader caching for compute shaders Compute shaders were not using the shader cache.	2018-06-28 22:27:25 -04:00
Marek Olšák	d77557c9db	radeonsi: store compute local_size into tgsi_shader_info This is kinda a hack, but it's enough for the shader cache.	2018-06-28 22:27:25 -04:00
Marek Olšák	d13f240269	radeonsi: unify duplicated code for initial shader compilation	2018-06-28 22:27:25 -04:00
Marek Olšák	8e9c57a7fe	ac: set +auto-waitcnt-before-barrier when needed This removes useless s_waitcnt before barriers. Only radeonsi uses this function.	2018-06-28 22:27:25 -04:00
Marek Olšák	7d6ec9d43b	radeonsi/gfx9: insert the barrier between merged shaders inside the if block	2018-06-28 22:27:25 -04:00
Joe M. Kniss	70425bcfe6	gallium: plumb invariant output attrib thru TGSI Add support for glsl 'invariant' modifier for output data declarations. Gallium drivers that use TGSI serialization currently lose invariant modifiers in glsl shaders. v2: use boolean for invariant instead of unsigned. Tested: chromiumos on qemu with virglrenderer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-06-29 11:11:54 +10:00
Francisco Jerez	c2c803be7b	intel/fs: Build 32-wide FS shaders. Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-28 13:25:21 -07:00
Jason Ekstrand	b95b0e2918	intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-28 13:25:18 -07:00
Jason Ekstrand	d5e028a57b	intel/fs: Add fields to wm_prog_data for SIMD32 dispatch Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	bcbc7d3a17	intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	7144247c2c	intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	37c1df28c9	intel/fs: Fix Gen6+ interpolation setup for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	e208bc3bb7	intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS We can just emit the MOV in the two places where we use this. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	5e3028d826	intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround There's no reason for us to emit it a pile of times and then have a whole pass to clean it up. Just emit it once like we really want. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	40fe108e2b	intel/fs: Generalize the unlit centroid workaround This generalizes the unlit centroid workaround so it's less code and now supports SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1d381731e0	intel/fs: Fix sample id setup for SIMD32. v2 (Jason Ekstrand): - Disallow gl_SampleId in SIMD32 on gen7 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2fd0aed89a	intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	6909aed90e	intel/fs: Implement 32-wide FS payload setup on Gen6+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	f6c4aace22	intel/fs: Extend thread payload layout to SIMD32 And handle 32-wide payload register reads in fetch_payload_reg(). v2 (Jason Ekstrand): - Fix some whitespace and brace placement Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	8f143f70d6	intel/fs: Wrap FS payload register look-up in a helper function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	d996e5b812	intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround While we're here, we change to using horiz_offset() instead of abusing half(). v2 (Jason Ekstrand): - Use horiz_offset() instead of half() Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	38aee1a06d	intel/fs: Simplify fs_visitor::emit_samplepos_setup The original code manually handled splitting the MOVs to 8-wide to handle various regioning restrictions. Now that we have a SIMD width splitting pass that handles these things, we can just emit everything at the full width and let the SIMD splitting pass handle it. We also now have a useful "subscript" helper which is designed exactly for the case where you want to take a W type and read it as a vector of Bs so we may as well use that too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	244a0ff3a8	i965: Add plumbing for shader time in 32-wide FS dispatch mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2d7d652d5c	intel/fs: Disable opt_sampler_eot() in 32-wide dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	db6ca13efc	intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates On g4x through Sandy Bridge, src1 (the coordinates) of the PLN instruction is required to be an even register number. When it's odd (which can happen with SIMD32), we have to emit a LINE+MAC combination instead. Unfortunately, we can't just fall through to the gen4 case because the input registers are still set up for PLN which lays out the four src1 registers differently in SIMD16 than LINE. v2 (Jason Ekstrand): - Take advantage of both accumulators and emit LINE LINE MAC MAC (Based on a patch from Francisco Jerez) - Unify the gen4 and gen4x-6 cases using a loop v3 (Jason Ekstrand): - Don't unify gen4 with gen4x-6 as this turns out to be more fragile than first thought without reworking the gen4 barycentric coordinate layout. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	566e6abd6d	intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN When we don't have PLN (gen4 and gen11+), we implement LINTERP as either LINE+MAC or a pair of MADs. In both cases, the accumulator is written by the first of the two instructions and read by the second. Even though the accumulator value isn't actually ever used from a logical instruction perspective, it is trashed so we need to make the scheduler aware. Otherwise, the scheduler could end up re-ordering instructions and putting a LINTERP between an instruction which writes the accumulator and another which tries to use that result. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	73d60455e9	intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU operation and less like a send. This is less code over-all and, as a side-effect, it now properly handles execution groups and lowering so SIMD32 support just falls out. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	74b477039d	intel/fs: Add the group to the flag subreg number on SNB and older We want consistent behavior in the meaning of the flag_subreg field between SNB and IVB+. v2 (Jason Ekstrand): - Add some extra commentary Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2aefa5e19f	intel/fs: Fix FB read header setup for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	e06f5b30cc	intel/fs: Fix logical FB write lowering for SIMD32 Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	ce370902d4	intel/fs: Fix FB write message control codegen for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	8b788069fb	intel/fs: Don't enable dual source blend if no outputs are written This prevents a crash in some arb_enhanced_layouts tests that would be caused by the next commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	48241c780a	intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	789d20df36	intel/eu: Fix pixel interpolator queries for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1650442026	intel/fs: Disable SIMD32 dispatch for fragment shaders with discard. Current discard handling requires dedicating the second flag register to discard. However, control-flow in SIMD32 requires both flag registers so it's incompatible with the current discard handling. Just don't support SIMD32+discard for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1811cbdc25	intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow The hardware's control flow logic is 16-wide so we're out of luck here. We could, in theory, support SIMD32 if we know the control-flow is uniform but we don't have that information at this point. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	d5b617a28e	intel/fs: Split instructions low to high in lower_simd_width Commit `0d905597f` fixed an issue with the placement of the zip and unzip instructions. However, as a side-effect, it reversed the order in which we were emitting the split instructions so that they went from high group to low instead of low to high. This is fine for most things like texture instructions and the like but certain render target writes really want to be emitted low to high. This commit just switches the order back around to be low to high. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `0d905597f` "intel/fs: Be more explicit about our placement of [un]zip"	2018-06-28 13:19:38 -07:00
Jason Ekstrand	0b830081f0	intel/fs: Rework KSP data to be SIMD width-based Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	9d78abbef8	intel/compiler: Add and use helpers for working with KSP indices The pixel shader dispatch table is kind-of a confusing mess. This adds some helpers for dealing with it and for easily extracting the correct data from wm_prog_data. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	85750348bc	i965: Re-arrange shader kernel setup in WM state Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	5b6e91dd35	intel/fs: Remove program key argument from generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	a14fb0184a	intel/fs: Set up FB write message headers in the visitor Doing instruction header setup in the generator is awful for a number of reasons. For one, we can't schedule the header setup at all. For another, it means lots of implied writes which the instruction scheduler and other passes can't properly reason about. The second isn't a huge problem for FB writes since they always happen at the end. We made a similar change to sampler handling in `ff4726077d`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	dda31a7bbc	intel/fs: Fix implied_mrf_writes() for headerless FB writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	90643689aa	intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	ed09e78023	intel/eu: Return new instruction to caller from brw_fb_WRITE(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	c0a1c248b8	intel/fs: Pull FB write implied headers from src[0] Now that we have the implied header in src[0] for tracking purposes, we may as well use it in the generator. This makes things a tiny bit more general. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	b1cc9a9ae1	intel/fs: Properly track implied header regs read by FB writes The FB write opcode on gen4-5 does implied copies from g0 and g1 to the message payload. With this commit, we start tracking that as part of the IR by having the FB write read from g0-1. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	d91fa20655	intel/fs: FS_OPCODE_REP_FB_WRITE has side effects It doesn't matter since we don't ever run replicated write shaders through the optimizer but it's good to be complete. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Dylan Baker	e83cd38eac	docs: Add news item for mesa 18.1.2 Which I forgot to do when 18.1.2 came out. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-06-28 10:06:44 -07:00
Rhys Perry	c92eb71a65	nvc0: remove magic values in nve4_set_tex_handles() With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is changed to anything other than 0x20. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-28 18:22:06 +02:00
Rhys Perry	6bb0f87c60	nvc0/ir: fix TargetNVC0::insnCanLoadOffset() Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset could be set to a specific value. The IndirectPropagation pass expected it to return whether the offset could be increased by a specific value, which is what TargetNV50::insnCanLoadOffset() does. Fixes: `37b67db6ae` ("nvc0/ir: be careful about propagating very large offsets into const load") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-28 18:22:06 +02:00
Alok Hota	5b7d4f9428	swr/rast: Updating code style based on current clang-format rules Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:14 -05:00
Vinson Lee	f90a60fe79	swr/rast: Fix addPassesToEmitFile usage with llvm-7.0. Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output."). CXX rasterizer/jitter/libmesaswr_la-JitManager.lo rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3 pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:06 -05:00
Alok Hota	c7e9102d89	swr/rast: Handling removed LLVM intrinsics in trunk - Functionality replaced with emulated intrinsics - Fixes Bug 106558 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:00 -05:00
Alok Hota	83d3ddd0ec	swr/rast: Adding SCATTERPS functionality to BuilderGfxMem Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:17:55 -05:00
Alok Hota	4509cdbb37	swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack - Removing unused generic translate function - Requiring read/write specifier in builder_gfx_mem Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:17:33 -05:00
Chad Versace	dc6665422a	gallium: Fix automake for Android (v2) Chromium OS uses Autotools and pkg-config when building Mesa for Android. The gallium drivers were failing to find the headers and libraries for zlib and Android's libbacktrace. v2: - Don't add a check for zlib.pc. configure.ac already checks for zlib.pc elsewhere. [for tfiga] - Check for backtrace.pc separately from the other Android libs. [for tfiga] Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-27 19:58:16 -07:00
Timothy Arceri	2a5121bf35	glsl: skip comparison opt when adding vars of different size The spec allows adding scalars with a vector or matrix. In this case the opt was losing swizzle and size information. This fixes a bug with Doom (2016) shaders. Fixes: `34ec1a24d6` ("glsl: Optimize (x + y cmp 0) into (x cmp -y).") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-28 12:15:17 +10:00
Jason Ekstrand	e8eb182ec5	Revert "anv: Print the actual enum for ignored structure types" This reverts commit `fda7014c35`. It was hitting an unreachable when the sType was unknown.	2018-06-27 14:10:37 -07:00
Jason Ekstrand	fda7014c35	anv: Print the actual enum for ignored structure types Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-06-27 12:43:18 -07:00
Jason Ekstrand	6a35ba5ce9	i965/bufmgr: Use the correct argument order for bo_alloc_internal The memzone and flags parameters were accidentally flipped in the call from brw_bo_alloc_tiled_2d. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-27 12:43:18 -07:00
Keith Packard	60e6b6fa96	vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors Instead of encouraging the client to re-create the swapchain and keep going with an OUT_OF_DATE error, tell the client that further use of the current surface will not succeed as the associated kernel objects are no longer valid. In particular, when a DRM lease is revoked, then the client needs to get another lease and create a new surface for that. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-27 10:02:18 -07:00
Eric Anholt	6bb046cd29	glsl: Make sure that packed varyings reflect always_active_io properly. The always_active_io flag was only set according to the first variable that got packed in, so NIR io compaction would end up compacting XFB varyings that shouldn't move at that point. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-27 09:35:55 -07:00
Eric Anholt	ad1a4cb563	v3d: Fix Z clipping when viewport.scale[2] is negative. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex	2018-06-27 09:35:51 -07:00
Eric Anholt	9f80bcc2bc	v3d: Convert a bunch of our "minus one" fields over to the new XML attr. This fixes up their formatting for CLIF files and makes the code more legible.	2018-06-27 09:13:48 -07:00
Eric Anholt	18b1bb0b63	v3d: Add pack/unpack/decode support for fields with a "- 1" modifier. Right now, we name these fields as "field name minus one" so that your C code obviously states what the value should be. However, it's easy enough to handle at the codegen level with another little XML attribute, meaning less C code and easier-to-read values in CLIF dumping and gdb as well. (The actual CLIF format for simulator and FPGA replay takes in pre-minus-one values, so we need it there too).	2018-06-27 09:13:48 -07:00
Tapani Pälli	e9a77c3e96	i965: small cleanup in blorp debug printing output (trivial) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Tapani Pälli	9a92acec67	mesa: add a space between headers and source (trivial) There used to be one and it looks like it was removed by `eb63640c1d`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Tapani Pälli	58ba7ab535	features.txt: mark some extensions as done Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Danylo Piliaiev	e7cdaa895a	mesa: Return number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-06-27 11:02:34 +03:00
Samuel Pitoiset	7a57c82767	radv: use separate bind points for the dynamic buffers The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-27 09:48:31 +02:00
Samuel Pitoiset	9c09e7d66e	radv: remove unused 'predicated' parameter from some functions It's always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-27 09:48:15 +02:00
Dave Airlie	a6b64d6dde	virgl: add ARB_texture_view support Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-06-27 14:08:00 +10:00
Jason Ekstrand	ff6db94c18	nir/opt_if: Remove unneeded phis if we make progress Now that SSA values can be derefs and they have special rules, we have to be a bit more careful about our LCSSA phis. In particular, we need to clean up in case LCSSA ended up creating a phi node for a deref. This fixes validation issues with some Vulkan CTS tests with the new deref instructions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-06-26 10:47:26 -07:00
Samuel Pitoiset	fa42fa1a60	radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries Ported from RadeonSI. This appears to fix some random fails with: dEQP-VK.query_pool.statistics_query.* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 18:23:16 +02:00
Tapani Pälli	ab2643e4b0	glsl: serialize data from glTransformFeedbackVaryings While XFB has been enabled for cache, we did not serialize enough data for the whole API to work (such as glGetProgramiv). Fixes: `6d830940f7` "Allow shader cache usage with transform feedback" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-26 12:44:22 +03:00
Samuel Pitoiset	bcbd8dd6c9	radv: enable VK_EXT_shader_stencil_export The driver already supports exporting the stencil value. The following CTS test now passes: dEQP-VK.pipeline.shader_stencil_export.op_replace Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 10:40:10 +02:00
Samuel Pitoiset	ba5e25ed29	radv: ignore pInheritanceInfo for primary command buffers From the Vulkan spec: "If this is a primary command buffer, then this value is ignored." CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 10:39:43 +02:00
Andrii Simiklit	232c5d75ea	i965/gen6/gs: Handle case where a GS doesn't allocate VUE We cannot use the VUE Dereference flags combination for the EOT message on ILK and SNB because, unlike Pre-IL, the threads there are not initialized with an initial VUE handle. So to avoid GPU hangs on SNB and ILK we need to avoid using the VUE Dereference flags combination. (This was tested only on SNB, but according to the specification SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6, ILK must behave in a similar way.) v2: The approach to fixing this issue was changed. Instead of using different EOT flags at the program end, we create a VUE every time, even if the GS produces no output. v3: Clean up the patch. Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-06-26 08:18:55 +02:00
Dave Airlie	318ff60ccd	radeon: duplicate cmask surface for now. The radeon winsys isn't linked against the ac code. I have vague memories of this causing some problems before, so for now fix the build by just duplicating the code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-26 11:26:35 +10:00
Marek Olšák	bd963f8430	radeonsi: rename r600_transfer -> si_transfer Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	eabeeb86b2	radeonsi: properly set cmask_buffer in si_reallocate_texture_inplace Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	d4755ef389	radeonsi: remove redundant si_texture::cmask_size cmask_buffer and surface.cmask_size can replace its role. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2a8d1039b6	radeonsi: inline struct r600_cmask_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	166250f4e5	radeonsi: move CMASK size computation into ac_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	3da693b7d9	ac/surface: move cmask_size/alignment into radeon_surf cmask_size is changed to uint32_t because it can't be greater than 4GB. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2d64a68c6f	radeonsi: rename r600_surface -> si_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	218e133695	radeonsi: rename r600_memory_object -> si_memory_object Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	e5df04f13d	radeonsi: remove unused r600_memory_object::offset The real offset is passed through resource_from_memobj. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	45004abfd5	radeonsi: unify duplicated texture_from_handle & texture_from_memobj Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	cac7ab1192	radeonsi: reorder and initialize more fields in si_reallocate_texture_inplace Some fields shouldn't be initialized, like framebuffers_bound and other stats. It's hopefully complete now. Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-25 18:33:58 -04:00
Marek Olšák	7888245ef3	radeonsi: stop using lp_build_emit_llvm_unary/binary Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	0810f15046	radeonsi: stop using lp_build_alloc Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	21ba8a204e	radeonsi: use gallivm less Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	965904eebd	radeonsi: stop using lp_bld_intr.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	6ab54d25a6	radeonsi: remove last uses of lp_build_context::undef Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	30f3e2200a	radeonsi: stop using lp_bld_arit.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	5f54fc3ad1	radeonsi: stop using lp_build_gather_values Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	7bd40dc2f2	radeonsi: clean up some #includes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	f154555733	radeonsi: clean up passing the is_monolithic flag for compilation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Robert Foss	c7bb82136b	egl/android: Add DRM node probing and filtering This patch both adds support for probing & filtering DRM nodes and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD gralloc call. Currently the filtering is based just on the driver name, and the desired name is supplied using the "drm.gpu.vendor_name" Android property. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-06-25 18:54:10 +02:00
Rob Herring	3f7bca44d9	egl/android: #ifdef out flink name support Maintaining both flink names and prime fd support which are provided by 2 different gralloc implementations is problematic because we have a dependency on a specific gralloc implementation header. This mostly disables the dependency on the gralloc implementation and headers. The dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD remains for now, but the definition is added locally to remove the header dependency. drm_gralloc support can be enabled by setting BOARD_USES_DRM_GRALLOC=true in BoardConfig.mk. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-06-25 18:54:09 +02:00
Robert Foss	5a34aba07d	gallium/util: Fix build error due to cast to different size Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-25 18:54:09 +02:00
Samuel Pitoiset	07cb1373a2	radv: fix HTILE metadata initialization in presence of subpass clears If the driver ends up performing a slow depthstencil clear, the HTILE metadata won't be initialized correctly. This fixes random VM faults on Polaris while running CTS with Bas's runner. This doesn't seem to regress performance. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-25 17:38:59 +02:00
Gert Wollny	eebb65258d	r600/sb: give the scheduler more margin to find valid instructions groups For instruction sequences that change the address register with every load the current limit to bail out of the scheduler and reject the optimisation was too tight, i.e. it was expected that at least one pending instruction would be scheduled each time. Give the scheduler more margin to sort out these load sequences by allowing a number of rounds where no instruction is scheduled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-25 05:40:19 +01:00
Gert Wollny	cd7db0ab0a	r600/sb: fix rotated register in while loop This patch is based on https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html Dave Airlie: "A bunch of CTS tests led me to write tests/shaders/ssa/fs-while-loop-rotate-value.shader_test which r600/sb always fell over on. GCM seems to move some of the copies into other basic blocks, if we don't allow this to happen then it doesn't seem to schedule them badly. Everything I've read on SSA/phi copies say they have to happen in parallel, so keeping them in the same basic block seems like a good way to keep some of that property." This patch differs from the one proposed by Dave in that it only adds the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi* and that are located in loops. Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test (no regressions in the shader set). It also fixes all failing tests from dEQP-GLES3.functional.shaders.loops.* Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-25 05:39:41 +01:00
Rob Clark	1977e92ee3	freedreno/ir3: fix deref conversion fallout Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Rob Clark	445871de94	freedreno/ir3: fix unused variable warning Fixes: `cf0c7258ee` freedreno/a5xx: MSAA Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Rob Clark	868ca81cbe	freedreno: fix HW_ATOMIC_COUNTERS cap This was mistakenly exposed, even though we want atomic counters to be lowered to atomic ops on an SSBO like nearly every other GPU. Which somehow recently started getting segfaults due to calling a null pipe->set_hw_atomic_buffers(). Fixes a crash in stk, and probably other things. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Keith Packard	1df586be12	radv: add VK_EXT_display_control to radv driver [v5] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Rework fence integration into the driver so that waiting for any of a mixture of fence types (wsi, driver or syncobjs) causes the driver to poll, while a list of just syncobjs or just driver fences will block. When we get syncobjs for wsi fences, we'll adapt to use them. v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v4: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v5: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-23 07:59:00 -07:00
Keith Packard	16eb390834	anv: add VK_EXT_display_control to anv driver [v5] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v3: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v4: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. v5: use zalloc2 instead of alloc2 for the WSI fence. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Keith Packard	86c8d93e5a	vulkan: add VK_EXT_display_control [v10] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Remove DRM_CRTC_SEQUENCE_FIRST_PIXEL_OUT flag. This has been removed from the proposed kernel API. Add NULL parameter to drmCrtcQueueSequence ioctl as we don't care what sequence the event was actually queued to. v3: Adapt to pthread clock switch to MONOTONIC v4: Fix scope for wsi_display_mode and wsi_display_connector allocs Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v5: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Use wsi_rel_to_abs_time helper function to convert relative timeouts to absolute timeouts without causing overflow. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v6: Change WSI fence wait function to return VkResult instead of bool. This makes the meaning of the return value easier to understand, and allows for the indication of failure. Also change the WSI fence wait function to take only absolute timeouts and not provide an option for a relative timeout. No users wanted relative timeouts, and it's simpler if that option isn't available. Terminate the DPMS property loop once we've found the property. Assert that the fence hasn't already been destroyed in wsi_display_fence_destroy. Rearrange the event handler function order in the file to place routines in an easier to find order. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v7: Adapt to API changes for surface_get_capabilities v8: Use wsi->alloc in register_display_event so that callers don't have to dig out an allocator for us. v9: Fix a few minor formatting issues Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v10: Use wsi->alloc if none provided in wsi_display_fence_alloc. Now that drivers are expected to pass the allocator argument straight through from the application, we need to check those for NULL everywhere. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Keith Packard	5581dd5c32	anv: Support wait for heterogeneous list of fences [v3] Handle the case where the set of fences to wait for is not all of the same type by either waiting for them sequentially (waitAll), or polling them until the timer has expired (!waitAll). We hope the latter case is not common. While the current code makes sure that it always has fences of only one type, that will not be true when we add WSI fences. Split out this refactoring to make merging that clearer. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v3: Cast INT64_MAX to uint64_t to make its use as the maximum possible timeout clearly unsigned to the reader. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Make anv_wait_for_fences with !waitAll check all fences at least once, even if the requested timeout has already passed. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Bas Nieuwenhuizen	8c4f430d43	radv: Enable lower_io_to_temporaries after deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	aef4213fca	nir/lower_system_values: Assert/assume direct var derefs System values are never arrays or structs so we can assume a direct var deref. This simplifies things a bit and prevents us from accidentally throwing away an array index. Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	a331d7d1cd	nir: Remove old-school deref chain support Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	9800b81ffb	nir: Remove deref chain support from analyze_loops Note that this patch needs to come late in the series since this pass can be run after any pass that damages nir_metadata_loop_analysis. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Rob Clark	2db8784167	freedreno/ir3: convert to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:05 -07:00
Rob Clark	95683bdce3	nir: promote intrinsic_get_var() to helper Useful in a few other places.. let's not copy-pasta Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	5a02ffb733	nir: Rework lower_locals_to_regs to use deref instructions This completely reworks the pass to support deref instructions and delete support for old deref chains Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	2fa7a4a541	intel,ir3: Re-enable nir_opt_copy_prop_vars Now that it's rewritten for deref instructions, we can turn it back on. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	67df3739c5	radeonsi: Remove deref chain support in nir scan pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	9cb345588b	radv: Remove deref chain support in radv shader info pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	a1e9d799ad	ac/nir: Remove deref chain support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	9bfd81b217	radeonsi: Add deref support to the nir scan pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	ba2bd20f87	nir: Rework opt_copy_prop_vars to use deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	fa6ffcc083	nir/copy_prop_vars: Re-order some logic in compare_derefs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c5d9a65944	nir: Remove deref chain support from split_per_member_structs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	18175ab66f	nir: Remove deref chain support from opt_undef Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	aeb4bbfd1e	nir: Remove deref chain support from split_var_copies Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	636256cdc7	nir: Remove deref chain support from dead_variables Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	378d7cf3ba	nir: Remove deref chain support from propagate_invariant Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c6a9c2b60b	nir: Remove deref chain support from lower_var_copies Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	fc59230a46	nir: Remove deref chain support from lower_drawpixels Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d4dd2ca4a7	nir: Remove deref chain support from opt_peephole_select Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	54bfc0cbcf	nir: Remove deref chain support from lower_tex Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	a3589bb01f	nir: Remove deref chain support from lower_wpos_ytransform Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	3992665c52	nir: Remove deref chain support from lower_wpos_center Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	8a62db7712	nir: Remove deref chain support from lower_system_values Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	e5db1b951c	nir: Remove deref chain support from remove_unused_varyings Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	6bdd867968	nir: Delete lower_io_types It's only used by the ir3 stand-alone compiler and Rob said we could delete it. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c6fc653232	nir: Remove deref chain support from lower_phis_to_scalar Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	47ffb893e6	nir: Convert lower_io to deref instructions This deletes support for _var intrinsics and legacy deref chains in favor of deref instructions. The internals are also reworked a bit to use deref instructions directly. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	0d03c63e91	nir/lower_io: Convert atomic lowering to deref instructions No one is currently using it, so we can make this change irrespective of driver. We may use it again in i965 so it's best to pretend to keep it working. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c290e8c4b0	nir: Remove deref chain support from lower_global_vars_to_local Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	41c52c963a	nir: Remove deref chain support from lower_clamp_color_outputs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d2adc08abe	nir: Remove deref chain support from lower_alpha_test Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	81f29d6d33	nir: Remove deref chain support from lower_atomics Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	4b0ea65333	nir: Remove deref chain support from lower_clip_cull_distance_arrays Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	a42af8d0d6	nir: Remove deref chain support from lower_indirect_derefs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	69866af357	nir: Rework gather_info to entirely use deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	b1a18b8797	nir/vars_to_ssa: Rework to entirely use deref instructions This commit reworks nir_lower_vars_to_ssa to use deref instructions and deref paths internally instead of deref chains. We also drop support for the old load/store/copy_var intrinsics. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	f747ff1969	nir/vars_to_ssa: Add an is_direct field to deref_node This makes us build the is_direct parameter as the nodes are constructed rather than as we walk the chain. This will be useful later. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Eric Anholt	e1f0a1b029	broadcom/vc4: Remove deref chain support from nir_lower_txf_ms. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3d19f116ad	st,ir3,radeonsi: push lower_deref_instrs back into driver vc4+vc5 is not really affected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3e8879be5c	nir/lower_samplers: remove legacy version Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	a20929fed2	nir: convert lower_samplers_as_deref to deref instructions This also removes the legacy version of lower_samplers. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	0bc15340be	mesa/st: re-enable lower_io_to_elements() Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	245ce114c9	nir: convert lower_io_arrays_to_elements to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	c409cfddcf	mesa/st/nir: convert lower_builtins to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3859e0b4fe	mesa/st: temporarily disable lower_io_to_elements() Not required for correctness, and makes the order of converting passes to deref instructions hard to get right for both prog_to_nir and glsl_to_nir cases. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	c6009a1e8e	nir: convert lower_io_to_scalar to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	d143f6c856	move lower_deref_instrs Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d7b0be48ef	nir: Use deref instructions in lower_constant_initializers Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	85f4149f8a	nir/builder: Use deref instructions for load/store/copy_var Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	3573570afe	radv: Disable lower_io_to_temporaries during deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	75286c2d08	nir: Use derefs in nir_lower_samplers We change glsl_to_nir to provide derefs for both textures and samplers while we're at it. This makes the lowering much easier since we only either replace sources or remove them. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	36efae1d66	nir/lower_samplers: Clean up function arguments This little refactor makes us stop passing stage around and puts the builder as the first parameter to some functions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	a6ebbbc594	nir/lower_samplers: split out _legacy version for deref chains To simplify the transition, and make things bisectable, split out a legacy copy of lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions. Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone. This legacy pass will be removed in a future commit. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	3891c1906f	intel/blorp: Stop setting tex->texture/sampler nir_tex_instr_create uses rzalloc so it's already NULL Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	606eb56ab9	intel/nir: Only lower load/store derefs Everything else should already be handled. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	71cd9ebed9	intel/fs: Use image_deref intrinsics instead of image_var Since we had to rewrite the deref walking loop anyway, I took the opportunity to make it a bit clearer and more efficient. In particular, in the AoA case, we will now emit one minmax instead of one per array level. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	032b845edf	anv/pipeline: Convert apply_pipeline_layout to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	43bb707fa4	anv/apply_pipeline_layout: Simplify extract_tex_src_plane Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	9fb36011d1	anv/pipeline: Convert lower_multiview to deref instructions Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d57e724a45	anv/pipeline: Convert YCbCr lowering to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	38f1b89805	anv/pipeline: Convert lower_input_attachments to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	5cd7324a57	anv/pipeline: Do less deref instruction lowering This commit removes most of the deref instruction lowering. Instead of lowering early, we only lower textures and images and we only do so right before any of the anv image lowering passes. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	1d59034de2	radv: Remove image_var stores. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	43af92edc5	radv: Use deref instructions for tex derefs in meta shaders. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	657cedb12f	ac/nir: Add deref interp support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	d00e7d42f5	ac/nir: Add shared atomic deref instr support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	302884d121	radv: Gather info for deref instr based load/store. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	547d970122	ac/nir: Add deref based var loads/stores. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:03 -07:00
Bas Nieuwenhuizen	5780af9880	radv: Add shader info support for image deref instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:02 -07:00
Bas Nieuwenhuizen	506a07e4e3	ac/nir: Add deref support to image intrinsics. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	bb5781c9a7	ac/nir: Implement derefs for integer gather4 lowering. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	ca271e266e	ac/nir: Support deref instructions in tex instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	9b14eacf0e	ac/nir: Support deref instructions in get_sampler_desc. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	4a888beea9	ac/nir: Implement the deref instr for shared memory. v2: Store the result in ctx->ssa_defs. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	c11833ab24	nir,spirv: Rework function calls This commit completely reworks function calls in NIR. Instead of having a set of variables for the parameters and return value, nir_call_instr now simply has a number of sources which get mapped to load_param intrinsics inside the functions. It's up to the client API to build an ABI on top of that. In SPIR-V, out parameters are handled by passing the result of a deref through as an SSA value and storing to it. The virtue of this approach can be seen by how much it allows us to delete from core NIR. In particular, nir_inline_functions gets halved and goes from a fairly difficult pass to understand in detail to almost trivial. It also simplifies spirv_to_nir somewhat because NIR functions never were a good fit for SPIR-V. Unfortunately, there is no good way to do this without a mega-commit. Core NIR and SPIR-V have to be changed at the same time. This also requires changes to anv and radv because nir_inline_functions couldn't handle deref instructions before this change and can't work without them after this change. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	58799b6a5b	spirv/cfg: Make the builder fully capable for both walks We were only initializing vtn_builder::func for the pre-walk where we build the CFG. We were only initializing the nir_builder for the later walk through the instructions even though we were setting b->cursor for the pre-walk. Let's set both in both places so that everything is consistent. This is useful because we handle OpFunctionParameter in the pre-walk and we're going to need to be able to emit instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	3fc3798677	spirv: Record the type of functions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	2f9bfd7dd9	spirv: Update vtn_pointer_to/from_ssa to handle deref pointers Now that pointers can be derefs and derefs just produce SSA values, we can convert any pointer to/from SSA. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	d5930c222c	spirv: Allow pointers to have a deref at the base Previously, pointers fell into two categories: index/offset for UBOs, SSBOs, etc. and var + access chain for logical pointers. This commit adds another logical pointer mode that's deref + access chain. It's tempting to think that we can just replace variable-based pointers with deref-based or at least replace the access chain with a deref chain. Unfortunately, there are a few sticky bits that prevent this: 1) We can't return deref-based pointers from OpVariable because those opcodes may come outside of a function so there's no place to emit the deref instructions. 2) We can't always use variable-based pointers because we may not always know the variable. (We do now, but the upcoming function rework will take that option away.) 3) We also can't replace the access chain struct with a deref. Due to the re-ordering we do in order to handle loop continues, the derefs we would emit as part of OpAccessChain may not dominate their uses. We normally fix this up with nir_repair_ssa but that generates phi nodes which we don't want in the middle of our deref chains. All in all, we have no real better option than to support partial access chains while also re-emitting the deref instructions on the spot. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	fdd5ffee32	spirv: Clean up vtn_pointer_to_offset Now that push constants are using on-the-fly offsets, we no longer need to handle access chains in vtn_pointer_to_offset. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	7dfa440922	spirv: Make push constants an offset-based pointer Push constants have been a weird edge-case for a while in that they have explicit offsets but we've been internally building access chains for them. This mostly works but it means that passing pointers to push constants through as function arguments is broken. The easy thing to do for now is to just treat them like UBOs or SSBOs only without a block index. This does lose a bit of information since we no longer have an accurate access range and any indirect access will look like it could read the whole block. Unfortunately, there's not much we can do about that. Once NIR derefs get a bit more powerful, we can plumb these through as derefs and be able to reason about them again. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	b0c643d8f5	spirv: Use NIR per-member splitting Before, we were doing structure splitting in spirv_to_nir. Unfortunately, this doesn't really work when you think about passing struct pointers into functions. Doing it later in NIR is a much better plan. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	2100c2f3a2	nir/spirv: Pass nir_variable_data into apply_var_decoration Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	39bf61aa37	nir: Add a concept of per-member structs and a lowering pass This adds a concept of "members" to a variable with an interface type. It allows you to specify the full variable data for each member of the interface instead of once for the variable. We also add a lowering pass to lower those variables to a sequence of variables and rewrite all the derefs accordingly. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	eb40540b8a	spirv: Use deref instructions for most variables The only thing still using old-school derefs are function calls. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	e5130012e4	st/nir: Move lower_deref_instrs later Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	152057b138	i965: Move nir_lower_deref_instrs to right before locals_to_regs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	a649610ace	nir/lower_tex: Always copy deref and offset sources This should make nir_lower_tex properly handle deref instructions as well as make it more correct when texture arrays are used and it's called after lowering samplers to binding table indices. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	261fe676e5	intel/nir: Fixup deref modes after lowering patch vertices Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	d7d5aab45b	intel,ir3: Disable nir_opt_copy_prop_vars This pass doesn't handle deref instructions yet. Making it handle both legacy derefs and deref instructions would be painful. Since it's not important for correctness, just disable it for now. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	5dc58908b7	nir: Support deref instructions in opt_undef Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	f46ecdc441	nir: Consider deref instructions in opt_peephole_select Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	1e1733aaf0	nir: Consider deref instructions in lower_phis_to_scalar Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	775ef13384	nir: Support deref instructions in lower_drawpixels Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	932c6577a0	nir: Support deref instructions in lower_clamp_color_outputs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	076b6627c2	nir: Support deref instructions in lower_alpha_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	414148cdc1	nir: Support deref instructions in loop_analyze Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	e786fcf777	nir: Support deref instructions in remove_unused_varyings Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	933c2851ab	nir: Support deref instructions in lower_pos_center Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	64057fd333	nir: Support deref instructions in lower_wpos_ytransform Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	2c9ca29372	nir: Support deref instructions in lower_atomics Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	d029167ea0	nir: Support deref instructions in lower_io Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	59b43be105	nir: Support deref instructions in gather_info Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	1442969ae1	nir: Support deref instructions in propagate_invariant Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	f23356a4dd	nir: Support deref instructions in lower_clip_cull Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	61b7bef3a3	nir: Support deref instructions in lower_system_values Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	1285cc9616	nir: Support deref instructions in lower_indirect_derefs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	dccb3acb63	nir: Support deref instructions in lower_vars_to_ssa Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	9fe99129df	nir: Support deref instructions in split_var_copies Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	4a4e175738	nir: Support deref instructions in lower_var_copies Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	a406f7e0c9	nir: Add a deref path helper struct This commit introduces a new nir_deref.h header for helpers that are less common and really only needed by a few heavy-duty passes. In this header is a new struct for representing a full deref path which can be walked in either direction. v2 (Jason Ekstrand): - Assert that deref != NULL (Caio) - Fill _short_path with 0xdeadbeef in debug builds when not used (Caio) - Make nir_deref_path a typedef (Rob) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	535289a3a9	nir: Support deref instructions in lower_io_to_temporaries Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	21befc46ef	nir: Support deref instructions in lower_global_vars_to_local Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	54e440945e	nir: Add a pass for fixing deref modes This will be needed by anything which changes variable modes without rewriting derefs. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	f917814c14	nir: Support deref instructions in remove_dead_variables Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Rob Clark	f03a33a19a	ttn: convert to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	82c498510e	prog/nir: Use deref instructions for params Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	2c7b892909	glsl/nir: Use deref instructions instead of deref chains Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	7f41a99cac	glsl/nir: Only claim to handle intrinsic functions Non-intrinsic function handling has never actually been tested and probably doesn't work. Just get rid of it for now. We can always add it back in later if it's useful. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Rob Clark	d80c342d89	nir: add deref lowering sanity checking This will be removed at the end of the transition, but add some tracking plus asserts to help ensure that lowering passes are called at the correct point (pre or post deref instruction lowering) as passes are converted and the point where lower_deref_instrs() is called is moved. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	74212c2414	anv,i965,radv,st,ir3: Call nir_lower_deref_instrs This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	8b7aa66169	nir/deref: Add some deref cleanup functions Sometimes it's useful for a pass to be able to clean up its own derefs instead of waiting for DCE. This little helper makes it very easy. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	a80fa2766e	nir: Add helpers for working with deref instructions This commit adds a pass for lowering deref instructions to deref chains as well as some smaller helpers to ease the transition. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	5286b5d832	nir: Add deref sources to texture instructions Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	f1dc2088e2	nir: Add _deref versions of all of the _var intrinsics Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	de7f60b653	nir/builder: Add deref building helpers Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	19a4662a54	nir: Add a deref instruction type This commit adds a new instruction type to NIR for handling derefs. Nothing uses it yet but this adds the data structure as well as all of the code to validate, print, clone, and [de]serialize them. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	5fbbbda37a	nir/validate: Rework intrinsic type validation This moves the switch statement for specific intrinsics above source and destination validation. We also rework the source and destination validation to use different bit_size values for each source and/or destination. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Karol Herbst	133e8bf4de	nv50/ir: only avoid spilling constrained def if a mov is added Fix spilling regression introduced by `5428066f5e`. This is just a minor mistake made while moving the code out into a new function. The function contained a loop which might have terminated early and skipped setting noSpill to 1. After the refactoring it was always set. Fixes: `5428066f5e` ("nv50/ir: make a copy of tex src if it's referenced multiple times") Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-23 03:00:24 +02:00
Dylan Baker	ced3df5623	meson: Fix typo that breaks -Dgallium-xvmc=false _xmvc -> _xvmc. Sigh Fixes: `a6943bb4ce` ("meson: Fix auto option for xvmc") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-06-22 10:16:27 -07:00
Dylan Baker	94cf397092	meson: Fix auto option for va The same as the previous two patches, but for the libva state tracker. Fixes: `724916c8a8` ("meson: dedup gallium-xvmc logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:25 -07:00
Dylan Baker	a6943bb4ce	meson: Fix auto option for xvmc This fixes the same problem as the previous patch did for vdpau, but for xvmc. Fixes: `724916c8a8` ("meson: dedup gallium-xvmc logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:18 -07:00
Dylan Baker	d9a8008a93	meson: Correct behavior of vdpau=auto Currently if vdpau is set to auto, it will be disabled only in cases where gallium is disabled or the host OS is not supported (mac, haiku, windows). However on (for example) Linux if libvdpau is not installed then the build will error because of the unmet dependency. This corrects auto to do the right thing, and not error if libvdpau is not installed. Fixes: `992af0a4b8` ("meson: dedup gallium-vdpau logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:11 -07:00
Samuel Pitoiset	ca59c3906d	radv: always check the return error when submitting a CS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:10 +02:00
Samuel Pitoiset	68d9517690	radv: check the return values of radv_signal_fence() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:09 +02:00
Samuel Pitoiset	07832083d3	radv: change the returned error in radv_signal_fence() From my point of view, when we aren't able to submit a CS, something has gone terribly wrong and we are most likely going to lose the device. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:06 +02:00
Jonathan Marek	94bc06b196	freedreno: a2xx: fix clear color The format of the CLEAR_COLOR register doesn't depend on the target format. This fixes clear color when rendering to 32-bit RGBA and 16-bit targets. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	dd8553dd95	freedreno: a2xx: fix crash when freeing context Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	6eeac34cee	freedreno: a2xx: fix crash on first clear. blend can be NULL, so check for that. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	17e16ba9db	freedreno: add a20x This patch adds support for a20x, which differs from a220 in a few ways: no VGT_MAX_VTX_INDX register; no CLEAR_COLOR register; RB_BC_CONTROL must be set in restore (hangs without it); different CP_DRAW_INDX format. Tested with kmscube and glmark2 scenes, on par with a220. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	d5ff36b97b	freedreno: a2xx: increase size of the offset field in instr_fetch_vtx_t The offset field is 22 bits wide. 11 bits are necessary because MaxVertexAttribRelativeOffset = 2047. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Eric Anholt	69ae42ca4c	v3d: Don't forget to initialize the buffer offset of a new winsys handle.	2018-06-21 15:56:18 -07:00
Eric Anholt	ee9a6a13fb	v3d, vc4: Disable valgrind checking of CLE inputs when NDEBUG is set. For a meson -Db_ndebug=true release build on x86_64, reduces text size of libv3d.a from 53.0k to 51.6k. Inspired by `0d5329d626` ("anv: Disable __gen_validate_value if NDEBUG is set.")	2018-06-21 15:46:40 -07:00
Marek Olšák	a2790b134a	mesa: fix glGetInteger64v for arrays of integers Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:55:15 -04:00
Marek Olšák	ce4b8b952a	ac/surface: disallow rotated micro tile mode Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	9410cd53c3	radeonsi: fix occlusion queries with 16x AA without FBO attachments on Stoney Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	9c21002f6e	radeonsi: handle non-clearable DCC buffers as MSAA resolve dst This is reproducible on Stoney, but other chips may be affected too. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	587e712eda	radeonsi: disable DCC MSAA for 128bpp formats on Stoney Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Rob Clark	6764aae169	docs: update freedreno features Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:48 -04:00
Rob Clark	fbd154294f	mesa: fix GLES 3.1 version calculation All of ARB_gpu_shader5 is most certainly not required for GLES 3.1 (most of it is in OES_gpu_shader5 on top of GLES 3.1). Some of what is required from ARB_gpu_shader5 is provided by ARB_texture_gather, so check for that. The remaining subset of ARB_gpu_shader5 doesn't have individual extensions to check for, but I guess it is unlikely that some driver has all of these extensions but not, say, integer bitfield manipulation. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-21 08:54:47 -04:00
Rob Clark	cf0c7258ee	freedreno/a5xx: MSAA Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	b6e690ef80	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	418b3fd184	freedreno/ir3: txf_ms support Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	d03bd103f8	freedreno/a5xx: fix gpu hangs with large compute shaders Similar to the combined limit for VS+FS, there is an upper limit for shader size to run from internal memory. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	e1e40935b4	freedreno/ir3: fix base_vertex Fixes: `c366f422f0` nir: Offset vertex_id by first_vertex instead of base_vertex Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Eduardo Lima Mitev	77e790f99a	i965: Link uniforms of SPIR-V programs using the NIR linker v2: nir_link_uniforms renamed to gl_nir_link_uniforms Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	ae0208e5b4	i965: Setup glsl uniforms by index rather than name matching Previously when setting up a uniform it would try to walk the uniform storage slots and find one that matches the name of the given variable. However, each variable already has a location which is an index into the UniformStorage array so we can just directly jump to the right slot. Some of the variables take up more than one slot so we still need to calculate how many it uses. The main reason to do this is to support ARB_gl_spirv because in that case the uniforms don’t have names so the previous approach won’t work. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	57b6184931	i965: account for NIR uniforms without name Right now, the BRW linker code assumes nir_variable::name is always non-NULL, but thanks to ARB_gl_spirv we will soon be linking SPIR-V programs, and those explicitly require matching uniforms by location. The name is just a debug hint. Instead of checking for the name this patch makes it check for var->num_state_slots on the assumption that everything that had an internal name also had some state slots. This seems likely because the two code paths that are taken when the name begins with "gl_" already have an assert that var->state_slots is not NULL. v2: simplified, most of it moved to glsl/nir/spirv (Neil Roberts) v3: check for num_state_slots instead of the name. This is needed because we do actually have nameless builtins with SPIR-V such as PatchVerticesIn and we want them to hit the _mesa_add_state_reference code path (Neil Roberts) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	7dd96a0653	i965: Update TexturesUsed after linking the shaders Otherwise if the shader is SPIR-V then SamplerUsed won’t have been initialised yet so it will end up thinking no textures are used. This was causing a crash later on if nothing causes it to regenerate TexturesUsed before the next render. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	4bf8b80f54	i965: Build SPIR-V programs' resource list using NIR v2: tweak after nir_linker.h being renamed to gl_nir_linker.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	3cf12c6317	nir/linker: Add nir_build_program_resource_list() This function is equivalent to the linker.cpp build_program_resource_list() but will extract the resources from NIR shaders instead. For now, only uniforms and program inputs are implemented. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) v3: remove support for inputs, that is still WIP (spotted by Timothy Arceri) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	215c9359ed	compiler/link: move add_program_resource to linker_util So it could be used by the GLSL and NIR linker. v2: (Timothy Arceri) * Moved from compiler to compiler/glsl * Method renamed to link_util_add_program_resource Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	2bf91733fc	nir/linker: Set the uniform initial values This is based on link_uniform_initializers.cpp. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	7a9e5cdfbb	nir/linker: Add gl_nir_link_uniforms() This function will be the entry point for linking the uniforms from the nir_shader objects associated with the gl_linked_shaders of a program. This patch includes initial support for linking uniforms from NIR shaders. It is tailored for the ARB_gl_spirv needs, and it is far from complete, but it should handle most cases of uniforms, array uniforms, structs, samplers and images. There are some FIXMEs related to specific features that will be implemented in following patches, like atomic counters, UBOs and SSBOs. Also, note that ARB_gl_spirv makes explicit locations mandatory for normal uniforms, so this code only handles uniforms with an explicit location. But there are cases, like uniform atomic counters, that don't have a location from the OpenGL point of view (they have a binding), but to which Mesa internally assigns a location. That will be handled in following patches. A nir_linker.h file is also added. More NIR-linking related API will be added in subsequent patches and those will include stuff from Mesa, so reusing nir.h didn't seem a good idea. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) v3: sets var->driver.location if the uniform was found from a previous stage (Neil Roberts). Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	aa95f0bc5b	compiler/link: add linker_util.h, move linker_error/warning to it Linker utilities common to the GLSL IR and NIR linker (the latter to be used for ARB_gl_spirv). We need to move it to a new header as the NIR linker doesn't need to know about ir_variable, and others, included at linker.h. v2: move from src/compiler to src/compiler/glsl (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	b995bda9bc	spirv: Set nir_variable->explicit_binding When SpvDecorationBinding is encountered in the SPIR-V source it now sets explicit_binding on the nir_variable. This will be used to determine whether to initialise sampler and image uniforms with the binding value. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	386f09be9b	spirv: Get rid of vtn_variable_mode_image/sampler vtn_variable_mode_image and _sampler are instead replaced with vtn_variable_mode_uniform which encompasses both of them. In the few places where it was necessary to distinguish between the two, the GLSL type of the pointer is used instead. The main reason to do this is that on OpenGL it is permitted to put images and samplers into structs and declare a uniform with them. That means that variables can now have a mix of uniform, sampler and image modes so picking a single one of those modes for a variable no longer makes sense. This fixes OpLoad on a sampler within a struct which was previously using the variable mode to determine whether it was a sampler or not. The type of the variable is a struct so it was not being considered to be uniform mode even though the member being loaded should be sampler mode. The previous code appeared to be using var->interface_type as a place to store the type of the variable without the enclosing array for images and samplers. I guess this worked because opaque types can not appear in interfaces so the interface_type is sort of unused. This patch removes the overloading of var->interface_type and any places that needed the type without the array can now just deduce it from var->type. v2: squash in this patch the changes to anv/nir (Timothy) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Nicolai Hähnle	23edc5b1ef	spirv: translate default-block uniforms They are supported by SPIR-V for ARB_gl_spirv. v2 (changes on top of Nicolai's original patch): * Handle UniformConstant storage class for uniforms other than samplers and images. (Eduardo Lima) * Handle location decoration also for samplers and images. (Eduardo Lima) * Rebase update (spirv_to_nir options added, logging changes, and others) (Alejandro Piñeiro) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	3d6664763d	nir/types: Add a utility wrapper to glsl_type::sampler_index() I think it is more accurate to call it a sampler target (?). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	f1ab16cf17	nir/types: Add a glsl_get_component_slots() utility It is basically a wrapper around glsl_type::component_slots(). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	2b8765b824	nir/lower_samplers: Limit assert to GLSL shader programs Vulkan has the concept of separate image and sampler objects in the SPIR-V code whereas GL conflates them into one. nir_lower_samplers contains an assert to verify that the sampler operand is not being set on the nir instruction. However, when the code comes from spirv_to_nir the sampler operand is always set. GL_ARB_gl_spirv explicitly states that OpTypeSampler is not supported so it retains the GL behaviour of not being able to separate them. Therefore the sampler will always be the same as the texture. This GL version of the lowering code ignores instr->sampler and sets instr->sampler_index to the same value as instr->texture_index. Some other places in the code (such as in nir_print) assume that once the instruction is lowered then both instr->texture and instr->sampler will be NULL, so to keep this behaviour we now set instr->sampler to NULL after ignoring it to fill in instr->sampler_index. Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	652be1563f	nir: Add explicit_binding to nir_variable This is copied from the corresponding value in ir_variable. The intention is to eventually use it in a pure-NIR linker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	8d1ec2ed5a	mesa/main: add NULL name check when searching for a resource name With ARB_gl_spirv, name reflection can be missing. piglit shader_runner performs several resource checks, so this commit is useful to get even the simplest piglit tests running without crashing in SPIR-V mode. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	a6dc3d22eb	i965: use gl_shader_program_data::spirv Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	a940683733	mesa/main: Add a 'spirv' flag to gl_shader_program_data This will be used by the linker code to differentiate between programs made out of SPIR-V or GLSL shaders. This was rejected in the past, on the assumption that it was equivalent to checking for "shProg->_LinkedShaders[stage]->spirv_data != NULL". But: * At some points of the linking process it would be necessary to check whether _LinkedShaders[stage] is present, so the full check would be: "shProg->_LinkedShaders[stage] != NULL && shProg->_LinkedShaders[stage]->spirv_data != NULL" * Sometimes you would like to do something specific to SPIR-V independently of the stage, or for any stage. For example, "link all the uniforms, for all stages". In that case checking for the flag would be equivalent to iterating over all the _LinkedShaders and checking whether there is any spirv_data available. The former makes readability much worse. Both could be solved by adding two helpers. But adding a flag seems simpler and more readable. v2: added justification for the flag on the commit message (Alejandro) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Emil Velikov	697254111b	docs/release-calendar: restore the missing 18.1 column Earlier commit removed the column, instead of adjusting the height. Cc: Dylan Baker <dylan@pnwbakers.com> Fixes: `0d4f338a11` ("docs: Update release-notes and calendar") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	dfb1f2759c	configure: use compliant grep regex checks The current `grep "foo\\|bar"' trips on some grep implementations, like the FreeBSD one. Instead use `egrep "foo\|bar"' as suggested by Stefan. Cc: Stefan Esser <se@FreeBSD.org> Reported-by: Stefan Esser <se@FreeBSD.org> Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228673 Fixes: `1914c814a6` ("configure: error out if building OMX w/o supported platform") Fixes: `63e11ac2b5` ("configure: error out if building VA w/o supported platform") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	d589eddc8b	glsl/tests/glcpp: reinstate "error out if no tests found" With the recent rework of converting the shell script to a python one the check for actual tests was dropped. Bring that back, since it was explicitly added considering we had a ~2 year period, during which the tests were not run. v2: use raise Exception() over print() & return false (Dylan) Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	a2f5292c82	glsl/glcpp/tests: reinstate srcdir/abs_builddir blurb Bring back the "detection" of the said variables, to allow standalone execution. Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	87cebace54	glsl: fold glcpp-test-cr-lf.sh into glcpp-test.sh As of recently both of these have been reworked so they invoke a python script. At the same time the latter can be executed with the combined arguments of both scripts. AKA we no longer need to have them separate. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	1c1f70d12f	st/dri: constify dri_fill_st_visual's screen As the function says - only the visual is changed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	ccaa9f09cc	mesa: remove struct gl_extensions::ATI_separate_stencil Virtually every driver that supports ATI_separate_stencil also supports EXT_stencil_two_side. Use the latter boolean for both extensions. With that in mind we can drop the explicit true from the drivers and the nasty comment in compute_version(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:09:39 +01:00
Eric Engestrom	1714dfca8a	travis: add libXrandr and its randrproto dependency Fixes: `3f960c1338` "vulkan: EXT_acquire_xlib_display requires libXrandr headers to build" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-21 11:46:47 +01:00
Juan A. Suarez Romero	d24839be70	swr: bump minimum supported LLVM version to 5.0 RADV now requires LLVM 5.0 or greater, and thus we can't build dist tarball because swr requires LLVM 4.0. Let's bump required LLVM to 5.0 in swr too. Fixes: `f9eb1ef870` ("amd: remove support for LLVM 4.0") Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-21 12:16:46 +02:00
Grazvydas Ignotas	f966929805	radeonsi: add a debug flag to zero vram allocations This avoids garbage in at least the Dying Light loading screen; the game probably expects the Windows/NV behavior of all allocations being zeroed by default. Analogous to the radv flag with the same name. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:18:50 +03:00
Grazvydas Ignotas	4e0d93dc0e	radeonsi: use shifts for sign extension Avoids a branch and reduces code size a tiny bit: text data bss dec hex filename 10804563 398653 2070368 13273584 ca89f0 /tmp/radeonsi_dri.so.old 10804499 398653 2070368 13273520 ca89b0 /tmp/radeonsi_dri.so Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:17:34 +03:00
Samuel Pitoiset	af17a29ad8	radv: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation Ported from RadeonSI. Not sure why this is needed but AMDVLK does something similar. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 10:31:03 +02:00
Samuel Pitoiset	41f6096c26	radv: use EOP_DATA_SEL_* instead of magic numbers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 10:31:02 +02:00
Roland Scheidegger	53959fcbd8	r600: fix copy/paste bug for sampleMaskIn workaround The sampleMaskIn workaround (`b936f4d1ca`) tries to figure out if the shader is running at per-sample frequency, but there's a typo bug so it will only recognize per-sample linear inputs, not per-sample perspective ones. Spotted by Eric Engestrom <eric.engestrom@intel.com> Fixes: b936f4d1ca0d2ab1e828a "r600: partly fix sampleMaskIn value"	2018-06-21 02:37:11 +02:00
Eric Anholt	edb7890750	v3d: Fix min vs mag determination when not doing mip filtering. Fixes all 128 failing tests in dEQP-GLES3.functional.texture.filtering.*.combinations	2018-06-20 12:31:54 -07:00
Keith Packard	3f960c1338	vulkan: EXT_acquire_xlib_display requires libXrandr headers to build When VK_USE_PLATFORM_XLIB_XRANDR_EXT is defined, vulkan.h includes X11/extensions/Xrandr.h for the RROutput typedef which is used in the vkGetRandROutputDisplayEXT interface. Make sure we have the required header by checking during the build, and also set CFLAGS to point at the right directory. We don't need to link against the library as we don't use any functions from there, so don't add the _LIBS value in the autotools build. Signed-off-by: Keith Packard <keithp@keithp.com> Fixes: `dbac8e25f8` "radv: Add EXT_acquire_xlib_display to radv driver [v2]" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-20 10:42:05 -07:00
Eric Anholt	f49d112a01	v3d: Implement ALPHA_TO_COVERAGE. There's a convenient "FTOC" instruction for generating the coverage now, unlike vc4. This fixes dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage	2018-06-20 09:30:46 -07:00
Eric Anholt	94f7c011d6	v3d: Track write reference to the separate stencil buffer. Otherwise, a blit from separate stencil may fail to flush the job that initialized it, or new drawing could fail to flush a blit reading from stencil. Fixes: dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8	2018-06-20 09:30:46 -07:00
Eric Anholt	a52c357a65	v3d: Add missing reference to the separate stencil buffer. Noticed while debugging a missing flush of rendering in the z32f_s8 case.	2018-06-20 09:30:46 -07:00
Eric Anholt	1334295f29	v3d: Fix return value from fence_finish. We needed to convert from a -errno to a boolean success value. Fixes: GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_flush GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_signaled	2018-06-20 09:30:46 -07:00
Christian Gmeiner	8b3099353e	mesa/st: only do scalar lowerings if driver benefits As not every (upcoming) backend compiler is happy with nir_lower_xxx_to_scalar lowerings, do them only if the backend is scalar (and not vec4) based. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-06-20 17:56:37 +02:00
Christian Gmeiner	f485e5671c	gallium: add scalar isa shader cap v1 -> v2: - nv30 is _NOT_ scalar as suggested by Ilia Mirkin. - Change from a screen cap to a shader cap as suggested by Eric Anholt. - radeonsi is scalar as suggested by Marek Olšák. - Change missing ones to be scalar. v2 -> v3: - r600 prefers vec4 as suggested by Marek Olšák. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-20 17:55:39 +02:00
Keith Packard	050d8a4b42	radv: Add VK_EXT_display_surface_counter to radv driver This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-20 08:16:45 -07:00
Keith Packard	1801d7c73c	anv: Add VK_EXT_display_surface_counter to anv driver [v2] This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. v2: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-20 08:16:34 -07:00
Jason Ekstrand	b1a013d035	Vulkan/wsi: Implement VK_EXT_display_surface_counter This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. Internally, it is implemented using a fake MESA extension which provides a chain-in to GetSurfaceCapabilities2KHR which contains the one added field. This has the advantage of reducing the number of callbacks needed in the back-ends. It also means that anything chained into GetSurfaceCapabilities2EXT goes through VkSurfaceCapabilities2KHR::pNext, so we only need to handle crawling the pNext chain once per back-end. Reviewed-by: Keith Packard <keithp@keithp.com>	2018-06-20 08:16:03 -07:00
Jason Ekstrand	8f3b58ebee	vulkan/wsi: Get rid of the get_capabilities hook Instead, we can just use get_capabilities2. This way back-ends only have to implement one hook. Reviewed-by: Keith Packard <keithp@keithp.com>	2018-06-20 08:16:03 -07:00
Eric Engestrom	7f3cb7db08	intel/aubinator: drop unused functions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-20 15:17:26 +01:00
Samuel Pitoiset	65b3fed037	radv: always initialize the clear depth/stencil values to 0 Similar to the clear color values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	204cf5714a	radv: always initialize the clear color values to 0 Having random data in there is probably not the best. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	4b564bd612	radv: always initialize the DCC predicate to FALSE This might eventually skip some useless DCC decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	70c1bee187	radv: do not use a user SGPR for the sample position offset We know the number of samples at compile time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	20170865db	radv: don't store the number of samples as log2 Needed for the following patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Gert Wollny	8a6e3f0c5d	gallium/aux/util/u_cpu_detect.h: Fix -Wsign-compare warning in u_cpu_detect.c Change the type of util_cpu_caps::nr_cpus to int because sysconfig returns a signed value, fixes: u_cpu_detect.c: In function 'util_cpu_detect': u_cpu_detect.c:317:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (util_cpu_caps.nr_cpus == -1) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	33f4e8a043	gallium/aux/util/u_debug.h: Fix "noreturn" warnings in debug mode Only decorate function as noreturn when DEBUG is not defined, because when compiled in DEBUG mode the function actually executes an int3 and may return, fixes: u_debug.c: In function '_debug_assert_fail': u_debug.c:309:1: warning: 'noreturn' function does return Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	70f632962a	gallium/aux/util: Fix some warnings util/u_cpu_detect.c: In function 'util_cpu_detect': util/u_cpu_detect.c:377:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (util_cpu_caps.nr_cpus == ~0u) ^~ util/u_hash_table.c:274:21: warning: unused parameter 'k' [-Wunused-parameter] util_hash_inc(void k, void v, void d) ^ util/u_hash_table.c:274:30: warning: unused parameter 'v' [-Wunused-parameter] util_hash_inc(void k, void v, void d) ^ util/u_tests.c: In function 'test_texture_barrier': util/u_tests.c:652:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_samples / 2; i++) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	3e091d5a7a	gallium/aux/tgsi_ureg.c: remove unused parameter from match_or_expand_immediate64 remove "type" from "match_or_expand_immediate64", fixes: tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64': tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-parameter] int type, ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	f79b980486	gallium/aux/tgsi_two_side.c: Fix -Wsign-compare warnings Integer propagation rules can sometimes be irritating. With "unsigned x" "x + 1" gets propagated to a signed integer, so explicitly assign the sum to an unsigned and use that for comparison. In file included from tgsi/tgsi_two_side.c:41:0: tgsi/tgsi_two_side.c: In function 'xform_decl': ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2' ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2' ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2' ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2' ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1); ^~~~ tgsi/tgsi_two_side.c: In function 'xform_inst': tgsi/tgsi_two_side.c:184:45: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (inst->Src[i].Register.Index == ts->front_color_input[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	dc5ba7e17c	gallium/aux/tgsi_ureg.c: Fix various warnings tgsi/tgsi_ureg.c: In function 'ureg_DECL_sampler': tgsi/tgsi_ureg.c:721:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ureg->sampler[i].Index == nr) ^~ tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64': tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-parameter] int type, ^~~~ tgsi/tgsi_ureg.c: In function 'emit_decls': tgsi/tgsi_ureg.c:1821:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ureg->properties[i] != ~0) ^~ tgsi/tgsi_ureg.c: In function 'ureg_create_with_screen': tgsi/tgsi_ureg.c:2193:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ARRAY_SIZE(ureg->properties); i++) ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	c5e8280504	gallium/aux/tgsi_text.c: Fix -Wsign-compare warnings tgsi/tgsi_text.c: In function 'parse_identifier': tgsi/tgsi_text.c:218:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i == len - 1) ^~ tgsi/tgsi_text.c: In function 'parse_optional_swizzle': tgsi/tgsi_text.c:873:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < components; i++) { ^ tgsi/tgsi_text.c: In function 'parse_instruction': tgsi/tgsi_text.c:1103:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < info->num_dst + info->num_src + info->is_tex; i++) { ^ tgsi/tgsi_text.c:1118:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] else if (i < info->num_dst + info->num_src) { ^ tgsi/tgsi_text.c: In function 'parse_immediate': tgsi/tgsi_text.c:1660:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (type = 0; type < ARRAY_SIZE(tgsi_immediate_type_names); ++type) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	b16b6d0889	gallium/aux/tgsi_point_sprite.c: Fix -Wsign-compare warnings tgsi/tgsi_lowering.c: In function 'emit_twoside': tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c: In function 'emit_decls': tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->numtmp; i++) { ^ tgsi/tgsi_lowering.c: In function 'rename_color_inputs': tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (src->Index == ctx->two_side_idx[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	3792d85755	gallium/aux/tgsi_lowering.c: Fix -Wsign-compare warnings tgsi/tgsi_lowering.c: In function 'emit_twoside': tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c: In function 'emit_decls': tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->numtmp; i++) { ^ tgsi/tgsi_lowering.c: In function 'rename_color_inputs': tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (src->Index == ctx->two_side_idx[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	7a3daaab41	gallium/aux/tgsi_build.c: Fix -Wsign-compare warnings tgsi/tgsi_build.c: In function 'tgsi_build_full_immediate': tgsi/tgsi_build.c:622:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for( i = 0; i < full_imm->Immediate.NrTokens - 1; i++ ) { ^ tgsi/tgsi_build.c: In function 'tgsi_build_full_property': tgsi/tgsi_build.c:1393:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for( i = 0; i < full_prop->Property.NrTokens - 1; i++ ) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	94f40d3ac0	gallium/aux/tgsi_build.c: Remove now unused variable Removing the unused prev_token from the function calls made this local variable also unused. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	dc46b2aa99	gallium/aux/tgsi_build.c: Remove unused parameters prev_token from various functions remove parameter prev_token unused in tgsi_build_instruction_label tgsi_build_instruction_texture tgsi_build_instruction_memory tgsi_build_texture_offset This fixes the following warnings: tgsi/tgsi_build.c: In function 'tgsi_build_instruction_label': tgsi/tgsi_build.c:716:24: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_instruction_texture': tgsi/tgsi_build.c:749:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_instruction_memory': tgsi/tgsi_build.c:784:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_texture_offset': tgsi/tgsi_build.c:819:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	f06194b012	gallium/aux/tgsi_exec.c: Fix various -Wsign-compare tgsi/tgsi_exec.c: In function 'exec_tex': tgsi/tgsi_exec.c:2254:46: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args)); ^ ./util/u_debug.h:189:30: note: in definition of macro 'debug_assert' #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__)) ^~~~ tgsi/tgsi_exec.c:2254:7: note: in expansion of macro 'assert' assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args)); ^~~~~~ tgsi/tgsi_exec.c:2290:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = dim; i < ARRAY_SIZE(args); i++) ^ In file included from ./util/u_memory.h:39:0, from tgsi/tgsi_exec.c:62: tgsi/tgsi_exec.c: In function 'exec_lodq': tgsi/tgsi_exec.c:2357:15: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(dim <= ARRAY_SIZE(coords)); ^ ./util/u_debug.h:189:30: note: in definition of macro 'debug_assert' #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__)) ^~~~ tgsi/tgsi_exec.c:2357:4: note: in expansion of macro 'assert' assert(dim <= ARRAY_SIZE(coords)); ^~~~~~ tgsi/tgsi_exec.c:2363:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = dim; i < ARRAY_SIZE(coords); i++) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	a7cbb9ba46	gallium/aux/tgsi_exec.c: remove superfluous parameter from fetch_source_d Remove unused parameter src_datatype from fetch_source_d, fixes warning: tgsi/tgsi_exec.c: In function 'fetch_source_d': tgsi/tgsi_exec.c:1594:40: warning: unused parameter 'src_datatype' [-Wunused-parameter] enum tgsi_exec_datatype src_datatype) ^~~~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	5fe1b3b848	gallium/aux/tgsi_exec.c: remove superfluous parameter from store_dest_dstret remove unused parameter inst from store_dest_dstret (and consequently also from store_dest_double), fixes warning: tgsi/tgsi_exec.c: In function 'store_dest_dstret': tgsi/tgsi_exec.c:1765:47: warning: unused parameter 'inst' [-Wunused-parameter] const struct tgsi_full_instruction *inst) ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	c9b53c6410	gallium/aux/tgsi_exec.c: Remove unused parameter from fetch_src_file_channel remove unused parameter chan_index from fetch_src_file_channel, fixes warning: tgsi/tgsi_exec.c: In function 'fetch_src_file_channel': tgsi/tgsi_exec.c:1480:35: warning: unused parameter 'chan_index' [-Wunused-parameter] const uint chan_index, ^~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	38a9b42d8e	gallium/aux/tgsi_exec.c: Remove parameter inst from exec_kill Fixes warning: tgsi/tgsi_exec.c: In function 'exec_kill': tgsi/tgsi_exec.c:2049:47: warning: unused parameter 'inst' [-Wunused-parameter] const struct tgsi_full_instruction *inst) ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	b8fca73e47	gallium/aux/tgsi_aa_point.c: Fix -Wsign-compare warnings tgsi/tgsi_aa_point.c:32:0: tgsi/tgsi_aa_point.c: In function 'aa_decl': ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2' ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2' ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1); ^~~~ tgsi/tgsi_aa_point.c: In function 'aa_inst': tgsi/tgsi_aa_point.c:220:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] dst->Register.Index == ts->color_out) { Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	09b3b37b95	gallium/aux/tgsi_sanity.c: Fix -Wsign-compare warnings tgsi_sanity.c: In function 'iter_instruction': tgsi_sanity.c:316:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ctx->index_of_END != ~0) { ^~ tgsi_sanity.c: In function 'epilog': tgsi_sanity.c:488:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ctx->index_of_END == ~0) { Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	bf6b695a90	gallium/aux/tgsi/tgsi_parse.c: Fix two warnings tgsi_parse.c: In function 'tgsi_parse_free': tgsi_parse.c:54:31: warning: unused parameter 'ctx' [-Wunused-parameter] struct tgsi_parse_context *ctx ) ^~~ tgsi_parse.c: In function 'tgsi_parse_end_of_tokens': tgsi_parse.c:62:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] return ctx->Position >= Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	fc9e259e58	gallium/aux/tgsi/tgsi_dump.c: Fix -Wsign-compare warnings tgsi_dump.c: In function 'iter_property': tgsi_dump.c:443:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < prop->Property.NrTokens - 1; ++i) { ^ tgsi_dump.c:459:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i < prop->Property.NrTokens - 2) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	03ac9708cf	gallium/aux/cso_cache: Fix various warnings cso_cache.c: In function 'delete_blend_state': cso_cache/cso_cache.c:90:51: warning: unused parameter 'data' [-Wunused-parameter] static void delete_blend_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_depth_stencil_state': cso_cache/cso_cache.c:98:59: warning: unused parameter 'data' [-Wunused-parameter] static void delete_depth_stencil_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_sampler_state': cso_cache/cso_cache.c:106:53: warning: unused parameter 'data' [-Wunused-parameter] static void delete_sampler_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_rasterizer_state': cso_cache/cso_cache.c:114:56: warning: unused parameter 'data' [-Wunused-parameter] static void delete_rasterizer_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_velements': cso_cache/cso_cache.c:122:49: warning: unused parameter 'data' [-Wunused-parameter] static void delete_velements(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'sanitize_cb': cso_cache/cso_cache.c:166:52: warning: unused parameter 'user_data' [-Wunused-parameter] int max_size, void user_data) ^~~~~~~~~ gallium/aux/cso_context.c: a -Wunused-parameter warning cso_cache/cso_context.c: In function 'delete_sampler_state': cso_cache/cso_context.c:163:57: warning: unused parameter 'ctx' [-Wunused-parameter] static boolean delete_sampler_state(struct cso_context ctx, void *state) ^~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	81e5bf3cfe	configure.ac: Add CFLAG -Wno-missing-field-initializers (v5) This warning is misleading: When a struct is partially initialized without assigning to the structure members by name, then the remaining fields will be zeroed out, and this warning will be issued (if enabled). If, on the other hand, the partial initialization is done by assigning to named members, the remaining structure elements may hold random data, but the warning is not issued. Since in Mesa the first approach to initialize structure elements is used very often, and it is usually assumed that the remaining elements are zeroed out, heeding this warning would be counter-productive. v2: - add -Wno-missing-field-initializers to meson-build - fix empty line error (both Eric Engestrom) v3: * check for -Wmissing-field-initializers warning and then disable it because gcc and clang always accept -Wno-* (Dylan Baker) * Also disable this warning for C++ v4: * meson.build add -Wno-missing-field-initializers to c_args instead of no_override_init_args (Eric Engestrom) v5: * configure.ac: Correct copy/paste error with CFLAGS/CXXFLAGS Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-06-20 11:08:28 +02:00
Samuel Pitoiset	916dda5cf7	radv: remove unnecessary code around CACHE_FLUSH_AND_INV_TS_EVENT AMDVLK also always uses CACHE_FLUSH_AND_INV_TS_EVENT. The other workaround is to flush DB metadata after emitting the framebuffer, but that seems slower. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 10:08:37 +02:00
Bas Nieuwenhuizen	4705a5dfda	radv: Fix flush_bits being used uninitialized. A case of making things worse while trying to fix something minor ... Fixes: `ef79457004` "radv: Merge the flush bits of CMASK & DCC clear." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-20 10:02:39 +02:00
Keith Packard	dbac8e25f8	radv: Add EXT_acquire_xlib_display to radv driver [v2] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application to the radv driver. v2: Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	46090a642d	anv: Add EXT_acquire_xlib_display to anv driver [v3] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application to the anv driver. v2: Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	7ab1fffcd2	vulkan: Add EXT_acquire_xlib_display [v5] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application. For DRM, we use the Linux resource leasing mechanism. v2: Clean up xlib_lease detection * Use separate temporary '_xlib_lease' variable to hold the option value to avoid changing the type of a variable. * Use boolean expressions instead of additional if statements to compute resulting with_xlib_lease value. * Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Move mode list from wsi_display to wsi_display_connector Fix scope for wsi_display_mode and wsi_display_connector allocs Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Explicitly forbid multiple DRM leases. Making the code support this looks tricky and will require additional thought. Use xcb_randr_output_t throughout the internals of the implementation. Convert at the public API (wsi_get_randr_output_display). Clean up check for usable active_crtc (possible when only the desired output is connected to the crtc). Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v4: Move output resource fetching closer to use in wsi_display_get_output. This simplifies the error returns in earlier parts of the code a bit. Return VK_ERROR_INITIALIZATION_FAILED from wsi_acquire_xlib_display. Jason says this is the right error message. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v5: randr doesn't pass vscan over the wire, so we set vscan to 0 for randr-acquired modes, and test wsi modes for vscan <= 1 when comparing against randr modes. 
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	5a2efefb0a	radv: Add EXT_direct_mode_display to radv driver Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	f89d3874fb	anv: Add EXT_direct_mode_display to anv driver [v2] Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. v2: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	352d320a07	vulkan: Add EXT_direct_mode_display [v2] Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	451b58a51e	radv: Add KHR_display extension to radv [v5] This adds support for the KHR_display extension to the radv Vulkan driver. The driver now attempts to open the master DRM node when the KHR_display extension is requested so that the common winsys code can perform the necessary operations. v2: * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Adapt to new wsi_device_init API (added display_fd) v4: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v5: Add vkCreateDisplayModeKHR. This doesn't actually create new modes, it only looks to see if the requested parameters matches an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	54d0daa481	anv: Add KHR_display extension to anv [v7] This adds support for the KHR_display extension to the anv Vulkan driver. The driver now attempts to open the master DRM node when the KHR_display extension is requested so that the common winsys code can perform the necessary operations. v2: Make sure primary fd is usable When KHR_display is selected, we try to open the primary node instead of the render node in case the user wants to use KHR_display for presentation. However, if we're actually going to end up using RandR leases, then we don't care if the resulting fd can't be used for display, but the kernel also prevents us from using it for drawing when someone else has master. v3: Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v4: Adapt primary node usage to new wsi_device_init API v5: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Remove spurious MM_PER_PIXEL define Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v6: Open DRM master before initializing WSI layer. The DRM master FD is passed to the WSI layer during initialization, so we need to open the device slightly earlier in the function. Close DRM master in device_finish. Use anv_gem_get_param to detect working master_fd instead of directly using the ioctl. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v7: Add vkCreateDisplayModeKHR. This doesn't actually create new modes, it only looks to see if the requested parameters matches an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	da997ebec9	vulkan: Add KHR_display extension using DRM [v10] This adds support for the KHR_display extension to the vulkan WSI layer. Driver support will be added separately. v2: * fix double ;; in wsi_common_display.c * Move mode list from wsi_display to wsi_display_connector * Fix scope for wsi_display_mode and wsi_display_connector allocs * Switch all allocations to vk_zalloc instead of vk_alloc. * Fix DRM failure in wsi_display_get_physical_device_display_properties When DRM fails, or when we don't have a master fd (presumably due to application errors), just return 0 properties from this function, which is at least a valid response. * Use vk_outarray for all property queries This is a bit less error-prone than open-coding the same stuff. * Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps Until we have multi-plane support, we shouldn't pretend to have any multi-plane semantics, even if undefined. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Add separate 'display_fd' and 'render_fd' arguments to wsi_device_init API. This allows drivers to use different FDs for the different aspects of the device. Use largest mode as display size when no preferred mode. If the display doesn't provide a preferred mode, we'll assume that the largest supported mode is the "physical size" of the device and report that. v4: Make wsi_image_state enumeration values uppercase. Follow more common mesa conventions. Remove 'render_fd' from wsi_device_init API. The wsi_common_display code doesn't use this fd at all, so stop passing it in. This avoids any potential confusion over which fd to use when creating display-relative object handles. Remove call to wsi_create_prime_image which would never have been reached as the necessary condition (use_prime_blit) is never set. 
whitespace cleanups in wsi_common_display.c Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Add depth/bpp info to available surface formats. Instead of hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the requested format to find suitable values. Destroy kernel buffers and FBs when swapchain is destroyed. We were leaking both of these kernel objects across swapchain destruction. Note that wsi_display_wait_for_event waits for anything to happen. wsi_display_wait_for_event is simply a yield so that the caller can then check to see if the desired state change has occurred. Record swapchain failures in chain for later return. If some asynchronous swapchain activity fails, we need to tell the application eventually. Record the failure in the swapchain and report it at the next acquire_next_image or queue_present call. Fix error returns from wsi_display_setup_connector. If a malloc failed, then the result should be VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl failed and we're either VT switched away, or our lease has been revoked, in which case we should return VK_ERROR_OUT_OF_DATE_KHR. Make sure both sides of if/else brace use matches Note that we assume drmModeSetCrtc is synchronous. Add a comment explaining why we can idle any previous displayed image as soon as the mode set returns. Note that EACCES from drmModePageFlip means VT inactive. When vt switched away drmModePageFlip returns EACCES. Poll once a second waiting until we get some other return value back. Clean up after alloc failure in wsi_display_surface_create_swapchain. Destroy any created images, free the swapchain. Remove physical_device from wsi_display_init_wsi. We never need this value, so remove it from the API and from the internal wsi_display structure. Use drmModeAddFB2 in wsi_display_image_init. This takes a drm format instead of depth/bpp, which provides more control over the format of the data. 
v5: Set the 'currentStackIndex' member of the VkDisplayPlanePropertiesKHR record to zero, instead of indexing across all displays. This value is the stack depth of the plane within an individual display, and as the current code supports only a single plane per display, should be set to zero for all elements Discovered-by: David Mao <David.Mao@amd.com> v6: Remove 'platform_display' bits from the build and use the existing 'platform_drm' instead. v7: Ensure VK_ICD_WSI_PLATFORM_MAX is large enough by setting to VK_ICD_WSI_PLATFORM_DISPLAY + 1 v8: Simplify wsi_device_init failure from wsi_display_init_wsi by using the same pattern as the other wsi layers. Adopt Jason Ekstrand's white space and variable declaration suggestions. Declare variables at first use, eliminate extra whitespace between types and names, add list iterator helpers, switch to lower-case list_ macros. Respond to Jason's April 8 review: * Create a function to convert relative to absolute timeouts to catch overflow issues in one place * use VK_NULL_HANDLE to clear prop->currentDisplay * Get rid of available_present_modes array. * return OUT_OF_DATE_KHR when display_queue_next called after display has been released. * Make errors from mode setting fatal in display_queue_next * Remove duplicate pthread_mutex_init call * Add wsi_init_pthread_cond_monotonic helper function to isolate pthread error handling from wsi_display_init_wsi Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v9: Fix vscan handling by using MAX2(vscan, 1) everywhere. Vscan can be zero anywhere, which is treated the same as 1. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v10: Respond to Vulkan CTS failures. 1. Initialize planeReorderPossible in display_properties code 2. Only report connected displays in get_display_plane_supported_displays 3. Return VK_ERROR_OUT_OF_HOST_MEMORY when pthread cond initialization fails. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> 4. Add vkCreateDisplayModeKHR. 
This doesn't actually create new modes, it only looks to see if the requested parameters match an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2018-06-19 14:17:46 -07:00
Bas Nieuwenhuizen	ef79457004	radv: Merge the flush bits of CMASK & DCC clear. Probably won't be much different in practice, but still wrong. Fixes Coverity issue 1435002. Not CC'ing to stable since this is only hit if you enable MSAA DCC via RADV_DEBUG. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-19 22:35:13 +02:00
Bas Nieuwenhuizen	ed06b1cdca	radv: Don't check for pipeline being set in draw. Draws without pipeline are definitely not allowed. Fixes Coverity issue 1434216. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-19 22:35:13 +02:00
Marek Olšák	1ba87f4438	radeonsi: rename r600_texture -> si_texture, rxxx -> xxx or sxxx Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Rob Clark	39b4fdc45f	freedreno/a5xx: move emit_marker5() into a5xx backend The scratch registers move again in a6xx.. so for post-a4xx let's just move this into the backend, and move the one place it used to be needed in core into fd5_emit_ib(). For a6xx we will do similar, calling emit_marker6() from fd6_emit_ib(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	0c8d9e923a	freedreno/a5xx: fix crash in dEQP-GLES31.stress.vertex_attribute_binding.buffer_bounds.bind_vertex_buffer_offset_near_wrap_10 This is kind of a hack, but really the only problem is the debug_assert() in OUT_RELOC(). But the debug_assert() is useful to catch real issues. So just add some #ifdef DEBUG code to filter things out before we hit the assert. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	4a41b02d46	freedreno/a5xx: don't crash if compute shader compile fails It is impolite, and a bit annoying with dEQP (all tests running in single process). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	658f1f6003	freedreno/ir3: fix missing recursion into block condition Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	1a6150207c	freedreno/a5xx: better FOUR_QUAD/TWO_QUAD decision for compute If we aren't going to get full occupancy, then use TWO_QUAD. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	f07154421a	freedreno/a5xx: bordercolor fixes Need a bit of hand-holding for stencil bordercolor, and add border color values for sRGB. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	ced14f1c7a	freedreno: remove per-stateobj dirty_mask's These never got updated in fd_context_all_dirty() so actually trying to rely on them (in the case of fd5_emit_images()) ends up in some cases where state is not emitted but should be. Best to just rip this out. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	5708440597	freedreno/a5xx: remove one image stateblock I think this ends up just setting uniform/const memory. But we upload x/y/z stride differently. At best this is unneeded, at worst it could possibly clobber other uniform/const memory. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	e0c6135625	freedreno/a5xx: cubemap image fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	0bb0cac8dc	freedreno/ir3: handle image buffer Similar to txf case, we need to insert a 2nd coordinate (zero). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	d1d2b13518	freedreno/ir3: handle arrays of images Unlike textures, this doesn't get lowered for us. (Would be nice if they were.. at least until we are ready to deal w/ indirect indexing..) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	5b2ef78532	freedreno/ir3: images can be arrays too Seems I previously totally forgot about 2d-arrays, etc.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	f489fa1f3f	freedreno/ir3: use move_load_const pass Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	7235c144a6	nir: add pass to move load_const Run this pass late (after opt loop) to move load_const instructions back into the basic blocks which use the result, in cases where a load_const is only consumed in a single block. This helps reduce register usage in cases where the backend driver cannot lower the load_const to a uniform. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-19 13:02:28 -04:00
Rob Clark	c9d6e579ec	mesa/st/nir: fix driver_location for arrays of image/sampler We can have arrays of images or samplers. But I forgot to handle that case long ago. Surprised no one complained yet. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Rob Clark	228457234c	nir: add comment for loop_unroll pass Save the next person from digging through the code to figure out what the indirect_mask parameter actually does. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Rob Clark	e3bbc1eaf4	glsl: fix random typo Just something I stumbled across. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Marek Olšák	dfeb61c5cf	radeonsi: ignore PIPE_RESOURCE_FLAG_MAP_COHERENT We treat coherent and non-coherent buffers the same. And move external_usage for better packing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	9322974ec7	radeonsi: always put persistent buffers into GTT on radeon This improves performance for certain games. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	ffbbc008be	radeonsi: fix si_get_num_queries for radeon Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	94b29763a4	radeonsi: don't expose performance counters for non-existent blocks Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	a2451a4c23	ac/gpu_info: add radeon_info::num_tcc_blocks The values for the radeon winsys were copied from the kernel driver. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	166c00e28e	radeonsi: set a better NUM_PATCHES hard limit AMDVLK uses 64 (distributed) and 16 (non-distributed). radeonsi will use 63 and 16. * This might improve tessellation performance on Hawaii, Bonaire, Tahiti, Pitcairn. (they will use 16) * I'm not sure if this matters for 1 SE configs. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	0d685ba290	radeonsi: make sure LS-HS vector lanes are reasonably occupied Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	e93fe403bc	radeonsi: properly compute an LS-HS thread group size limit "64 / max * 4" is less than "64 * 4 / max". Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Eric Anholt	da0115b1c3	v3d: Fix blitting from a linear winsys BO. This is the case for the simulator environment, and broke many blitter tests by trying to texture from linear while the HW can only actually do UIF/UBLINEAR/LT. Just make a temporary and copy into it with the CPU, then blit from that. This is the kind of path that should use the TFU, but I haven't exposed that hardware yet. Fixes dEQP-GLES3.functional.fbo.blit.default_framebuffer.*	2018-06-19 09:42:20 -07:00
Eric Anholt	07b243674f	v3d: Add missing always_flush debug flag. The #define existed and was checked in the driver.	2018-06-19 09:42:20 -07:00
Tomeu Vizoso	9b1cb50ba4	virgl: Remove debugging left-overs Some fprintfs were probably left unintentionally a few years ago and are a bit of a nuisance. Fixes: `2d3301e4d5` ("virgl: fix reference counting of prime handles") Cc: Rob Herring <robh@kernel.org> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 13:35:13 +02:00
Timothy Arceri	6c243ac2dd	glsl: fix desktop glsl linking regression The prog->Shaders[i]->IsES check was accidentally removed causing ES linking rules to be applied to desktop GLSL. Fixes: `725b1a406d` ("mesa/util: add allow_glsl_relaxed_es driconfig override")	2018-06-19 17:58:05 +10:00
Timothy Arceri	a9114b5e3e	util: add allow_glsl_relaxed_es to drirc for Google Earth VR Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	725b1a406d	mesa/util: add allow_glsl_relaxed_es driconfig override This relaxes a number of ES shader restrictions allowing shaders to follow more desktop GLSL like rules. This initial implementation relaxes the following: - allows linking ES shaders with desktop shaders - allows mismatching precision qualifiers - always enables standard derivative builtins These relaxations allow Google Earth VR shaders to compile. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	781c23ece6	util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	90dbab0f9a	mesa/util: add allow_glsl_builtin_const_expression driconf override Google Earth VR shaders use builtins in constant expressions with GLSL 1.10. That feature wasn't allowed until GLSL 1.20. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	de93f546a7	util: manually extract the program name from program_invocation_name Glibc has the same code to get program_invocation_short_name. However for some reason the short name gets mangled for some wine apps. For example with Google Earth VR I get: program_invocation_name: "/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe" program_invocation_short_name: "e" Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-19 12:09:56 +10:00
Bas Nieuwenhuizen	1a8501a9dd	ac/surface: Set compressZ for stencil-only surfaces. We HTILE compress stencil-only surfaces too. CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-19 02:52:01 +02:00
Jason Ekstrand	0146d79636	anv: Use a single global API patch version The Vulkan API has only one patch version shared among all of the major.minor versions. We should also advertise the same patch version regardless of major.minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106941 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 17:11:52 -07:00
Timothy Arceri	68bf94a8b0	radeonsi: enable OpenGL 3.3 compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-19 09:21:33 +10:00
Timothy Arceri	89a5d6f715	mesa: add ff fragment shader support for geom and tess shaders This is required for compatibility profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-19 09:21:33 +10:00
Eric Anholt	e636199c1c	v3d: Set the SO offsets correctly if we have to re-emit. This should fix TF across a glFlush() or TF pause/restart. Fixes dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float and many, many others.	2018-06-18 14:54:16 -07:00
Marek Olšák	94178044d5	gallium/hud: = should rename the last added data source Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-18 17:53:15 -04:00
Rafael Antognolli	ba2c18763b	anv: Disable constant buffer 0 being relative. If we are on gen8+ and have context isolation support, just make that constant buffer address be absolute, so we can use it for push UBOs too. v2: Do not duplicate constant_buffer_0_is_relative flag (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-18 14:41:38 -07:00
Rafael Antognolli	be18d5a0ce	anv/device: Check for kernel support of context isolation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 14:41:38 -07:00
Rafael Antognolli	056214ebfc	intel/genxml: Add bitmasks for CS_DEBUG_MODE2/INSTPM. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 14:41:38 -07:00
Alok Hota	a678f40e46	swr/rast: Clang-Format most rasterizer source code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-18 13:57:38 -05:00
Eric Engestrom	d85fef1e34	radv: fix reported number of available VGPRs It's a bit late to round up after an integer division. Fixes: `de88979413` "radv: Implement VK_AMD_shader_info" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com>	2018-06-18 17:08:22 +01:00
Eric Engestrom	9a4bd6b45f	mesa: add missing return in error path Fixes: `67f40dadaa` "mesa: add support for ARB_sample_locations" Cc: Rhys Perry <pendingchaos02@gmail.com> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-18 16:19:48 +01:00
Bas Nieuwenhuizen	a3d93eec7c	radv: Use less conservative approximation for context rolls. Drops the number of times we set the scissor by 4x for F1 2017, which results in a consistent performance improvement of about 4%. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-18 16:21:10 +02:00
Eric Engestrom	4d08c1e7d1	radv: fix bitwise check Fixes: `922cd38172` "radv: implement out-of-order rasterization when it's safe on VI+" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-18 12:15:18 +01:00
Eric Engestrom	e8eb84826e	meson: fix i965/anv/isl genX static lib names Shouldn't make any functional difference, just that `liblibanv_gen90.a` will now be called `libanv_gen90.a`. Fixes: `3218056e0e` "meson: Build i965 and dri stack" Fixes: `d1992255bb` "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-18 12:03:24 +01:00
Timothy Arceri	66673bef94	mesa: Unconditionally enable floating-point textures ARB_texture_float references US Patent #6,650,327 [1] which has a filing date of June 16 1998. According to [2], patents filed after 1995 expire 20 years from the filing date, giving an expiration of June 17 2018. [1] https://www.google.com/patents/US6650327 [2] https://en.wikipedia.org/wiki/Term_of_patent_in_the_United_States Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-18 09:29:38 +10:00
Jose Maria Casanova Crespo	b8e099e7d5	intel/fs: shuffle_64bit_data_for_32bit_write is not used anymore Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a4965842d6	intel/fs: Use new shuffle_32bit_write for all 64-bit storage writes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a4d445b93c	intel/fs: shuffle_32bit_load_result_to_64bit_data is not used anymore Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	71b319a285	intel/fs: Use shuffle_from_32bit_read for 64-bit FS load_input As the previous use of shuffle_32bit_load_result_to_64bit_data had a source/destination overlap for 64-bit. Now a temporary destination is used for 64-bit cases to use shuffle_from_32bit_read that doesn't handle src/dst overlaps. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	8003ae87f4	intel/fs: shuffle_from_32bit_read at load_per_vertex_input at TCS/TES Previously, the shuffle function had a source/destination overlap that needs to be avoided to use shuffle_from_32bit_read. As we can use for the shuffle destination the destination of removed MOVs. This change also avoids the internal MOVs done by the previous shuffle to deal with possible overlaps. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	5565630f85	intel/fs: Use shuffle_from_32bit_read at VS load_input shuffle_from_32bit_read manages 32-bit reads to 32-bit destination in the same way as the previous loop, so now we just call the new function for all bitsizes, simplifying also the 64-bit load_input. v2: Add comment about future 16-bit support (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	152bffb69b	intel/fs: Use shuffle_from_32bit_read for 64-bit gs_input_load This implementation avoids two unneeded MOVs for each 64-bit component. One was done in the old shuffle, to avoid cases of src/dst overlap but this is not the case. And the removed MOV was already being done in the shuffle. Copy propagation wasn't able to remove them because shuffle destination values are defined with partial writes because they have stride == 2. v2: Reword commit log summary (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	8b26a2d96d	intel/fs: shuffle_from_32bit_read for 64-bit do_untyped_vector_read do_untyped_vector_read is used at load_ssbo and load_shared. The previous MOVs are removed because shuffle_from_32bit_read can handle storing the shuffle results in the expected destination just using the proper offset. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	c2297bdf19	intel/fs: Remove old 16-bit shuffle/unshuffle functions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	fd3d8a8f79	intel/fs: Use shuffle_for_32bit_write for 16-bits store_ssbo Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	20e4732f7d	intel/fs: Use shuffle_from_32bit_read to read 16-bit SSBO Using shuffle_from_32bit_read instead of 16-bit shuffle functions avoids the need of retype. At the same time, the new functions are ready for 8-bit type SSBO reads. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a0891eabca	intel/fs: Use shuffle_from_32bit_read at VARYING_PULL_CONSTANT_LOAD shuffle_from_32bit_read can manage the shuffle/unshuffle needed for different 8/16/32/64 bit-sizes at VARYING PULL CONSTANT LOAD. To get the specific component the first_component parameter is used. In the case of the previous 16-bit shuffle, the shuffle operation was generating unneeded MOVs whose results were never used. This behaviour went unnoticed on SIMD16 because the dead_code_eliminate pass removed the generated instructions, but for SIMD8 they couldn't be removed because they were partial writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	22c654941b	intel/fs: New shuffle_for_32bit_write and shuffle_from_32bit_read These new shuffle functions deal with the shuffle/unshuffle operations needed for read/write operations using 32-bit components when the read/written components have a different bit-size (8, 16, 64-bits). Shuffle from 32-bit to 32-bit becomes a simple MOV. shuffle_src_to_dst takes care of doing a shuffle when source type is smaller than destination type and an unshuffle when source type is bigger than destination. So these new read/write functions just need to call shuffle_src_to_dst assuming that writes use a 32-bit destination and reads use a 32-bit source. As shuffle_for_32bit_write/from_32bit_read components take components in unit of source/destination types and shuffle_src_to_dst takes units of the smallest type component, we adjust components and first_component parameters. To enable these new functions, there must be no source/destination overlap in the case of shuffle_from_32bit_read. That never happens on shuffle_for_32bit_write as it allocates a new destination register as it was at shuffle_64bit_data_for_32bit_write. v2: Reword commit log and add comments to explain why first_component and components parameters are adjusted. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a5665056e5	intel/fs: general 8/16/32/64-bit shuffle_src_to_dst function This new function takes care of shuffle/unshuffle components of a particular bit-size in components with a different bit-size. If source type size is smaller than destination type size the operation needed is a component shuffle. The opposite case would be an unshuffle. Component units are measured in terms of the smaller type between source and destination. As we are un/shuffling the smaller components from/into a bigger one. The operation allows to skip first_component number of components from the source. Shuffle MOVs are retyped using integer types avoiding problems with denorms and float types if source and destination bitsize is different. This allows to simplify uses of shuffle functions that are dealing with these retypes individually. Now there is a new restriction so source and destination can not overlap anymore when calling this shuffle function. Following patches that migrate to use this new function will take care individually of avoiding source and destination overlaps. v2: (Jason Ekstrand) - Rewrite overlap asserts. - Manage type_sz(src.type) == type_sz(dst.type) case using MOVs from source to dest. This works for 64-bit to 64-bits operation that on Gen7 as it doesn't support Q registers. - Explain that components units are based in the smallest type. v3: - Fix unshuffle overlap assert (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Fonseca	d882331f7a	appveyor: Consume LLVM 5.0.1. https://ci.appveyor.com/project/jrfonseca/mesa/build/47 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-16 18:09:20 +01:00
Bas Nieuwenhuizen	c4714f698b	ac: Clear meminfo to avoid valgrind warning. Somehow valgrind misses that the value is initialized by the ioctl. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-16 19:03:47 +02:00
Samuel Pitoiset	5917761e3d	radv: fix emitting the TCS regs on GFX9 The primitive ID is NULL and this generates an invalid select instruction which crashes because one operand is NULL. This fixes crashes in The Long Journey Home, Quantum Break and Just Cause 3 with DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-16 10:18:51 +02:00
Ian Romanick	355868dbfc	nir: Document a couple instances of parent_instr nir_ssa_def::parent_instr and nir_src::parent_instr have the same name, but they mean really different things. I choose to save the next person the hour+ that I just spent figuring that out. Even now that I know, I doubt I'd notice in code review that someone typed foo->parent_instr when they actually meant foo->ssa->parent_instr. v2: Minor wording tweak in nir_ssa_def::parent_instr. Suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-15 17:36:51 -07:00
Ian Romanick	4467040cb6	i965/fs: Propagate conditional modifiers from not instructions Skylake total instructions in shared programs: 14399081 -> 14399010 (<.01%) instructions in affected programs: 26961 -> 26890 (-0.26%) helped: 57 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.16% max: 0.80% x̄: 0.30% x̃: 0.18% 95% mean confidence interval for instructions value: -1.50 -0.99 95% mean confidence interval for instructions %-change: -0.35% -0.25% Instructions are helped. total cycles in shared programs: 532978307 -> 532976050 (<.01%) cycles in affected programs: 468629 -> 466372 (-0.48%) helped: 33 HURT: 20 helped stats (abs) min: 3 max: 360 x̄: 116.52 x̃: 98 helped stats (rel) min: 0.06% max: 3.63% x̄: 1.66% x̃: 1.27% HURT stats (abs) min: 2 max: 172 x̄: 79.40 x̃: 43 HURT stats (rel) min: 0.04% max: 3.02% x̄: 1.48% x̃: 0.44% 95% mean confidence interval for cycles value: -81.29 -3.88 95% mean confidence interval for cycles %-change: -1.07% 0.12% Inconclusive result (%-change mean confidence interval includes 0). All Gen6+ platforms, except Ivy Bridge, had similar results. (Haswell shown) total instructions in shared programs: 12973897 -> 12973838 (<.01%) instructions in affected programs: 25970 -> 25911 (-0.23%) helped: 55 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.07 x̃: 1 helped stats (rel) min: 0.16% max: 0.62% x̄: 0.28% x̃: 0.18% 95% mean confidence interval for instructions value: -1.14 -1.00 95% mean confidence interval for instructions %-change: -0.32% -0.24% Instructions are helped. 
total cycles in shared programs: 410355841 -> 410352067 (<.01%) cycles in affected programs: 578454 -> 574680 (-0.65%) helped: 47 HURT: 5 helped stats (abs) min: 3 max: 360 x̄: 85.74 x̃: 18 helped stats (rel) min: 0.05% max: 3.68% x̄: 1.18% x̃: 0.38% HURT stats (abs) min: 2 max: 242 x̄: 51.20 x̃: 4 HURT stats (rel) min: <.01% max: 0.45% x̄: 0.15% x̃: 0.11% 95% mean confidence interval for cycles value: -104.89 -40.27 95% mean confidence interval for cycles %-change: -1.45% -0.66% Cycles are helped. Ivy Bridge total instructions in shared programs: 11679351 -> 11679301 (<.01%) instructions in affected programs: 28208 -> 28158 (-0.18%) helped: 50 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.54% x̄: 0.23% x̃: 0.16% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.19% Instructions are helped. total cycles in shared programs: 257445362 -> 257444662 (<.01%) cycles in affected programs: 419338 -> 418638 (-0.17%) helped: 40 HURT: 3 helped stats (abs) min: 1 max: 170 x̄: 65.05 x̃: 24 helped stats (rel) min: 0.02% max: 3.51% x̄: 1.26% x̃: 0.41% HURT stats (abs) min: 2 max: 1588 x̄: 634.00 x̃: 312 HURT stats (rel) min: 0.05% max: 2.97% x̄: 1.21% x̃: 0.62% 95% mean confidence interval for cycles value: -97.96 65.41 95% mean confidence interval for cycles %-change: -1.56% -0.62% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. v2: Move 'if (cond != BRW_CONDITIONAL_Z && cond != BRW_CONDITIONAL_NZ)' check outside the loop. Suggested by Iago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	f2d8bb7a7b	i965/fs: Rearrange code to remove most of the gotos Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	77f269bb56	i965/fs: Refactor propagation of conditional modifiers from compares to adds Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	22f9fbc0d9	i965/vec4: Optimize OR with 0 into a MOV All of the affected shaders are geometry shaders... the same ones from the similar fs changes. The "No changes on any other platforms" comment below is not quite right. Without the previous change to register coalescing, this optimization caused quite a few regressions in tests that either used gl_ClipVertex or used different interpolation modes. I observed that with both patches applied, glsl-1.10/execution/interpolation/interpolation-none-gl_BackSecondaryColor-smooth-vertex.shader_test was one instruction shorter. I suspect other shaders would be similarly affected. Since this is all based on NOS, shader-db does not reflect it. Haswell total instructions in shared programs: 12954955 -> 12954918 (<.01%) instructions in affected programs: 3603 -> 3566 (-1.03%) helped: 37 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.21% max: 2.50% x̄: 1.99% x̃: 2.50% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.30% -1.69% Instructions are helped. total cycles in shared programs: 410012108 -> 410012098 (<.01%) cycles in affected programs: 3540 -> 3530 (-0.28%) helped: 5 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -0.28% -0.28% Cycles are helped. Ivy Bridge total instructions in shared programs: 11679387 -> 11679351 (<.01%) instructions in affected programs: 3292 -> 3256 (-1.09%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.21% max: 2.50% x̄: 2.04% x̃: 2.50% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.34% -1.74% Instructions are helped. No changes on any other platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	e6a9bd97b9	i965/vec4: Don't register coalesce into source of VS_OPCODE_UNPACK_FLAGS_SIMD4X2 This prevents regressions in a bunch of clipping and interpolation tests caused by the next patch (i965/vec4: Optimize OR with 0 into a MOV). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	284b563fb0	i965/fs: Optimize OR with 0 into a MOV fs_visitor::set_gs_stream_control_data_bits generates some code like "control_data_bits \| stream_id << ((2 * (vertex_count - 1)) % 32)" as part of EmitVertex. The first time this (dynamically) occurs in the shader, control_data_bits is zero. Many times we can determine this statically and various optimizations will collaborate to make one of the OR operands literal zero. Converting the OR to a MOV usually allows it to be copy-propagated away. However, this does not happen in at least some shaders (in the assembly output of shaders/closed/UnrealEngine4/EffectsCaveDemo/301.shader_test, search for shl). All of the affected shaders are geometry shaders. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14375452 -> 14375413 (<.01%) instructions in affected programs: 6422 -> 6383 (-0.61%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.14% max: 2.56% x̄: 1.91% x̃: 2.56% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.26% -1.57% Instructions are helped. total cycles in shared programs: 531981179 -> 531980555 (<.01%) cycles in affected programs: 27493 -> 26869 (-2.27%) helped: 39 HURT: 0 helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16 helped stats (rel) min: 0.60% max: 7.92% x̄: 5.94% x̃: 7.92% 95% mean confidence interval for cycles value: -16.00 -16.00 95% mean confidence interval for cycles %-change: -6.98% -4.90% Cycles are helped. No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Eric Anholt	4106f6ce54	v3d: Handle a no-intersection scissor even if it's outside of the VP. The min/maxes ended up producing a negative clip width/height for dEQP-GLES3.functional.fragment_ops.scissor.outside_render_line. Just make sure they stay at 0 (or v3d 3.x's workaround) if that happens.	2018-06-15 16:09:39 -07:00
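The clamping idea in commit `4106f6ce54` can be illustrated with a minimal sketch. This is not the v3d driver code: `overlap_extent` is a hypothetical helper, it works on a single axis, and it ignores the v3d 3.x workaround that the commit mentions. It only shows why taking the max of the minimums and the min of the maximums can yield a negative size when the two ranges do not intersect, and how clamping keeps the result at 0.

```c
#include <assert.h>

/* Hypothetical helper (not from Mesa): length of the overlap between the
 * half-open ranges [a_min, a_max) and [b_min, b_max) along one axis.
 * When the ranges do not intersect, hi - lo would be negative, so the
 * result is clamped to 0 instead. */
static unsigned overlap_extent(int a_min, int a_max, int b_min, int b_max)
{
    assert(a_min <= a_max && b_min <= b_max);

    int lo = a_min > b_min ? a_min : b_min;   /* max of the minimums */
    int hi = a_max < b_max ? a_max : b_max;   /* min of the maximums */

    return hi > lo ? (unsigned)(hi - lo) : 0u;
}
```

For example, a viewport spanning [0, 100) and a scissor spanning [150, 200) would give `hi - lo = -50` without the clamp; with it, the extent is 0.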
Eric Anholt	9aa670e52a	v3d: Use the proper depth texture type for sampling. Fixes failing tests in dEQP-GLES3.functional.texture.shadow	2018-06-15 16:09:39 -07:00
Eric Anholt	778594ae12	v3d: Limit shader threading according to our maximum TMU fifo usage. Fixes simulator assertion failures in dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment and similar complicated cases.	2018-06-15 16:09:39 -07:00
Eric Anholt	e130ada243	v3d: Fix shaders using pixel center W but no varyings. The docs called this field "uses both center W and centroid W", but actually it's "do you need center W even if varyings don't obviously call for it?" Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w	2018-06-15 16:09:39 -07:00
Dylan Baker	0d4f338a11	docs: Update release-notes and calendar	2018-06-15 13:53:25 -07:00
Dylan Baker	3c454fc84a	docs: Add release notes for 18.1.2	2018-06-15 13:52:44 -07:00
Rafael Antognolli	9e1f208795	intel/aubinator: Use int to store getopt_long flags. getopt_long flag parameter is an int pointer, so if we use bool to store those values, when getopt_long writes to one of them, it might end up overwriting the next one. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 09:03:10 -07:00
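A small sketch of the pattern behind commit `9e1f208795`, assuming a POSIX/glibc environment. The option names and the `parse_flags` helper are illustrative and are not taken from aubinator. The `flag` member of `struct option` is an `int *`, so `getopt_long` stores a full `int` through it; pointing it at a `bool` would let that store spill into neighbouring memory.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Illustrative flags (hypothetical names). They are declared as int because
 * getopt_long writes sizeof(int) bytes through the flag pointer. */
static int opt_help = 0;
static int opt_verbose = 0;

static const struct option long_opts[] = {
    { "help",    no_argument, &opt_help,    1 },
    { "verbose", no_argument, &opt_verbose, 1 },
    { NULL, 0, NULL, 0 }
};

/* Parse the long options in argv; each matched option sets its flag to 1. */
static void parse_flags(int argc, char **argv)
{
    assert(argc >= 1);

    opt_help = 0;
    opt_verbose = 0;
    optind = 1;   /* restart scanning so the helper can be called again */

    /* With a non-NULL flag pointer, getopt_long returns 0 per match and
     * -1 once all options have been consumed. */
    while (getopt_long(argc, argv, "", long_opts, NULL) != -1)
        ;
}
```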
Samuel Pitoiset	f8e2c4c57c	Revert "radv: always set/load both depth and stencil clear values" This fixes a rendering regression with RoTR. This reverts commit `4bdad9fadd`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 16:52:06 +02:00
Samuel Pitoiset	a2f6e72138	radv: don't check for linear images in emit_fast_color_clear() We don't enable CMASK for linear surfaces and addrlib only enables DCC for tiling surfaces. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:12 +02:00
Samuel Pitoiset	3befac52db	radv: allow RADV_PERFTEST=dccmsaa on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:10 +02:00
Samuel Pitoiset	bfca15e16a	radv: add RADV_DEBUG=checkir This allows running the LLVM verifier pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:08 +02:00
Samuel Pitoiset	706d51de7f	radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:06 +02:00
Samuel Pitoiset	fa8bc821a8	radv: clean up radv_{set,load}_depth_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:04 +02:00
Samuel Pitoiset	4bdad9fadd	radv: always set/load both depth and stencil clear values I don't think it matters much to emit both values, and that makes the code a bit simpler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:02 +02:00
Samuel Pitoiset	2193a6a828	radv: update the fast ds clear values only if the image is bound It's unnecessary to update the fast depth/stencil clear values if the fast cleared depth/stencil image isn't currently bound. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:00 +02:00
Samuel Pitoiset	be794fa26b	radv: clean up radv_{set,load}_color_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:53:58 +02:00
Samuel Pitoiset	d7b772abb4	radv: update the fast color clear values only if the image is bound It's unnecessary to update the fast color clear values if the fast cleared color image isn't currently bound. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:53:55 +02:00
Christian Gmeiner	efae127993	util/bitset: include util/macro.h BITSET_FFS(x) macro makes use of ARRAY_SIZE(x) macro which is defined in util/macro.h. Include it directly to make usage more straightforward. Fixes: `692bd4a1ab` ("util: replace Elements() with ARRAY_SIZE()") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-15 11:26:30 +01:00
Lukas Rusak	4cfc4cef80	meson: fix private libs when building without glx I noticed that the generated pkg-config files will include glx and x11 dependencies even when x11 isn't a selected platform. This fixes the private libs and was tested by building kmscube V2: - check if gallium-xlib is being used for glx Fixes: `108d257a16` "meson: build libEGL" Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-15 10:43:22 +01:00
Rhys Perry	30f1ab7a59	docs: document addition of GL_ARB_sample_locations for nvc0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	66ca7e400b	nvc0: add support for programmable sample locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>	2018-06-14 20:09:45 -06:00
Rhys Perry	9f217facbd	st/mesa: add support for ARB_sample_locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	51a221e378	gallium: add support for programmable sample locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	67f40dadaa	mesa: add support for ARB_sample_locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Eric Anholt	cd2e673abc	v3d: Fix polygon offset for Z16 buffers. Fixes: dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units	2018-06-14 17:03:16 -07:00
Eric Anholt	d91e06a065	v3d: Fix configuration setup of mixed f32 and f16 render targets. Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.	2018-06-14 16:52:25 -07:00
Eric Anholt	6784aa9870	v3d: Don't set the first_ez_state to DISABLED if after only UNDECIDED draws. We need to have the RCL start with EZ enabled, since those undecided draws had EZ enabled. But we do need to update from UNDECIDED to LT or GT as necessary still. Fixes many simulator assertion fails in deqp fragment_ops/interaction/basic_shader/*	2018-06-14 16:52:25 -07:00
Eric Anholt	9080642449	v3d: Use the right size for v3d 4.x TEXTURE_SHADER_STATE BO. This doesn't really matter, since they both get rounded up to 4096.	2018-06-14 16:52:25 -07:00
Eric Anholt	31548187cf	v3d: Add static asserts for other packed packet sizes.	2018-06-14 16:52:25 -07:00
Eric Anholt	0eef4d7f8f	v3d: Fix the size of the packed attribute state. Fixes segfaults in dEQP-GLES3.functional.vertex_array_objects.all_attributes.	2018-06-14 16:52:25 -07:00
Eric Anholt	7d8fe50af3	v3d: Remove some unused context fields from vc4.	2018-06-14 16:52:25 -07:00
Eric Anholt	48011c42aa	v3d: Remove unused QUNIFORM_STENCIL left over from vc4.	2018-06-14 16:52:25 -07:00
Eric Anholt	4564537222	v3d: Use our #define for max attributes in shader caps.	2018-06-14 16:52:25 -07:00
Eric Anholt	a40bc33b11	v3d: Fix undefined results for a swap_color_rb RT from a float shader output. Fixes segfaults and undefined behavior in dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float	2018-06-14 16:52:25 -07:00
Dave Airlie	600d34c822	radv: remove multisample bit from shader key. This wasn't being used anywhere inside the shader from what I can see. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 09:33:20 +10:00
Kenneth Graunke	f6898f2b55	intel/compiler: Properly consider UBO loads that cross 32B boundaries. The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset. For example, if a UBO contained the following, tightly packed: vec4 a; // [0, 16) float b; // [16, 20) vec4 c; // [20, 36) then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem. v2: Rewrite the accounting, my calculations were wrong. v3: Write a comment about partial values (requested by Jason). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v3]	2018-06-14 14:58:59 -07:00
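The accounting described in commit `f6898f2b55` can be sketched as follows. This is a minimal illustration, not the pass from the Intel compiler: `chunks_touched` and `CHUNK_SIZE` are hypothetical names, and the sketch assumes the range lies within the first 64 chunks. It marks every 32B chunk that a byte range overlaps, instead of only the chunk containing the starting offset.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE 32u   /* bytes per chunk, as in the commit message */

/* Hypothetical helper: return a bitmask with one bit set for every 32B
 * chunk overlapped by the byte range [offset, offset + size).
 * Assumes size > 0 and that the range ends within the first 64 chunks. */
static uint64_t chunks_touched(unsigned offset, unsigned size)
{
    assert(size > 0);

    unsigned first = offset / CHUNK_SIZE;
    unsigned last = (offset + size - 1) / CHUNK_SIZE;   /* last byte's chunk */
    assert(last < 64);

    unsigned count = last - first + 1;
    /* Avoid shifting a 64-bit value by 64, which is undefined in C. */
    uint64_t mask = (count == 64) ? ~0ull : ((1ull << count) - 1);

    return mask << first;
}
```

With the layout from the commit message, `vec4 c` occupies [20, 36), so `chunks_touched(20, 16)` sets bits 0 and 1, whereas recording only `20 / 32 = 0` would miss the second chunk.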
Ian Romanick	37bd9ccd21	glsl: Don't copy propagate elements from SSBO or shared variables either Since SSBOs can be written by a different GPU thread, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower. The same shader was helped by this patch and the previous. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14399119 -> 14399113 (<.01%) instructions in affected programs: 683 -> 677 (-0.88%) helped: 1 HURT: 0 total cycles in shared programs: 532973113 -> 532971865 (<.01%) cycles in affected programs: 524666 -> 523418 (-0.24%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774	2018-06-14 11:28:12 -07:00
Ian Romanick	461a5c899c	glsl: Don't copy propagate from SSBO or shared variables either Since SSBOs can be written by other GPU threads, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14399120 -> 14399119 (<.01%) instructions in affected programs: 684 -> 683 (-0.15%) helped: 1 HURT: 0 total cycles in shared programs: 532978931 -> 532973113 (<.01%) cycles in affected programs: 530484 -> 524666 (-1.10%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774	2018-06-14 11:26:33 -07:00
Lukas Rusak	1d92d6486a	meson: only build vl_winsys_dri.c when x11 platform is used This seems to have been missed in the move from autotools This fixes the following build issue: ../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory #include <X11/Xlib-xcb.h> ^~~~~~~~~~~~~~~~ Fixes: `b1b65397d0` ("meson: Build gallium auxiliary") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-14 10:34:51 -07:00
Brian Paul	b9e6438adf	st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit() To silence compiler warning about unhandled switch cases. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-06-14 11:29:51 -06:00
Bas Nieuwenhuizen	41dabdc475	radv: Fix output for sparse MRTs. We need to init the cb_shader_format correctly with the changed col_format, so this moves the col_format adjustment to before the cb_shader_mask gets generated. Fixes: `06d3c65098` "radv: fix a GPU hang when MRTs are sparse" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903 CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-14 11:48:24 +02:00
Samuel Pitoiset	68dead112e	radv: update the ZRANGE_PRECISION value for the TC-compat bug On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels. The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0. We also need to set the depth clear regs and to update ZRANGE_PRECISION when initializing a TC-compat depth image to 0. Original patch from James Legg. This fixes random CTS fails with dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-14 11:38:29 +02:00
Samuel Iglesias Gonsálvez	183adc51f8	anv: reduce maxFragmentInputComponents If the application asks for the maximum number of fragment input components (128) and uses all of them plus some builtins that are passed in the VUE, then we exceed the maximum number of used VUE slots (32) and we break one assert that checks this limit. Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1 builtins in brw_compute_vue_map() because we don't know if gl_ClipDistance is going to be read or written by an adjacent stage. Fixes VK-GL-CTS CL#2569. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-14 09:54:28 +02:00
Marek Olšák	6d671078a8	radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers This fixes: GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-13 22:00:12 -04:00
Marek Olšák	a4312742a5	radeonsi/gfx9: update & clean up a DPBB heuristic Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:43 -04:00
Marek Olšák	47b780be21	radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug This may not be needed yet, but let's set it now. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:42 -04:00
Marek Olšák	a152ca70f2	radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:40 -04:00
Marek Olšák	cd0be6cdc8	radeonsi/gfx9: update bin sizes This is based on our docs (recently updated), not amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:39 -04:00
Marek Olšák	2f51081a93	radeonsi/gfx9: update primitive binning code for EQAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:37 -04:00
Marek Olšák	22e994bb75	radeonsi: assume that rasterizer state is non-NULL in draw_vbo Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:36 -04:00
Marek Olšák	f3b3ee6974	radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:34 -04:00
Marek Olšák	d6974feb90	radeonsi: move the guardband registers into a separate state atom They have a different frequency of updates and don't change when scissors change. I think this even fixes something in si_update_vs_viewport_state. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:31 -04:00
Marek Olšák	68b1c669e7	radeonsi/gfx9: implement the scissor bug workaround without performance drop This might improve performance on Vega10 and Raven. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:27 -04:00
Marek Olšák	73b0d10152	radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:25 -04:00
Marek Olšák	28ee825e19	radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:23 -04:00
Marek Olšák	99e0ba6868	radeonsi: record CLIPVERTEX output usage properly for compatibility profiles This was missed when adding CLIPVERTEX support into GS & tess. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:20 -04:00
Marek Olšák	47a57a709d	radeonsi: fix FBFETCH with 2D MSAA arrays Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:17 -04:00
Marek Olšák	e5e57c3a5e	ac: handle undefined EQAA samples in ac_apply_fmask_to_sample RADV might wanna use this helper too. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:12 -04:00
Marek Olšák	a2d4c8ff6d	radeonsi: return real memory usage instead of per-process usage Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 21:47:36 -04:00
Marek Olšák	95ecde42eb	ac/gpu_info: report real total memory sizes The change from MIN2 to MAX2 is intentional. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 21:47:36 -04:00
Dave Airlie	f11b664f48	docs: mark virgl GL 4.0 features as complete. virgl should now expose GL4.1 where it can.	2018-06-14 10:38:11 +10:00
Dave Airlie	7b6f2704eb	virgl: add ARB_tessellation_shader support. (v2) This should add all the pieces to enable tess shaders on virgl. v2: fixup transform to handle tess and strip out precise. set default for max patch varyings to work around issue when tess gets enabled from v1 caps but v2 caps aren't in place. (Elie) Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-06-14 10:36:31 +10:00
Dave Airlie	babd1d526b	glsl: allow standalone semicolons outside main() GLSL 4.60 officially added this but games and older CTS suites actually had shaders that did this, so we may as well enable it everywhere. Adding stable because it appears apps in the wild do this. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2018-06-14 10:21:51 +10:00
Samuel Pitoiset	51e23d3419	radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8 This causes rendering issues in Shadow Warrior 2 with DXVK. Cc: mesa-stable@lists.freedesktop.org Fixes: `ccc64f3133` ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 20:30:04 +02:00
Andrew Galante	baf16b2ea3	configure.ac: Test for __atomic_add_fetch in atomic checks Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them. Bug: https://bugs.gentoo.org/655616 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
Andrew Galante	9d547a7617	meson: Test for __atomic_add_fetch in atomic checks Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them. Bug: https://bugs.gentoo.org/655616 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
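A rough sketch of what the two checks above (`baf16b2ea3` and `9d547a7617`) need to exercise, assuming a GCC-compatible compiler that provides the `__atomic` builtins. It is not the test program from configure.ac or meson.build, and `probe_atomics` is a hypothetical name. The point is that a probe has to use both 64-bit builtins, because a platform may provide `__atomic_load_n` inline while still needing `-latomic` for `__atomic_add_fetch`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe: uses both builtins on a 64-bit value, so a build or
 * link failure in either one would show up. Returns the incremented value. */
static uint64_t probe_atomics(uint64_t *v)
{
    uint64_t before = __atomic_load_n(v, __ATOMIC_ACQUIRE);
    uint64_t after = __atomic_add_fetch(v, (uint64_t)1, __ATOMIC_ACQ_REL);

    /* Holds here only because nothing else touches *v concurrently. */
    assert(after == before + 1);
    return after;
}
```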
Matt Turner	b29b5a82a1	meson: Fix -latomic check Commit `54ba73ef10` (configure.ac/meson.build: Fix -latomic test) fixed some checks for -latomic, and then commit `54bbe600ec` (configure.ac: rework -latomic check) further extended the fixes in configure.ac but not in Meson. This commit extends those fixes to the Meson tests. Fixes: `54bbe600ec` (configure.ac: rework -latomic check) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
Dylan Baker	9cc577761f	meson: Remove various completed todos v3: - Remove "won't do" todos, so only completed todo's are now removed. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)	2018-06-13 10:07:03 -07:00
Dylan Baker	0ce3f3538b	meson: Make use of optional modules meson 0.43 gained support for optional modules, which clover would like to use. Since we require 0.44.1 now we can rely on them being available for clover. compile tested only. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:58 -07:00
Dylan Baker	34bbb24ce7	meson: Add support for ppc assembly/optimizations v2: - Use -mpower8-vector in compiler test for altivec - rename altivec option to power8 - reword power8 option description to be more clear, originally I had made it a boolean, but replaced it with an auto option. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:54 -07:00
Dylan Baker	e26af22143	meson: Add support for SPARC assembly This was blindly copied from autotools and tested by a helpful gentoo user. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:25 -07:00
Dylan Baker	6eaa013685	meson: Set include dirs for asm v2: - split this from the next patch - Only include x86-64 and not x86 when building x86_64 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:23 -07:00
Dylan Baker	65e447c5df	meson: move cc and cpp definitions to top of main meson.build This just makes using cc and cpp easier. v2: - Add this patch to fix altivec Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:16 -07:00
Jason Ekstrand	51376cd749	Revert "intel/compiler: Properly consider UBO loads that cross 32B boundaries." This reverts commit `b8fa847c2e`. This broke about 30k Vulkan CTS tests.	2018-06-13 09:23:55 -07:00
Kenneth Graunke	b8fa847c2e	intel/compiler: Properly consider UBO loads that cross 32B boundaries. The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset. For example, if a UBO contained the following, tightly packed: vec4 a; // [0, 16) float b; // [16, 20) vec4 c; // [20, 36) then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-06-13 02:07:58 -07:00
Ross Burton	3c288da5ee	drivers/dri/i965: add missing #include brw_bufmgr.h uses time_t without including time.h, so the build fails under musl. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-12 12:08:30 +01:00
Mauro Rossi	fb9ab2fbd3	anv/android: Use an address for each anv_image plane Fixes to avoid building error after change in image->planes[] structure, {bo,bo_offset} has to be replaced by address.{bo,offset} and update is needed also in the assert() for debug builds. external/mesa/src/intel/vulkan/anv_android.c:188:21: error: no member named 'bo' in 'struct anv_image::(anonymous at external/mesa/src/intel/vulkan/anv_private.h:2647:4)' image->planes[0].bo = bo; ~~~~~~~~~~~~~~~~ ^ 1 error generated. Fixes: `bf34ef16ac` ("anv: Use an address for each anv_image plane") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-12 11:17:43 +03:00
Mauro Rossi	a1220e7311	anv/android: Set the BO flags in bo_cache_import (v2) Changes to avoid building error: external/mesa/src/intel/vulkan/anv_android.c:131:72: error: too few arguments to function call, expected 5, have 4 result = anv_bo_cache_import(device, &device->bo_cache, dma_buf, &bo); ~~~~~~~~~~~~~~~~~~~ ^ 1 error generated. (v2) Set the correct bo_flags based on support of 48bit addresses and soft-pin Fixes: `b0d50247a7` ("anv/allocator: Set the BO flags in bo_cache_alloc/import") Fixes: `e7d0378bd9` ("anv: Soft-pin client-allocated memory") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-12 11:16:39 +03:00
Kenneth Graunke	0d5329d626	anv: Disable __gen_validate_value if NDEBUG is set. We were enabling undefined memory checking for genxml values based on Valgrind being installed at build time, even for release builds. This generates piles and piles of assembly whenever you touch genxml. With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind installed at build time: text data bss dec hex filename 5978385 262884 13488 6254757 5f70a5 libvulkan_intel.so 3799377 262884 13488 4075749 3e30e5 libvulkan_intel.so That's a 36% reduction in text size. Fixes: `047ed02723` (vk/emit: Use valgrind to validate every packed field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-11 14:55:32 -07:00
Eric Engestrom	06e8771dec	README: wording fix for previous commit Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:34:58 +01:00
Eric Engestrom	d9f54dceca	README: add link to WhosWho for IRC nicks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:33:12 +01:00
Eric Engestrom	eadc068406	add project README Now that we're using GitLab, let's take advantage of the "landing page" README feature with some minimal information, mostly to point people to the right resources. Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:02:35 +01:00
Eric Engestrom	e43c012433	i965: fix resource leak v2: intel_miptree_release() already takes care of the planes, no need to hand-code the loop (Lionel) Coverity ID: 1436909 Fixes: `3352f2d746` "i965: Create multiple miptrees for planar YUV images" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2018-06-11 14:54:23 +01:00
Rob Clark	55d1a77c29	freedreno/ir3: use pipe_image_view's cpp At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	9bb90a3255	freedreno/ir3: fix image dimensions offset copy-pasta fail from how SSBO sizes are handled. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	e9fc9c16c9	freedreno/a5xx: correct image/ssbo offset Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	132e5b0b34	freedreno/ir3: use saml always if we have lod In some cases we get plain tex opcodes (but w/ a lod argument).. in this case always use the saml instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	cf5dda3349	freedreno/ir3: don't cp absneg into meta:fi If using a fanin (collect) to collect consecutive registers together, we can CP mov's into the fanin, but not (abs) or (neg). No places that allow those modifiers are consuming a fanin anyways. But this caused an absneg to be lost between a ldgb and stgb for shaders like: outputs[n] = abs(input[n]) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	39e7a39e91	freedreno/ir3: rework size/type conversion instructions With 8b and 16b, there are a lot more to handle. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	a52e698219	freedreno/ir3: propagate HALF flag across fanout If we have a fanout (split) meta instruction to split the result of a vector instruction, propagate the HALF flag back to the original instruction. Otherwise result ends up in a full precision register while instruction(s) that use the result look in a half-precision register. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	fc1690c9d9	freedreno/a5xx: add sample-id/sample-mask-in Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	619d2317cd	freedreno/ir3: add sample-id/sample-mask-in Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	a49c87956e	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	067d89c2cd	freedreno/ir3: image atomics use image-store path image reads are handled via tex state, whereas image writes and atomics are handled via SSBO state block. Previously we were only considering image write, and not image atomics which also uses the SSBO state block. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Kyle Brenneman	41642bdbca	egl/glvnd: Fix a segfault in eglGetProcAddress. If FindProcIndex in egldispatchstubs.c is called with a name that's less than the first entry in the array, it would end up trying to store an index of -1 in an unsigned integer, wrap around to 2^32, and then crash when it tries to look that up. Change FindProcIndex so that it uses bsearch(3) instead of implementing its own binary search, like the GLX equivalent FindGLXFunction does. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 12:17:07 +01:00
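The lookup approach named in commit `41642bdbca` can be sketched with the standard `bsearch(3)`. This is an illustration only: the name table, `compare_name`, and `find_proc_index` are hypothetical stand-ins for the generated table and `FindProcIndex` in egldispatchstubs.c. Because `bsearch` returns NULL on a miss, a name that sorts before the first entry cannot produce an out-of-range index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name table; it must be sorted in strcmp() order. */
static const char *const proc_names[] = {
    "eglBindAPI",
    "eglCreateContext",
    "eglGetProcAddress",
    "eglSwapBuffers",
};
#define NUM_PROCS (sizeof(proc_names) / sizeof(proc_names[0]))

/* bsearch comparator: key is the name, elem points at a table entry. */
static int compare_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the index of name in proc_names, or -1 if it is not present. */
static int find_proc_index(const char *name)
{
    assert(name != NULL);

    const char *const *hit = bsearch(name, proc_names, NUM_PROCS,
                                     sizeof(proc_names[0]), compare_name);
    return hit ? (int)(hit - proc_names) : -1;
}
```

Returning a signed index with an explicit -1 for "not found" keeps the miss case distinct from any valid slot, which is the failure the commit message describes for the hand-written binary search.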
Jordan Justen	e266b32059	mesa/program_binary: add implicit UseProgram after successful ProgramBinary Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810 Fixes: `b4c37ce214` "i965: Add ARB_get_program_binary support using nir_serialization" Ref: `3fe8d04a6d` "mesa: don't always set _NEW_PROGRAM when linking" Ref: `c505d6d852` "mesa: use gl_program for CurrentProgram rather than gl_shader_program" Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-10 21:12:46 -07:00
Dave Airlie	525cfe5dab	features.txt: update virgl GL4.1 status. All the features for GL4.1 are done (64-bit attribs were part of the fp64 enable). Once tessellation shaders land this will be advertised	2018-06-11 10:49:14 +10:00
Dave Airlie	77d7d7acab	virgl: enable ARB_gpu_shader_fp64 This enables ARB_gpu_shader_fp64 if the host provides it. Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-06-11 08:35:03 +10:00
Samuel Pitoiset	135e4d434f	radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold Workaround for bug in llvm that causes the GPU to hang in presence of nested loops because there is an exec mask issue. The proper solution is to fix LLVM but this might require a bunch of work. This fixes a bunch of GPU hangs that happen with DXVK. Vega10: Totals from affected shaders: SGPRS: 110456 -> 110456 (0.00 %) VGPRS: 122800 -> 122800 (0.00 %) Spilled SGPRs: 7478 -> 7478 (0.00 %) Spilled VGPRs: 36 -> 36 (0.00 %) Code Size: 9901104 -> 9922928 (0.22 %) bytes Max Waves: 7143 -> 7143 (0.00 %) Code size slightly increases because it inserts more branch instructions but that's expected. I don't see any real performance changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-09 14:16:49 +02:00
Samuel Pitoiset	94706f0de4	radv: fix missing ZRANGE_PRECISION(1) for GFX9+ ZRANGE_PRECISION(1) seems to be the default optimal value, but it was only set for VI and older chips. This fixes a rendering issue with Banished through DXVK, and might fix more than that. There is still the ZRANGE_PRECISION bug that we need to handle but that can be fixed later. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-09 10:57:01 +02:00
Gustavo Lima Chaves	7dfaf025c5	anv: enable VK_EXT_shader_stencil_export Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-08 11:16:01 -07:00
Gustavo Lima Chaves	7cc5178bba	spirv: add/hookup SpvCapabilityStencilExportEXT v2: An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior also follows, with the interpretation to said mode being we prevent writes to the built-in FragStencilRefEXT variable when the execution mode isn't set. v3: A more cautious reading of `1db44252d0` led me to a missing change that would stop (what I later discovered were) GPU hangs on the CTS test written to exercise this. v4: Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT mode into a warning, instead of trying to make the variable read-only. If we are to follow the originating extension on GL, the built-in variable in question should never be readable anyway. v5/v6: rebases. v7: Fix check for gen9 lost in rebase. (Ilia) Reduce the scope of the bool used to track whether SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info, moved to vtn_builder. (Jason) v8: Assert for fragment shader handling StencilRefReplacingEXT execution mode. (Caio) Remove warning logic, since an entry point might not have StencilRefReplacingEXT execution mode, but the global output variable might still exist for another entry point in the module. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-08 11:15:37 -07:00
Eric Anholt	22cc83cf87	travis: Add the v3d driver to the automake build. Hopefully this reduces the number of fixup commits we need for the automake build. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-08 09:50:38 -07:00
Eric Anholt	3db39d84d2	travis: Do our automake build tests with srcdir != builddir. This will catch many automake bugs that end-users get to experience first, otherwise. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-08 09:50:28 -07:00
Eric Engestrom	37eb56d239	autotools/meson: compile against wayland-egl-backend Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106861 Fixes: `1db4ec0546` "egl: rewire the build systems to use libwayland-egl" Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Andreas Hartmetz <ahartmetz@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-08 16:45:43 +01:00
Cameron Kumar	cb03803253	vulkan/wsi: Destroy swapchain images after terminating FIFO queues The queue_manager thread can access the images from x11_present_to_x11, hence this reorder prevents dereferencing of dangling pointers. Cc: "18.1" <mesa-stable@lists.freedesktop.org> Fixes: `e73d136a02` ("vulkan/wsi/x11: Implement FIFO mode.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-08 14:06:46 +01:00
Sonny Jiang	ce64c1b70a	radeonsi: emit_dpbb_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:40 -04:00
Sonny Jiang	7dcfa1f46e	radeonsi: emit_clip_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	06b47005d3	radeonsi: emit_msaa_sample_locs packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	a1b4b00ce2	radeonsi: emit_msaa_config packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	2bad413f55	radeonsi: emit_cb_render_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:25 -04:00
Sonny Jiang	43b0269ce3	radeonsi: emit_db_render_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:25 -04:00
Jan Vesely	d797f1f47e	drisw: Fix invalid pointer arithmetic Use of void * in pointer arithmetic is illegal, use char * instead. Fixes: `cf54bd5e83` ("drisw: use shared memory when possible") Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-06-07 21:01:29 -04:00
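The char * fix mentioned in the entry above follows a common C idiom, sketched here. The helper name offset_pointer is hypothetical and does not come from the drisw code. ISO C does not define arithmetic on void * (GCC accepts it only as an extension), so the pointer is cast to char * before the byte offset is added.

```c
#include <stddef.h>

/* Advance a generic pointer by a byte count. The cast to char * makes the
 * arithmetic well defined, because sizeof(char) is 1 by definition. */
static void *offset_pointer(void *base, size_t byte_offset)
{
    return (char *)base + byte_offset;
}
```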
Timothy Arceri	03c370d2f1	radeonsi: fix possible truncation on renderer string Fixes truncation warning in gcc 8.1 Fixes: `8539c9bf31` ("gallium/radeon: add the kernel version into the renderer string") Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-06-08 10:07:55 +10:00
Timothy Arceri	fae3b38770	ac: fix possible truncation of intrinsic name Fixes the gcc warning: ‘snprintf’ output between 26 and 33 bytes into a destination of size 32 Fixes: `d5f7ebda3e` ("ac: add LLVM build functions for subgroup instrinsics") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-08 09:24:15 +10:00
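The kind of truncation gcc warns about in the entry above can also be detected at run time, because snprintf() returns the length the full string would have had. The sketch below shows that check under stated assumptions: the function name format_intrinsic_name and the "llvm.example." prefix are hypothetical, and the actual fix in the ac code may simply have enlarged the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format a made-up intrinsic name into `out`. Returns true only when the
 * whole name, including the terminating NUL, fit into out_size bytes. */
static bool format_intrinsic_name(char *out, size_t out_size,
                                  const char *op, unsigned bits)
{
    int needed = snprintf(out, out_size, "llvm.example.%s.i%u", op, bits);
    return needed >= 0 && (size_t)needed < out_size;
}
```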
Bas Nieuwenhuizen	4fc2d5e141	amd/common: Fix number of coords for getlod. The LLVM 6 code reduced it to a non-array call. We need to do that with the new code too. This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.array for radv. Fixes: `a9a7993441` "amd/common: use the dimension-aware image intrinsics on LLVM 7+" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-07 23:59:52 +02:00
Dave Airlie	9be56316cf	features: add virgl to the GL features list This hopefully adds virgl to the correct places and current statuses of various extensions. virgl of course relies on two external things a) host driver that can support the features b) up to date host virglrenderer library that can support the features. This list will be maintained as latest (a) + (b) + mesa. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-06-08 07:34:53 +10:00
Matt Turner	a5abb2da74	meson: Add support for read-only text segment on x86 Port of `6dfc5e28f7` (configure.ac: Add support to enable read-only text segment on x86.) to Meson. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-07 14:16:44 -07:00
Dylan Baker	8f2421d73b	meson: work around gentoo applying -m32 to host compiler in cross builds Gentoo's ebuild system always adds -m32 to the compiler for doing x86_64 -> x86 cross builds, while meson expects it not to do that. This results in an x86 -> x86 cross build, and assembly gets disabled. Fixes: `2d62fc0646` ("meson: disable x86 asm in fewer cases.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-07 11:54:06 -07:00
Jason Ekstrand	e0fa239962	i965/screen: Sanity check that all formats we advertise are useable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	0e7f3febf7	i965/screen: Use RGBA non-sRGB formats for images Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly handle RGBX formats. Also, we don't want to make decisions based on those in the first place because we can't render to RGBA and we use the non-sRGB version to determine whether or not to allow CCS_E. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	a266934935	i965/screen: Return false for unsupported formats in query_modifiers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	eeae485149	i965/screen: Refactor query_dma_buf_formats This reworks it to work like query_dma_buf_modifiers and, in particular, makes it more flexible so that we can disallow a non-static set of formats. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	3b54dd87f7	intel/isl: Add bounds-checking assertions for the format_info table We follow the same convention as isl_format_get_layout in having two assertions to ensure that only valid formats are passed in. We also check against the array size of the table because some valid formats such as CCS formats may be past the end of the table. This fixes some potential out-of-bounds array access even in valid cases. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	778e2881a0	intel/isl: Add bounds-checking assertions in isl_format_get_layout We add two assertions instead of one because the first assertion that format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a real but unsupported enumerant while the second ensures that they don't pass in garbage values. We also update some other helpers to use isl_format_get_layout instead of using the table directly so that they get bounds checking too. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
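The two-assertion convention described in the entry above might look roughly like this. Every identifier here (example_format, example_layouts, example_format_get_layout) is hypothetical and only mirrors the shape of the idea, not the isl sources. The first assertion rejects the real but unsupported enumerant with a descriptive failure, and the second rejects garbage values that would index past the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format enum with a sentinel for "known but unsupported". */
enum example_format {
    EXAMPLE_FORMAT_R8,
    EXAMPLE_FORMAT_RG8,
    EXAMPLE_FORMAT_RGBA8,
    EXAMPLE_FORMAT_UNSUPPORTED = 0xffff
};

struct example_layout {
    unsigned bits_per_pixel;
};

static const struct example_layout example_layouts[] = {
    [EXAMPLE_FORMAT_R8]    = { 8 },
    [EXAMPLE_FORMAT_RG8]   = { 16 },
    [EXAMPLE_FORMAT_RGBA8] = { 32 },
};
#define EXAMPLE_LAYOUT_COUNT (sizeof(example_layouts) / sizeof(example_layouts[0]))

static const struct example_layout *
example_format_get_layout(enum example_format fmt)
{
    /* Descriptive check for the real-but-unsupported enumerant. */
    assert(fmt != EXAMPLE_FORMAT_UNSUPPORTED);
    /* Catch-all check against values outside the table. */
    assert((size_t)fmt < EXAMPLE_LAYOUT_COUNT);
    return &example_layouts[fmt];
}
```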
Dylan Baker	c267f46ef2	meson: Clarify why asm cannot be used in cross compile This makes the reasoning for why a cross compile is not using asm clearer (hopefully). v2: - fix typos Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 10:40:35 -07:00
Eric Engestrom	f436ae237b	docs: talk about Wayland instead of libwayland Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 18:06:40 +01:00
Jason Ekstrand	237c5ac4f9	anv: Set fence/semaphore types to NONE in impl_cleanup There were some places that were calling anv_semaphore_impl_cleanup and neither deleting the semaphore nor setting the type back to NONE. Just set it to NONE in impl_cleanup to avoid these issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643 Fixes: `031f57eba` "anv: Add a basic implementation of VK_KHX_external..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 09:46:45 -07:00
Plamena Manolova	3ba16d640e	nir: Add global invocation id intrinsic. Add the missing nir intrinsic for the gl_GlobalInvocationID compute shader variable. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-07 14:53:12 +01:00
Eric Engestrom	61edad216e	travis: bump libwayland to the first version with libwayland-egl Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 11:10:11 +01:00
Kenneth Graunke	3ea2d791f3	i965: Require softpin support for Cannonlake and later. This isn't strictly necessary, but anyone running Cannonlake will already have Kernel 4.5 or later, so there's no reason to support the relocation model on Gen10+. This will let us avoid dealing with them for new features. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-06 19:45:09 -07:00
Kenneth Graunke	a363bb2cd0	i965: Allocate VMA in userspace for full-PPGTT systems. This patch enables soft-pinning of all buffers, allowing us to skip relocation processing entirely. All systems with full PPGTT and > 4GB of VMA should gain these benefits. This should be most Gen8+. Unfortunately, this excludes a few systems: - Cherryview (only has 32-bit addressing, despite 48-bit pointers) - Broadwell with a 32-bit kernel - Anybody running pre-4.5 kernel. We may enable it for Cherryview in the future, but it would require some tweaks to the memory zone. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-06 19:45:09 -07:00
Kenneth Graunke	74259b98aa	intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin. commit `92f01fc5f9` made i965 start emitting VF cache invalidates when the high bits of vertex buffers change. But we were not tracking vertex buffers emitted by BLORP. This was papered over by a mistake where I emitted VF cache invalidates all the time, which Chris fixed in commit `3ac5fbadfd`. This patch adds a new hook which allows the driver to track addresses and request a VF cache invalidate as appropriate. v2: Make the driver do the PIPE_CONTROL so it can apply workarounds (caught by Jason Ekstrand). Rebase on anv bug fix. v3: Don't screw up the boolean (caught by Jason Ekstrand). Fixes: `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-06 19:45:09 -07:00
Timothy Arceri	2a74296f24	nir: add opt_if_loop_terminator() This pass detects potential loop terminators and moves instructions from the non-breaking branch to after the if-statement. This enables both the new opt_if_simplification() pass and loop unrolling to potentially progress further. Unexpectedly, this change sped up shader-db run times by ~3% Ivy Bridge shader-db results (all changes in dolphin/ubershaders): total instructions in shared programs: 9995662 -> 9995338 (-0.00%) instructions in affected programs: 87845 -> 87521 (-0.37%) helped: 27 HURT: 0 total cycles in shared programs: 230931495 -> 230925015 (-0.00%) cycles in affected programs: 56391385 -> 56384905 (-0.01%) helped: 27 HURT: 0 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-07 11:33:04 +10:00
Timothy Arceri	1098bc5e85	nir: move ends_in_break() helper to nir_loop_analyze.h We will use the helper while simplifying potential loop terminators in the following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-07 11:33:04 +10:00
Timothy Arceri	186988e28f	radv: fix Coverity no effect control flow issue swizzle is unsigned so "desc->swizzle[c] < 0" is never true. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-07 10:10:57 +10:00
Jason Ekstrand	44c614843c	intel/blorp: Don't vertex fetch directly from clear values On gen8+, we have to VF cache flush whenever a vertex binding aliases a previous binding at the same index modulo 4GiB. We deal with this in Vulkan by ensuring that vertex buffers and the dynamic state (from which BLORP pulls its vertex buffers) are in the same 4GiB region of the address space. That doesn't work if we're reading clear colors with the VF unit. In order to work around this we switch to using MI commands to copy the clear value into the vertex buffer we allocate for the normal constant data. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-06 16:32:38 -07:00
Lionel Landwerlin	b28a2510cc	dri: add missing 16bits formats mapping i965 advertises the 16-bit R and RG formats through eglQueryDmaBufFormatsEXT but falls over when a client tries to use or asks more information about such a format because driImageFormatToGLFormat returns MESA_FORMAT_NONE. Found by Eero Tamminen. v2: Add G16R16 formats (Lionel) v3: Fix G16R16 mapping to mesa format (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642 Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> (v2) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-07 00:09:21 +01:00
Eric Anholt	833c404600	nir: Look into uniform structs for samplers when counting num_textures. mesa/st decides whether to update samplers after a program change based on whether num_textures is nonzero. By not counting samplers in a uniform struct, we would segfault in KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same context after a non-vertex-shader-uniform testcase (as is the case during a full conformance run). v2: Implement using two separate pure functions instead of updating pointers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-06 13:46:55 -07:00
Eric Anholt	f69473a712	v3d: Work around GFXH-1461/GFXH-1689 by using CLEAR_TILE_BUFFERS. This doesn't seem to have done anything to my test results. However, given that we've still got a class of GPU hangs, following the workarounds that the closed driver does so that we get the same command sequences seems like a good idea.	2018-06-06 13:46:55 -07:00
Eric Anholt	9d5860310d	v3d: Enable the new NIR bitfield operation lowering paths. These together get the GLSL 3.00 unorm/snorm pack functions and MESA_shader_integer operations working. v2: Fix commit message typo. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	73953b0713	nir: Add lowering for nir_op_bit_count. This is basically the same as the GLSL lowering path. v2: Fix typo in the link Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
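For readers unfamiliar with lowering a population count to plain ALU operations, the following is the widely used bit-parallel formulation. It is shown as a plausible shape for such a lowering; whether the NIR pass emits exactly this instruction sequence is an assumption, and the function name bit_count_lowered is hypothetical.

```c
#include <stdint.h>

/* Count set bits using only shifts, masks, adds and one multiply. */
static uint32_t bit_count_lowered(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit partial sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit partial sums */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums */
    return (x * 0x01010101u) >> 24;                   /* add the four bytes */
}
```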
Eric Anholt	7afa26d4e3	nir: Add lowering for nir_op_bitfield_reverse. This is basically the same as the GLSL lowering path. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	6e1597c2d9	nir: Add an ALU lowering pass for mul_high. This is based on the glsl/lower_instructions.cpp implementation, but should be much more readable. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	6a0db5f08f	nir: Add lowering for find_lsb. There is a fairly simple relation to turn this into ufind_msb. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
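One simple relation of the kind the entry above alludes to is that x & -x isolates the lowest set bit, after which the most-significant-bit search finds its position. The sketch below assumes that relation; the commit does not spell out which one it uses, and the loop-based ufind_msb here is only a stand-in for the hardware operation.

```c
#include <stdint.h>

/* Index of the highest set bit, or -1 when x is zero. */
static int ufind_msb(uint32_t x)
{
    int index = -1;
    while (x != 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Index of the lowest set bit, or -1 when x is zero.
 * (0u - x) is the two's complement negation, so x & (0u - x) keeps only
 * the lowest set bit of x. */
static int find_lsb(uint32_t x)
{
    return ufind_msb(x & (0u - x));
}
```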
Eric Anholt	d4c7c3c225	nir: Add lowering for ifind_msb to ufind_msb. ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb to that. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
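A minimal sketch of reducing the signed search to the unsigned one, assuming GLSL findMSB semantics (for negative inputs the result is the position of the most significant 0 bit, and both 0 and -1 give -1). The function names are illustrative, and ufind_msb is written as a loop here rather than in terms of clz.

```c
#include <stdint.h>

/* Index of the highest set bit, or -1 when x is zero. */
static int ufind_msb(uint32_t x)
{
    int index = -1;
    while (x != 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Signed variant: inverting a negative input turns its most significant
 * 0 bit into the most significant 1 bit, so ufind_msb can be reused. */
static int ifind_msb(int32_t x)
{
    uint32_t bits = (uint32_t)x;
    if (x < 0)
        bits = ~bits;
    return ufind_msb(bits);
}
```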
Eric Anholt	af88acf4c4	nir: Add lowering from ibitfield_extract/ubitfield_extract to shifts. V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to glsl/lower_instructions.cpp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
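The shift-based extraction can be sketched as below, assuming 32-bit values and offset + bits <= 32. The names are illustrative and the exact instruction sequence the pass emits is not confirmed by the entry above. The signed variant sign-extends with an xor/subtract step instead of an arithmetic right shift, because right-shifting a negative value is implementation-defined in C.

```c
#include <stdint.h>

/* Unsigned extract: shift the field up to the top, then back down. */
static uint32_t ubitfield_extract(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0)
        return 0;
    return (value << (32 - bits - offset)) >> (32 - bits);
}

/* Signed extract: take the unsigned field and sign-extend it from its
 * top bit. The final conversion assumes two's complement int32_t. */
static int32_t ibitfield_extract(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = ubitfield_extract(value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    return (int32_t)((field ^ sign) - sign);
}
```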
Eric Anholt	74618ccbca	nir: Add lowering for bitfieldInsert without using bfi. If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes things harder than keeping the "bits" argument around. This still uses bfm, but I've added the obvious lowering of bfm if you need it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
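A rough sketch of inserting a bitfield with a mask built from the "bits" and "offset" arguments, in the spirit of the bfm-based lowering named above. It assumes 32-bit values and offset + bits <= 32; the function names are hypothetical and the NIR pass may order the operations differently.

```c
#include <stdint.h>

/* bfm-style helper: `bits` ones starting at bit `offset`. */
static uint32_t bitfield_mask(unsigned bits, unsigned offset)
{
    uint32_t ones = bits >= 32 ? 0xffffffffu : (1u << bits) - 1u;
    return ones << offset;
}

/* Replace `bits` bits of `base`, starting at `offset`, with the low bits
 * of `insert`. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned bits)
{
    uint32_t mask = bitfield_mask(bits, offset);
    return (base & ~mask) | ((insert << offset) & mask);
}
```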
Eric Engestrom	735b104707	docs: add note about moving to libwayland-egl in 18.2.0 Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Cc: Andres Gomez <agomez@igalia.com> Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:12:03 -07:00
Eric Engestrom	b9361c9df0	egl: remove wayland-egl now that we're using libwayland-egl Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:12:01 -07:00
Eric Engestrom	1db4ec0546	egl: rewire the build systems to use libwayland-egl Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:11:57 -07:00
zhaowei yuan	67f7a16b59	glsl: Take 'double' as reserved after GLSL ES 1.0 GLSL ES 1.0.17 specifies that "double" is a keyword reserved Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823 Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-05 23:39:25 -07:00
Marek Olšák	17a42062cc	r300g/swtcl: make pipe_context uploaders use malloc'd memory as before Discovered by Roland Scheidegger. The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but malloc'd memory otherwise. Vertex and index buffers should use malloc'd memory. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-05 22:52:08 -04:00
Jason Ekstrand	01ad2067bb	intel/eu: Use a struct copy instead of a memcpy The memcpy had the wrong size and this was causing crashes on 32-bit builds of the driver. Fixes: `6a9525bf67` "intel/eu: Switch to a logical state stack" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106830 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-05 15:51:01 -07:00
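The advantage of a struct copy over memcpy, as in the entry above, is that the compiler derives the size from the type, so it cannot drift out of sync with the struct definition. A small illustration with a made-up state struct (insn_state and push_state are hypothetical names, not the intel/eu types):

```c
#include <stddef.h>

/* Hypothetical per-instruction default state. */
struct insn_state {
    unsigned exec_size;
    unsigned group;
    int predicate;
};

/* Push a copy of the current top of the stack. Plain assignment copies
 * exactly sizeof(struct insn_state) bytes; there is no separate size
 * argument to get wrong. The caller must ensure there is room for one
 * more entry. */
static void push_state(struct insn_state *stack, size_t *depth)
{
    stack[*depth + 1] = stack[*depth];
    (*depth)++;
}
```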
Philip Rebohle	cc21e96d5f	radv: Use correct color format for fast clears Using the image format is incorrect when the view has a different format than the image. Instead, the view format needs to be used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687	2018-06-05 23:51:03 +02:00
Eric Anholt	2b1b2cbf61	v3d: Be more explicit about include directory from our generated code. You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir. nir was fine at that, but automake didn't have it. Bugzilla: https://github.com/anholt/mesa/issues/104	2018-06-05 12:44:49 -07:00
Bas Nieuwenhuizen	2a10fd902d	radv: Do not hardcode fast clear formats. except for the odd one out. This should support many more formats. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 20:53:21 +02:00
Scott D Phillips	6fb22114a0	intel/tools: add intel_sanitize_gpu to EXTRA_DIST Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778 Fixes: `cc41603d6d` ("intel/tools: new intel_sanitize_gpu tool") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:32:35 -07:00
Scott D Phillips	08535dd886	util/tests/vma: Fix warning c++11-narrowing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106801 Fixes: `943fecc569` ("util: Add a randomized test for the virtual memory allocator") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:32:07 -07:00
Scott D Phillips	4b123fb74b	util: tests: vma test depends on C++11 support Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106776 Fixes: `943fecc569` ("util: Add a randomized test for the virtual memory allocator") Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:13:14 -07:00
Michel Dänzer	6b8f3724c8	glx: Fix number of property values to read in glXImportContextEXT We were trying to read twice as many as the X server sent us, which upset XCB: [xcb] Too much data requested from _XRead [xcb] This is most likely caused by a broken X extension library [xcb] Aborting, sorry about that. glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed. Fixing this takes 3 GLX piglit tests from crash to pass. Fixes: `0852162950` "glx: Be more tolerant in glXImportContext (v2)" Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-06-05 18:56:43 +02:00
Eric Engestrom	c765c39ea7	configure: radv depends on mako Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784 Fixes: `17201a2eb0` "radv: port to using updated anv entrypoint/extension generator." Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 16:32:48 +01:00
Eric Engestrom	5bdc38f356	travis: use correct form for array options I'd like to eventually drop support for the confusing "an array of a single empty string is meant to be interpreted as an empty array", so let's start by not using it anymore. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 16:31:23 +01:00
Lionel Landwerlin	9aedee64ac	anv: intel: add softpin flag on imported BOs Looks like we forgot to update this bit of the driver for softpin. Fixes: `4affeba1e9` ("anv: Soft-pin everything else") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-05 14:18:35 +01:00
Eric Engestrom	66c61797ad	autotools: add missing android file to package Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779 Fixes: `ff904978a1` "gallium/util: Android backtrace support" Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 10:39:04 +01:00
Eric Engestrom	7c4423cce9	meson: fix platforms check for `-D egl=true` Fixes: `0ed6a87a10` "meson: fix platforms=[]" Reported-by: Christoph Haag <haagch@frickel.club> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 10:38:57 +01:00
Mathias Fröhlich	1ac4439d62	mesa: Make sure that imm draws are flushed before other draws execute. The recent patch "mesa: Remove FLUSH_VERTICES from VAO state changes." ("Pending draw calls on immediate mode or display list calls do not depend on changes of the VAO state. So, remove calls to FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.") uncovered a problem: non immediate mode draw calls only flush outstanding immediate mode draws if FLUSH_UPDATE_CURRENT is set in ctx->Driver.NeedFlush. In that case, due to the sequence of _mesa_set_draw_vao commands we could end up with the VAO from the FLUSH_VERTICES call set into gl_context::Array._DrawVAO when the array draw is executed. So the change pulls FLUSH_CURRENT out of _mesa_validate_* calls into the array draw calls being validated. The change introduces a new macro FLUSH_FOR_DRAW beside FLUSH_VERTICES and FLUSH_CURRENT that flushes on changed current attributes as well as on outstanding immediate mode draw calls. Use FLUSH_FOR_DRAW in the non immediate mode draw code paths. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106594 Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-06-05 07:05:24 +02:00
gurchetansingh@chromium.org	a7b74a77fa	virgl: use bits in caps set v2 Let's add another field to caps v2, that can help report boolean values. Suggested-by: Gert Wollny <gert.wollny@collabora.com> Suggested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 14:29:00 +10:00
gurchetansingh@chromium.org	6ce94a50bb	virgl: add shader offset alignment to v2 caps struct This is the SSBO analogue to fe0647. User supplied data must be a multiple of GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT. This fixes 44 GLES31 tests on airlied@'s GLES31 sketch branches with Nvidia hardware, but this patch standalone can be applied to master. The alignment restriction on Nvidia is 32, hence the default value. Example tests: dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.0 dEQP-GLES31.functional.ssbo.layout.multi_basic_types.single_buffer.std430 v2: Move to a better place in case statement v3: Rebase Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 14:28:49 +10:00
Kenneth Graunke	1c9053d076	i965: Prepare batchbuffer module for softpin support. If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations. We simply want to add the BO to the validation list, and possibly mark it as writeable. The new brw_use_pinned_bo() interface does just that. To avoid having to make every caller consider both the relocation and softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given a softpinned buffer. We also can't grow buffers that are softpinned - the mechanism places a larger BO at the same offset as the original, which requires moving BOs around in the VMA. With softpin, we only allocate enough VMA for the original size of the BO. v2: Assert that BOs aren't pinned if the kernel says we should move them (feedback from Chris Wilson) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-04 18:38:41 -07:00
Kenneth Graunke	01058a5522	i965: Add virtual memory allocator infrastructure to brw_bufmgr. This introduces a new fast virtual memory allocator integrated with our BO cache bucketing. For larger objects, it falls back to the simple free-list allocator (util_vma). This puts the allocators in place but doesn't enable softpin yet. v2: (feedback from Chris Wilson) - Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag - Avoid vma_free(0ull) on the err_free path. - Only enable if the kernel says we have full PPGTT support - Make bucketing allocators more resistant to failing to grow arrays (feedback from Scott Phillips) - Don't use node after popping it from the list. - Avoid undefined behavior in canonicalization by reusing new helper - Comment updates (feedback from myself) - Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper to return a non-canonical address with the high bits zeroed. - Don't shadow loop variable 'i' when destroying things (ugly; worked) v3: - Replace zero_high_bits with new common gen_48b_address helper. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-04 18:38:41 -07:00
Jason Ekstrand	e99b32d4d6	i965: Disable internal CCS for shadows of multi-sampled windows If window system supports Y-tiling but not CCS_E, we currently create an internal CCS for any window system buffers and then resolve right before handing it off to X or Wayland. In the case of the single-sampled shadow of a multi-sampled window system buffer, this is pointless because the only thing we do with it is use it as a MSAA resolve target so we do MSAA resolve -> CCS resolve -> hand to the window system. Instead, just disable CCS for the shadow and then the MSAA resolve will write uncompressed directly into it. If the window system supports CCS_E, we will still use CCS_E, we just won't do internal CCS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 15:27:29 -07:00
Jason Ekstrand	6ab9fe7673	i965/miptree: Rename a parameter to create_for_dri_image Instead of having it be a general "is this a winsys image" boolean, make it more specific to the actual purpose. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 15:27:16 -07:00
Jason Ekstrand	6a9525bf67	intel/eu: Switch to a logical state stack Instead of the state stack that's based on copying a dummy instruction around, we start using a logical stack of brw_insn_states. This uses a bit less memory and is way less conceptually bogus. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	db9675f5a4	intel/eu: Set flag [sub]register number differently for 3src Prior to gen8, the flag [sub]register number is in a different spot on 3src instructions than on other instructions. Starting with Broadwell, they made it consistent. This commit fixes bugs that occur when a conditional modifier gets propagated into a 3src instruction such as a MAD. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	2d20303e18	intel/eu: Copy fields manually in brw_next_insn Instead of doing a memcpy, this moves us to start with a blank instruction (memset to zero) and copy the fields over one at a time. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	381fac2740	intel/eu: Add some brw_get_default_ helpers This is much cleaner than everything that wants a default value poking at the bits of p->current directly. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jose Fonseca	db38c3b4ba	trace: Fix parsing of recent traces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-04 21:06:31 +01:00
Jose Fonseca	8652ff7cdf	trace: Fix trace_context_transfer_unmap methods. The emitted buffer_subdata/texture_subdata call didn't match the respective signatures. v2: Actually emit buffer_subdata call. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-04 21:06:31 +01:00
Nicolai Hähnle	a9a7993441	amd/common: use the dimension-aware image intrinsics on LLVM 7+ Requires LLVM trunk r329166. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-06-04 21:34:59 +02:00
Kenneth Graunke	b3ba47c592	i965: Fix batch-last mode to properly swap BOs. On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move the validation list entry to the end...but incorrectly left the exec_bo array alone, causing a mismatch where exec_bos[0] no longer corresponded with validation_list[0] (and similarly for the last entry). One example of resulting breakage is that we'd update bo->gtt_offset based on the wrong buffer. This wreaked total havoc when trying to use softpin, and likely caused unnecessary relocations in the normal case. Fixes: `29ba502a4e` (i965: Use I915_EXEC_BATCH_FIRST when available.) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-04 09:43:09 -07:00
Samuel Pitoiset	06d3c65098	radv: fix a GPU hang when MRTs are sparse When the i-th target format is set, all previous target formats must be non-zero to avoid hangs. In other words, without this fix, if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs because the target format of mrt1 is zero. This fixes DXVK GPU hangs with "Seven: The Days Long Gone", "GTA V" and probably more games. Cc: "18.0" "18.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-04 14:01:33 +02:00
Bas Nieuwenhuizen	2835b6baf4	radv: Don't pass a TESS_EVAL shader when tesselation is not enabled. Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the GEOMETRY shader for the TESS_EVAL shader. This would cause the flush_constants code to emit the GEOMETRY constants to the TESS_EVAL registers and then conclude that it did not need to set the GEOMETRY shader registers. Fixes: `dfff9fb6f8` "radv: Handle GFX9 merged shaders in radv_flush_constants()" CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-04 13:46:24 +02:00
Samuel Pitoiset	e3e929f8c3	nir: implement the GLSL equivalent of if simplication in nir_opt_if This pass turns: if (cond) { } else { do_work(); } into: if (!cond) { do_work(); } else { } Here's the vkpipeline-db stats (from affected shaders) on Polaris10: Totals from affected shaders: SGPRS: 17272 -> 17296 (0.14 %) VGPRS: 18712 -> 18740 (0.15 %) Spilled SGPRs: 1179 -> 1142 (-3.14 %) Code Size: 1503364 -> 1515176 (0.79 %) bytes Max Waves: 916 -> 911 (-0.55 %) This pass only affects Serious Sam 2017 (Vulkan) on my side. The stats are not really good for now. Some shaders look quite dumb but this will be improved with further NIR passes, like ifs combination. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-04 12:41:10 +02:00
Samuel Pitoiset	e44f90eccf	nir: make is_comparison() a non-static helper function Rename and change the prototype for consistency regarding nir_tex_instr_is_query(). This function will be used in the following patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-04 12:41:08 +02:00
Dave Airlie	67eccd6aa2	nir: use num_components wrappers in print/validate. These wrappers were introduced, so start using them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-04 05:58:42 +10:00
Juan A. Suarez Romero	bad7332f7c	doc: update calendar, add news and link release notes for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-06-03 10:19:32 +00:00
Juan A. Suarez Romero	41c01d79ee	docs: add sha256 checksums for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `aba161e63a`)	2018-06-03 10:12:02 +00:00
Juan A. Suarez Romero	a89cb6711b	docs: add release notes for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ca0037aaef`)	2018-06-03 10:12:00 +00:00
Jose Fonseca	8841c2cda5	scons: Fix MinGW cross compilation with LLVM 5.0. LLVM 5.0 requires additional Win32 libraries, and MinGW with pthreads. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-02 09:58:50 +01:00
Jason Ekstrand	64e619674e	anv: Don't even bother processing relocs if we have softpin Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:34:26 -07:00
Jason Ekstrand	c7be17c8d3	anv: Refactor reloc handling in execbuf_add_bo This just separates the reloc list vs. BO set cases and lets us avoid an allocation if relocs->deps->entries == 0. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:34:25 -07:00
Jason Ekstrand	7105b7890a	anv: Assert that the kernel leaves pinned BO addresses alone Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:33:07 -07:00
Scott D Phillips	4affeba1e9	anv: Soft-pin everything else v2 (Jason Ekstrand): - Break up Scott's mega-patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:13 -07:00
Scott D Phillips	f3dbe0419d	anv: Soft-pin batch buffers Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:12 -07:00
Jason Ekstrand	a0b133286a	anv/batch_chain: Simplify secondary batch return chaining Previously, we did this weird thing where we left space and an empty relocation for use in a hypothetical MI_BATCH_BUFFER_START that would be added to the secondary later. Then, when it came time to chain it into the primary, we would back that out and emit an MI_BATCH_BUFFER_START. This worked well but it was always a bit hacky, fragile and ugly. This commit instead adds a helper for rewriting the MI_BATCH_BUFFER_START at the end of an anv_batch_bo and we use that helper for both batch bo list cloning and handling returns from secondaries. The new helper doesn't actually modify the batch in any way but instead just adjusts the relocation as needed. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:12 -07:00
Jason Ekstrand	4f20c665b4	anv/batch_chain: Call batch_bo_finish at the end of end_batch_buffer The only reason we were calling it in the middle was that one of the cases for figuring out the secondary command buffer execution type wanted batch_bo->length which gets set by batch_bo_finish. It's easy enough to recalculate and now batch_bo_finish is called in a sensible location. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	e7d0378bd9	anv: Soft-pin client-allocated memory Now that we've done all that refactoring, addresses are now being directly written into surface states by ISL and BLORP whenever a BO is pinned so there's really nothing to do besides enable it. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	caf41c78ca	anv/allocator: Support softpin in the BO cache Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	b0d50247a7	anv/allocator: Set the BO flags in bo_cache_alloc/import It's safer to set them there because we have the opportunity to properly handle combining flags if a BO is imported more than once. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:10 -07:00
Scott D Phillips	27cc68d9e9	anv: For pinned BOs, skip relocations, but track bo usage References to pinned BOs won't need to be relocated at a later point, so just write the final value of the reference into the bo directly. Add a `set` to the relocation lists for tracking dependencies that were previously tracked by relocations. When a batch is executed, we add the referenced pinned BOs to the exec list. v2: - visit bos from the dependency set in a deterministic order (Jason) v3: - compar => compare, drat (Jason) - Reworded commit message, provided by (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-01 14:27:10 -07:00
Scott D Phillips	c7db0ed4e9	anv: Use a separate pool for binding tables when soft pinning Soft pinning lets us satisfy the binding table address requirements without using both sides of a growing state_pool. If you do use both sides of a state pool, then you need to read the state pool's center_bo_offset (with the device mutex held) to know the final offset of relocations that target the state pool bo. By having a separate pool for binding tables that only grows in the forward direction, the center_bo_offset is always 0 and relocations don't need an update pass to adjust relocations with the mutex held. v2: - don't introduce a separate state flag for separate binding tables (Jason) - replace bo and map accessors with a single binding_table_pool accessor (Jason) v3: - assert bt_block->offset >= 0 for the separate binding table (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-01 14:27:10 -07:00
Scott D Phillips	e662bdb820	anv: Soft-pin state pools The state_pools reserve virtual address space of the full BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of growing from the middle. v2: - rename block_pool::offset to block_pool::start_address (Jason) - assign state pool start_address statically (Jason) v3: - remove unnecessary bo_flags tampering for the dynamic pool (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-01 13:49:22 -07:00
Ian Romanick	f00fcfb7a2	nir: Lower !f2b(x) to x == 0.0 Some trivial help now, but it also prevents ~40 regressions caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch. All Gen4+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14369557 -> 14369555 (<.01%) instructions in affected programs: 442 -> 440 (-0.45%) helped: 2 HURT: 0 total cycles in shared programs: 532425772 -> 532425743 (<.01%) cycles in affected programs: 6086 -> 6057 (-0.48%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-01 10:14:53 -07:00
Ian Romanick	619c51722b	nir: Add some missing "optimization undo" patterns `d8d18516b0` and `03fb13f646` added some patterns to undo conversions like (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c))) If further optimizations cause some of the operands to either be the same or be constants, undoing the transformation can lead to further savings. I don't know why these patterns were not added in those patches. I did not check to see which specific patterns actually helped. I just added all of them for symmetry. This prevents some loop unrolling regressions in Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch. Skylake and Broadwell had similar results. (Skylake shown) total instructions in shared programs: 14369768 -> 14369557 (<.01%) instructions in affected programs: 44076 -> 43865 (-0.48%) helped: 141 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1 helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60% 95% mean confidence interval for instructions value: -1.67 -1.32 95% mean confidence interval for instructions %-change: -0.72% -0.59% Instructions are helped. total cycles in shared programs: 532430629 -> 532425772 (<.01%) cycles in affected programs: 1170832 -> 1165975 (-0.41%) helped: 101 HURT: 5 helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32 helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03% HURT stats (abs) min: 2 max: 22 x̄: 9.20 x̃: 4 HURT stats (rel) min: <.01% max: 0.05% x̄: 0.02% x̃: <.01% 95% mean confidence interval for cycles value: -53.64 -38.00 95% mean confidence interval for cycles %-change: -3.06% -2.20% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-01 10:13:16 -07:00
Eric Engestrom	57fbc2ac50	docs/meson: mention how to use array options Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	03a2e7b662	meson: drop unused empty string array element Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	0ed6a87a10	meson: fix platforms=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	a92cdcd598	meson: fix vulkan-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	a425db4d7d	meson: fix gallium-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	393abd6a57	meson: fix dri-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	8faa22c146	REVIEWERS: add root meson.build to the Meson reviewers group Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Juan A. Suarez Romero	cbe4baed1f	glsl: Add ir_binop_vector_extract in NIR Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V to NIR approach. This fixes: dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment Piglit's glsl-fs-vec4-indexing-8.shader_test CC: mesa-stable@lists.freedesktop.org Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral <itoral@igalia.com>	2018-06-01 18:09:22 +02:00
Dylan Baker	4ad8e2ac82	doc: update calendar, add news and link release notes for 18.1.1	2018-06-01 08:39:17 -07:00
Dylan Baker	55ee53ea19	docs/relnotes: Add sha256 sums for mesa 18.1.1	2018-06-01 08:39:17 -07:00
Dylan Baker	423c4fe954	docs: Add release notes for 18.1.1	2018-06-01 08:39:17 -07:00
Plamena Manolova	939312702e	i965: Add ARB_fragment_shader_interlock support. Adds support for ARB_fragment_shader_interlock. We achieve the interlock and fragment ordering by issuing a memory fence via sendc. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-06-01 16:36:39 +01:00
Plamena Manolova	60e843c4d5	mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock. This extension provides new GLSL built-in functions beginInvocationInterlockARB() and endInvocationInterlockARB() that delimit a critical section of fragment shader code. For pairs of shader invocations with "overlapping" coverage in a given pixel, the OpenGL implementation will guarantee that the critical section of the fragment shader will be executed for only one fragment at a time. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-06-01 16:36:36 +01:00
Martin Pelikán	53719f818c	compiler/spirv: reject invalid shader code properly After `bebe3d626e`, b->fail_jump is prepared after vtn_create_builder which can longjmp(3) to it through its vtx_assert()s. This corrupts the stack and creates confusing core dumps, so we need to avoid it. While there, I decided to print the offending values for debuggability. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-01 08:09:35 -07:00
Juan A. Suarez Romero	360bfb619f	docs: change release manager for 18.1 Dylan will replace Emil as the release manager for 18.1.x series. CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-01 15:24:02 +02:00
Gert Wollny	ef3a6e3d98	virgl: Always assume that ORIGIN_UPPER_LEFT and PIXEL_CENTER* are supported The driver must support at least one of PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT and one of PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER otherwise glsl_to_tgsi will fire an assert. ORIGIN_UPPER_LEFT is the default convention, and is supported by all mesa drivers, hence it seems reasonable to always report the caps to be enabled. On gles ORIGIN_LOWER_LEFT is generally not supported, so we rely on the caps reported by the host that depend on whether we run on a GL or an EGL host. For PIXEL_CENTER it is completely host driver dependent what is supported, and since we do not report the actual host driver capabilities it is best to mark both as supported, this is how it works for a GL host too. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_xyz dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1 dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2 Reviewed-by: Gurchetan Singh <gurcetansingh@chromium.org> Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-06-01 12:04:21 +01:00
Alex Smith	01a2414045	radeonsi: Fix crash on shaders using MSAA image load/store The value returned by tgsi_util_get_texture_coord_dim() does not account for the sample index. This means image_fetch_coords() will not fetch it, leading to a null deref in ac_build_image_opcode() which expects it to be present (the return value of ac_num_coords() does include the sample index). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-01 08:53:38 +01:00
Alex Smith	dfff9fb6f8	radv: Handle GFX9 merged shaders in radv_flush_constants() This was not previously handled correctly. For example, push_constant_stages might only contain MESA_SHADER_VERTEX because only that stage was changed by CmdPushConstants or CmdBindDescriptorSets. In that case, if vertex has been merged with tess control, then the push constant address wouldn't be updated since pipeline->shaders[MESA_SHADER_VERTEX] would be NULL. Use radv_get_shader() instead of getting the shader directly so that we get the right shader if merged. Also, skip emitting the address redundantly - if two merged stages are set in push_constant_stages this change would have made the address get emitted twice. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:34 +01:00
Alex Smith	7ca0167ae9	radv: Consolidate GFX9 merged shader lookup logic This was being handled in a few different places, consolidate it into a single radv_get_shader() function. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:31 +01:00
Alex Smith	0fa51bfdbe	radv: Set active_stages the same whether or not shaders were cached With GFX9 merged shaders, active_stages would be set to the original stages specified if shaders were not cached, but to the stages still present after merging if they were. Be consistent and use the original stages. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:01 +01:00
Marek Olšák	9e61147ef6	st/mesa: relax requirements for ARB_ES3_compatibility Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-01 01:04:17 -04:00
Scott D Phillips	29a139b308	anv/blorp: Write relocated values into surface states v2 (Jason Ekstrand): - Split the blorp bit into its own patch and re-order a bit - Use anv_address helpers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-31 16:51:47 -07:00
Jason Ekstrand	bf34ef16ac	anv: Use an address for each anv_image plane This is better than having BO and offset fields. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	1f2328c3b7	anv/cmd_buffer: Rework surface relocation helpers This commit renames add_surface_state_reloc to add_surface_reloc and makes it take an address. We also rename add_image_view_relocs to add_surface_state_relocs because it takes an anv_surface_state and doesn't really care about the image view anymore. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	f270a09737	anv: Use an anv_address in anv_buffer Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	8a8bd39d5e	anv/cmd_buffer: Use anv_address for handling indirect parameters Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	1029458ee3	anv: Use an anv_address in anv_buffer_view Instead of storing a BO and offset separately, use an anv_address. This changes anv_fill_buffer_surface_state to use anv_address and we now call anv_address_physical and pass that into ISL. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	de1c5c1b50	anv: Use full anv_addresses in anv_surface_state This refactors surface state filling to work entirely in terms of anv_addresses instead of offsets. This should make things simpler for when we go to soft-pin image buffers. Among other things, add_image_view_relocs now only cares about the addresses in the surface state and doesn't really need the image view anymore. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	94081ffc80	anv: Add some anv_address helpers Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Scott D Phillips	aaea46242d	anv: Add vma_heap allocators in anv_device These will be used to assign virtual addresses to soft pinned buffers in a later patch. Two allocators are added for separate 'low' and 'high' virtual memory areas. Another alternative would have been to add a double-sided allocator, which wasn't done here just because it didn't appear to give any code complexity advantages. v2 (Scott Phillips): - rename has_exec_softpin to use_softpin (Jason) - Only remove bottom one page and top 4 GiB from virt (Jason) - refer to comment in anv_allocator about state address + size overflowing 48 bits (Jason) - Mention hi/lo allocators vs double-sided allocator in commit message (Chris) - assign state pool memory ranges statically (Jason) v3 (Jason Ekstrand): - Use (LOW\|HIGH)_HEAP_(MIN\|MAX)_ADDRESS rather than (1 << 31) for determining which heap to use in anv_vma_free - Only return de-canonicalized addresses to the heap Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	6e4672f881	intel/common: Add an address de-canonicalization helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:45 -07:00
Scott D Phillips	943fecc569	util: Add a randomized test for the virtual memory allocator The test pseudo-randomly makes allocations and deallocations with the virtual memory allocator and checks that the results are consistent. Specifically, we test that: * no result from the allocator overlaps an already allocated range * allocated memory fulfills the stated alignment requirement * a failed result from the allocator could not have been fulfilled * memory freed to the allocator can later be allocated again v2: - fix if() in test() to actually run fill() v3: - add c++11 build flag (Jason) - test the full 64-bit range (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-31 16:51:35 -07:00
Jason Ekstrand	f19ad5d31f	util: Add a virtual memory allocator This is a simple linear-walk first-fit allocator roughly based on the allocator in the radeon winsys code. This allocator has two primary functional differences: 1) It cleanly returns 0 on allocation failure 2) It allocates addresses top-down instead of bottom-up. The second one is needed for Intel because high addresses (with bit 47 set) need to be canonicalized in order to work properly. If we allocate bottom-up, then high addresses will be very rare (if they ever happen). We'd rather always have high addresses so that the canonicalization code gets better testing. v2: - [scott-ph] remove _heap_validate() if NDEBUG is defined (Jordan) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Tested-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-31 16:17:35 -07:00
Bas Nieuwenhuizen	b9fb2c266a	radv: Add startup debug option. This adds a RADV_DEBUG=startup option to dump more info about instance creation and device enumeration. A common question end users have is why the driver is not loading for them, and this has two common reasons: 1) They did not install the driver. 2) AMDGPU is not used for the card in the kernel. This adds some info messages so we can easily get some useful output from end users. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	38933c1151	radv: Add option to print errors even in optimized builds. Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check. In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	729f7373de	radv: Make the sem_info allocate/free functions static. They are only used in 1 file. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Samuel Pitoiset	70f9e2589e	nir: optimize iand(ieq(a, 0), ieq(b, 0)) to ieq(ior(a, b), 0) Totals from affected shaders: SGPRS: 80 -> 80 (0.00 %) VGPRS: 48 -> 48 (0.00 %) Code Size: 2120 -> 2096 (-1.13 %) bytes Max Waves: 16 -> 16 (0.00 %) Only two Rise of Tomb Raider shaders are affected on my side. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-31 10:57:16 +02:00
Tapani Pälli	c983c6abaf	mesa: don't call Driver.TexEnv with invalid arguments Patch skips useless and possibly dangerous calls down to the driver in case invalid arguments were given. I noticed this would be happening with demo of Darwinia game. AFAIK this does not fix anything but makes this path safer and more like how other API functions are implemented. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-31 09:24:17 +03:00
Vinson Lee	d511bba2f9	v3d: Fix automake linking error. CXXLD gallium_dri.la ../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_dump_packet': src/broadcom/clif/clif_dump.c:87: undefined reference to `v3d33_clif_dump_packet' src/broadcom/clif/clif_dump.c:85: undefined reference to `v3d41_clif_dump_packet' ../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_process_worklist': src/broadcom/clif/clif_dump.c:140: undefined reference to `v3d41_clif_dump_gl_shader_state_record' src/broadcom/clif/clif_dump.c:144: undefined reference to `v3d33_clif_dump_gl_shader_state_record' Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-30 11:55:09 -07:00
Jakob Bornecrantz	d6cee5a162	virgl: Update virgl_hw.h Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:07:26 +01:00
Dave Airlie	e2b6d830b2	virgl: add ARB_transform_feedback_overflow_query support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:55 +01:00
Dave Airlie	22b072c194	virgl: add polygon offset clamp Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:51 +01:00
Dave Airlie	49204ff8ad	virgl: add derivative control support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:47 +01:00
Dave Airlie	46fe349af2	virgl: add ARB_conditional_render_inverted support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:40 +01:00
Dave Airlie	f9eb7e8b76	virgl: update caps bitset to latest version. This makes this use all 32 bits, so future sets need to be defined in a new struct. Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:19 +01:00
Timothy Arceri	e8b368ad1c	nir: add unsigned comparison simplifications This avoids loop unrolling regressions in Wolfenstein II on DXVK with an upcoming optimisation series from Samuel. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-30 22:48:37 +10:00
Bas Nieuwenhuizen	c2799574eb	radv: Only expose subgroup shuffles on VI+. The current implementation depends on bpermute, which is VI+. Fixes: `f2c6a55061` "radv: enable subgroup capabilities" Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-30 13:49:46 +02:00
Samuel Pitoiset	02c7916298	radv: fix emitting descriptor pointers with LLVM < 7 This was terribly wrong, I forced use of 32-bit pointers when emitting shader descriptor pointers. This fixes GPU hangs with LLVM 5&6 because 32-bit pointers are only supported with LLVM 7. Fixes: `88d1ed0f81` ("radv: emit shader descriptor pointers consecutively") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-30 11:38:54 +02:00
Ilia Mirkin	04fff21c62	nv30: add a couple of missed shader caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-30 02:06:28 -04:00
Ilia Mirkin	30918b77ac	nv30: ensure that displayable formats are marked accordingly Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets") Cc: "18.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-30 02:06:28 -04:00
Marek Olšák	858ac8942d	mesa: expose ARB_tessellation_shader in the compatibility profile Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	16ac832392	mesa: expose AMD_vertex_shader_layer in the compatibility profile This requires layered FBOs from GL 3.2. Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	518d8065ce	mesa: expose ARB_gpu_shader5 in the compatibility profile Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	dd93bc4f34	st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	34ea55d820	gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	e453fc76e7	mesa: update fixed-func state constants for TCS, TES, GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	27a9f27310	mesa: print Compatibility Profile in the version string Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	d3a87537dd	glsl: parse #version XXX compatibility Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	a7d0c53ab8	st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2) Bindless texture handles can be passed via vertex attribs using this type. They use the double codepath, so don't use st_pipe_vertex_format. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 20:09:00 -04:00
Marek Olšák	a8e1413876	mesa: handle GL_UNSIGNED_INT64_ARB properly (v2) Bindless texture handles can be passed via vertex attribs using this type. This fixes a bunch of bindless piglit tests on radeonsi. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 20:09:00 -04:00
Timothy Arceri	1f7a3a1102	mesa: add display list support for glPatchParameter{i,fv}() This is required for tessellation shader Compat profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:37:35 +10:00
Dave Airlie	d3ff478732	glx/drisw: make the shm/non-shm loader extensions separately. I disliked removing the const here, function tables are meant to be const just to avoid having to think about them, make a second table for the shm vs non-shm paths to use. Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	33ce3aa512	drisw/glx: implement getImageShm Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	17b27725fe	drisw: use getImageShm() if available Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	9feaf33371	drisw: learn to query shmid handle type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	bcd80be49a	drisw/glx: use XShm if possible Implements putImageShm from DRIswrastLoaderExtension. If XShm extension is not available, or fails, it will fall back on regular XPutImage(). Tested on Linux only with 16bpp and 32bpp visual. (airlied: tested on 24bpp as well) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	cf54bd5e83	drisw: use shared memory when possible If drisw_loader_funcs implements put_image_shm, allocates display target data with shared memory and display with put_image_shm(). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	63c427fa71	drisw: use putImageShm if available If the DRIswrastLoaderExtension implements putImageShm, bind it to drisw_loader_funcs. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:53 +10:00
Marc-André Lureau	de8085e649	dri: add putImageShm and getImageShm to swrastLoader Add new API to put and get an image using shared memory. Instead of only passing the data pointer, 3 arguments are given: the shmid, the data offset and the shmaddr. Bump interface version. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:53 +10:00
Dave Airlie	b7ac0779e0	gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_* This just renames this as we want to add an shm handle which isn't really drm related. Originally by: Marc-André Lureau <marcandre.lureau@gmail.com> (airlied: I used this sed script instead) This was generated with: git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g' Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Marc-André Lureau	d2eaff33d0	gallium: move winsys handle to its own file. This will be used in the drisw interface later, which isn't drm specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Francisco Jerez	4bd2047dee	intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot. When using multiple RT write messages to the same RT such as for dual-source blending or all RT writes in SIMD32, we have to set the "Last Render Target Select" bit on all write messages that target the last RT but only set EOT on the last RT write in the shader. Special-casing for dual-source blend works today because that is the only case which requires multiple RT write messages per RT. When we start doing SIMD32, this will become much more common so we add a dedicated bit for it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	d3cd6b7215	intel/fs: Replace the CINTERP opcode with a simple MOV The only reason it was its own opcode was so that we could detect it and adjust the source register based on the payload setup. Now that we're using the ATTR file for FS inputs, there's no point in having a magic opcode for this. v2 (Jason Ekstrand): - Break the bit which removes the CINTERP opcode into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	39de901a96	intel/fs: Use the ATTR file for FS inputs This replaces the special magic opcodes which implicitly read inputs with explicit use of the ATTR file. v2 (Jason Ekstrand): - Break into multiple patches - Change the units of the FS ATTR to be in logical scalars Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	4bfa2ac2ea	intel/fs: Rename a local variable so it doesn't shadow component() v2 (Jason Ekstrand): - Break the refactor into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	11c71f0e75	intel/eu: Remove brw_codegen::compressed_stack. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	71a86d1fc6	intel/fs: Use groups for SIMD16 LINTERP on gen11+ This is better than compression control because it naturally extends to SIMD32. v2: - Push/pop instruction state around adjusted codegen (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	a1a850cd34	intel/fs: Assert that the gen4-6 plane restrictions are followed The fall-back does not work correctly in SIMD16 mode and the register allocator should ensure that we never hit this case anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jan Vesely	ed834aefa2	travis: Add clover llvm-6.0 build v2: Don't force build using gcc-4.8 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com>	2018-05-29 17:36:16 -04:00
Jan Vesely	41b878e1bd	clover: Cleanup compat code for llvm < 3.9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com>	2018-05-29 17:36:16 -04:00
Jan Vesely	d424be0fed	clover: Fix build after llvm r332881. v2: fix whitespace and indentation r332881 added an extra parameter to the emit function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106619 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2018-05-29 17:36:16 -04:00
Chris Wilson	3ac5fbadfd	i965: Only emit VF cache invalidations when the high bits change Commit `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") tried to only emit the VF invalidate if the high bits changed, but it accidentally always set need_invalidate to true, causing it to unconditionally emit the pipe control before every primitive. Fixes: `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-29 12:16:26 -07:00
Eric Engestrom	e4fe2fd3bb	vulkan: don't free uninitialised memory The modifiers array hasn't been initialised by then, much less with data that would need freeing. Move the label after the loop to fix this. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	51a17e7fee	dri: replace two-way switch case with a table lookup Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> --- v2: rebased on top of `432df741e0` "dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format."	2018-05-29 17:44:13 +01:00
Eric Engestrom	d3ca7bd452	dri: fix error value returned by driGLFormatToImageFormat() 0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum. It is, however, the value of MESA_FORMAT_NONE, which two of the callers (i915 & i965) checked for. The other callers (that check for errors, ie. st/dri) already check for __DRI_IMAGE_FORMAT_NONE. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	1945231b48	egl/x11: fix build with DRI3 disabled Fixes: `473af0b541` "egl/x11: deduplicate depth-to-format logic" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>	2018-05-29 17:01:21 +01:00
Emil Velikov	63b95fb291	meson: require shared glapi when using DRI based libGL Just like we do in the autotools build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 16:56:19 +01:00
Emil Velikov	728d1da159	meson: remove unreachable with_glx == 'auto' check Cannot happen, thanks to the autodetection further up. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 16:31:46 +01:00
Thierry Reding	9e539012df	tegra: Treat resources with modifiers as scanout Resources created with modifiers are treated as scanout because there is no way for applications to specify the usage (though that capability may be useful to have in the future). Currently all the resources created by applications with modifiers are for scanout, so make sure they have bind flags set accordingly. This is necessary in order to properly export buffers for such resources so that they can be shared with scanout hardware. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:37 +02:00
Thierry Reding	9603d81df0	tegra: Fix scanout resources without modifiers Resources created for scanout but without modifiers need to be treated as pitch-linear. This is because applications that don't use modifiers to create resources must be assumed to not understand modifiers and in turn won't be able to create a DRM framebuffer and pass along which modifiers were picked by the implementation. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:34 +02:00
Thierry Reding	bd3e97e5aa	tegra: Remove usage of non-stable UAPI This code path is no longer required with framebuffer modifier support. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:47:45 +02:00
Eric Engestrom	f736be86bb	docs: add favicon to the website favicon.png is just gears.png resized to 64x64, and favicon.ico is generated using this command, adapted from the ImageMagick example [1]: $ convert favicon.png -background black \( -clone 0 -resize 16x16 \) \( -clone 0 -resize 32x32 \) \( -clone 0 -resize 48x48 \) \( -clone 0 -resize 64x64 \) -delete 0 -alpha off -colors 256 favicon.ico We could edit every html page to add `<link rel="icon" href="favicon.ico" />`, but there's not much point as pretty much every browser will pick it up automatically if the file is named `favicon.ico` and is in the root folder. [1] http://www.imagemagick.org/Usage/thumbnails/#favicon Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Eric Engestrom	e6a1aca0b2	docs: add missing html closing tag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Eric Engestrom	3b5376330f	docs: add missing html tag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Karol Herbst	56792a0876	nir/print: fix printing of 8/16 bit constant variables v2 (Jose Maria Casanova Crespo <jmcasanova@igalia.com>): add float16 support Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-29 13:43:49 +02:00
Pierre Moreau	f0e80e123c	nv50/ir: Extend ImmediateValue::applyLog2 to 64-bit integers Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Pierre Moreau	03f592a164	util/u_math: Implement a logbase2 function for unsigned long v2 (Karol Herbst <kherbst@redhat.com>): * removed unneeded ll * ll -> ull Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Eric Engestrom	539aa604a0	docs: trivial typo fix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 12:10:14 +01:00
Samuel Pitoiset	88d1ed0f81	radv: emit shader descriptor pointers consecutively This reduces the number of SET_SH_REG packets which are emitted for applications that use more than one descriptor set per stage. We should be able to emit more SET_SH_REG packets consecutively (like push constants and vertex buffers for the vertex stage), but this will be improved later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:18 +02:00
Samuel Pitoiset	21baf33a94	radv: allow radv_emit_shader_pointer_head() to emit more pointers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:16 +02:00
Samuel Pitoiset	288fe7ec71	radv: split radv_emit_shader_pointer() This will allow to emit consecutive shader pointers for reducing the number of emitted SET_SH_REG packets, which is recommended. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:13 +02:00
Rhys Perry	57e721a456	gm107/ir: prevent WaW hazards in instruction scheduling Previously, findFirstUse() only considered reads "uses". This fixes that by making it check both an instruction's sources and definitions. It also shortens both findFirstUse() and findFirstDef() along the way. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-28 13:59:56 -04:00
Bas Nieuwenhuizen	a29bc043ae	radv: Implement VK_KHR_draw_indirect_count. Literally the same as the AMD ext. Passes indirect_draw_count CTS tests. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:08:26 +02:00
Bas Nieuwenhuizen	b0002e4e05	vulkan: Update header+vk.xml to 1.1.76 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 12:08:20 +02:00
Bas Nieuwenhuizen	6914d5a2c0	radv: Implement alternate GFX9 scissor workaround. This improves dota2 performance for me by 11% when I force the GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my threadripper), which should be a large part of the radv-amdvlk gap. (For me that was radv 60.3 -> 66.6 fps, while AMDVLK does about 68 fps.) It looks like dota2 rendered the GUI with a bunch of draws with a SetScissors before almost each draw, causing a lot of pipeline stalls. I'm not really happy with the duplication of code, but overriding radeon_set_context_reg would also be messy since we have the pre-recorded pipelines and a bunch of si_cmd_buffer code, as well as some memory->context reg loads for which things would be more complicated. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:04:25 +02:00
Eric Anholt	3b6dfcf7ae	Revert "st/nir: use NIR for asm programs" This reverts commit `5c33e8c772`. It broke fixed function vertex programs on vc4 and v3d, and apparently caused trouble for radeonsi's NIR paths as well. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> https://bugs.freedesktop.org/show_bug.cgi?id=106673	2018-05-28 14:41:03 +10:00
Scott D Phillips	4714784dae	anv: move canonical_address calculation into a separate function A later patch will make use of this in other places. Also, remove dependency on undefined behavior of left-shifting a signed value. v2: - move function into a separate header (Chris) v3: (by Ken) Add new header to the various build systems. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-27 19:24:33 -07:00
Gert Wollny	1aec4a07d4	r600: Fix SSG when not all components are written Make sure only those components are written to that are specified in the write mask. Fixes: dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Gert Wollny	42cd2810aa	r600: Correct IDIV if DST and SRC use the same temporary In cases like IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy the result will be written to the same register that is also a source register. Since the components are evaluated one by one, this may result in overwriting the source value for a later operation. Work around this by adding another temporary to store the result if the destination temporary index is equal to one of the source temporary indices. Fixes: dEQP-GLES2.functional.shaders.operator.binary_operator.div.* Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Kenneth Graunke	58fb613a51	i965: Revert recent tiled memcpy changes. This reverts commit `79fe00efb4`. This reverts commit `f5e8b13f78`. This reverts commit `d21c086d81`. They broke the Android build and I'd rather not leave it broken for the long holiday weekend.	2018-05-26 16:25:50 -07:00
Scott D Phillips	79fe00efb4	i965/miptree: Use cpu tiling/detiling when mapping Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit `b499b85b0f`. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Chris Wilson	f5e8b13f78	i915: Fix streaming loads for intel_tiled_memcpy We stream from a tiled and aligned source into an unaligned user buffer, so we need to use _mm_storeu_si128. Fixes: `d21c086d81` (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Marek Olšák	18c50498db	radeonsi: remove unused variable addr_vec trivial	2018-05-25 18:37:57 -04:00
Jason Ekstrand	ae514ca695	intel/blorp: Support blits and clears on surfaces with offsets For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 14:01:44 -07:00
Marek Olšák	2f65c67043	radeonsi: fix passing gl_ClipVertex for GS and tess Also add the fprintf call. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a7d61c0753	radeonsi: fix color inputs/outputs for GS and tess GS is tested, tessellation is untested. Have outputs_written_before_ps for HW VS and outputs_written for other stages. The reason is that COLOR and BCOLOR alias for HW VS, which drives elimination of VS outputs based on PS inputs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	92ea9329e5	radeonsi: fix incorrect parentheses around VS-PS varying elimination I don't know if it caused issues. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a4ba7cd6a2	st/mesa: simplify lastLevel determination in st_finalize_texture This fixes shader images where we always bind stObj->pt and not individual gl_texture_images. Roughly based on i965 commit `845ad2667a` which does a similar thing but for a different reason. This fixes GL CTS assertion failures introduced by Ilia. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:31:36 -04:00
Scott D Phillips	d21c086d81	i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 11:05:46 -07:00
Alok Hota	fb20ae0374	swr/rast: Adjusted avx512 primitive assembly for msvc codegen Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by about 4x, MSVC compiler was going crazy making temporaries and split-loading inputs onto the stack unless explicit AVX-512 load ops were added Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:57:02 -05:00
Alok Hota	b3360f5c8b	swr/rast: Moved memory init out of core swr init Added two new files for a wrapper function for initialization v2: added missing include for single architecture builds Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:55 -05:00
Alok Hota	b6b114c1ae	swr/rast: Removed superfluous JitManager argument from passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:49 -05:00
Alok Hota	98d0201577	swr/rast: Renamed MetaData calls Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:43 -05:00
Alok Hota	14b5cac0be	swr/rast: Use metadata to communicate between passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:37 -05:00
Alok Hota	f09636e2e1	swr/rast: Check gCoreBuckets/CORE_BUCKETS equal length at compile time Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:01 -05:00
Alok Hota	cfe75cc7b5	swr/rast: Added in-place building to SCATTERPS SCATTERPS previously assumed it was being used with an existing basic block Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:55:37 -05:00
Samuel Pitoiset	45eb24fedf	radv: run the EarlyCSEMemSSA LLVM pass It's recommended by the instruction combining pass, and RadeonSI also runs it. This pass used to segfault with one shader of F12017 in the past, but it no longer crashes. Maybe the LLVM IR generated by RADV has changed. Polaris10: Totals from affected shaders: SGPRS: 441352 -> 441648 (0.07 %) VGPRS: 310888 -> 300784 (-3.25 %) Spilled SGPRs: 13576 -> 12983 (-4.37 %) Code Size: 22560328 -> 22420544 (-0.62 %) bytes Max Waves: 40755 -> 41366 (1.50 %) Vega10: Totals from affected shaders: SGPRS: 442848 -> 442000 (-0.19 %) VGPRS: 310396 -> 300460 (-3.20 %) Spilled SGPRs: 13708 -> 12906 (-5.85 %) Code Size: 22479428 -> 22336216 (-0.64 %) bytes Max Waves: 45783 -> 46506 (1.58 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 14:24:14 +02:00
Samuel Pitoiset	66e38654c9	radv: fix dumping compute shader on the graphics queue The graphics pipeline can be NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:07 +02:00
Samuel Pitoiset	de06dfa9ea	radv: add radv_dump_pipeline_state() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:05 +02:00
Samuel Pitoiset	6f0530ecfe	radv: rework how shaders are dumped when generating a hang report Use a flag for the active stages instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:03 +02:00
Samuel Pitoiset	8c406f0b4d	radv: remove unused parameter in radv_dump_annotated_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:57:59 +02:00
Jose Dapena Paz	6c61c31dc2	mesa: do not leak ctx->Shader.ReferencedProgram references When glUseProgram is used, references to the included shaders are added in ctx->Shader.ReferencedProgram. But those references are not decreased when the shader data is deallocated. Thus, those shaders are leaked. Explicitly remove the pending references to these shaders. Fixes: `e6506b3cd2` ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-25 10:38:09 +10:00
Marek Olšák	508b423dd6	radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	07e02c8617	radeonsi: round ps_iter_samples in set_min_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	510c88f9d1	radeonsi: remove redundant ps_iter_samples clamp si_get_ps_iter_samples already does this. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	25cdf754e4	radeonsi: remove some old gfx 9.x registers Leftover from bring up. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	b936f9aa32	radeonsi: disable primitive binning for all blitter ops same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	8c1c451a90	ac/surface/gfx6: don't overallocate mipmapped HTILE Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Eric Engestrom	473af0b541	egl/x11: deduplicate depth-to-format logic Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-24 18:01:45 +01:00
Tapani Pälli	7b54404c9d	i965: enable OES_texture_view for gen8+ Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Tapani Pälli	3ddcdcf94d	mesa: changes to expose OES_texture_view extension Functionality already covered by ARB_texture_view, patch also adds missing 'gles guard' for enums (added in `f1563e6392`). Tested via arb_texture_view.*_gles3 tests and individual app utilizing texture view with ETC2. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Juan A. Suarez Romero	046b2b651e	docs: update release calendar for 18.1 series v2: extend 18.1 series (Andres) v3: fix copy/paste typo (Engestrom) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-05-24 11:47:47 +02:00
Samuel Pitoiset	38a8c5903b	radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:57 +02:00
Samuel Pitoiset	ded1509587	radv: call nir_split_var_copies() before nir_lower_var_copies() This does nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:54 +02:00
Francisco Jerez	936cd3c87a	i965: Use intel_bufferobj_buffer() wrapper in image surface state setup. Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object:: gpu_active_start/end, which are used by glBufferSubData() to decide which path to take. Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351 Reported-by: Nanley Chery <nanleychery@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:34 -07:00
Francisco Jerez	e989acb03b	i965: Handle non-zero texture buffer offsets in buffer object range calculation. Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection. v2: Protect against out-of-range BufferOffset (Nanley). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:28 -07:00
Francisco Jerez	156d2c6e62	i965: Move buffer texture size calculation into a common helper function. The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them are taking into account the buffer offset correctly. It's easier to fix it all in one place. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Francisco Jerez	5a68147803	Revert "mesa: simplify _mesa_is_image_unit_valid for buffers" This reverts commit `c0ed52f614`. It was preventing the image format validation from being done on buffer textures, which is required to ensure that the application doesn't attempt to bind a buffer texture with an internal format incompatible with the image unit format (e.g. of different texel size), which is not allowed by the spec (it's not allowed for any texture target, whether or not there is spec wording restricting this behavior specifically for buffer textures) and will cause the driver to calculate texel bounds incorrectly and potentially crash instead of the expected behavior. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Bas Nieuwenhuizen	699e1f5aac	ac: Use DPP for build_ddxy where possible. WQM is pretty reliable now on LLVM 7, so let us just use DPP + WQM. This gives approximately a 1.5% performance increase on the vrcompositor built-in benchmark. v2: Use ac_build_quad_swizzle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-23 21:02:45 +02:00
Miguel Casas	b73b340c37	i965: add {X,A}BGR2101010 to 'intel_image_formats' This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'. Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:19:04 -07:00
Miguel Casas	432df741e0	dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format. Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat(). Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:17:45 -07:00
Dylan Baker	c8acfd5ab2	bin/get-pick-list.sh: force git --pretty=medium Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:17 -07:00
Dylan Baker	5a639bdb81	bin/bugzilla_mesa.sh: explicitly set the --pretty argument Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:00 -07:00
Eric Engestrom	ec986241f3	docs: drop unnecessary out-of-frame target I'm guessing an earlier version of the website used to have the page contents in <frames>, but this isn't the case anymore so just drop the unnecessary `target="_main"` :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	09a6cb7be6	docs: fix various html tags mistakes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	8034f5f623	docs: fix `<` & `>` used in html code Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Juan A. Suarez Romero	6db0660d08	docs: add news notes to 18.1.0 CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 13:06:55 +02:00
Dave Airlie	f2f464de57	tgsi/scan: add hw atomic to the list of memory accessing files This fixes 4 out of 5 cases in: arb_framebuffer_no_attachments-atomic on cayman. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>	2018-05-23 03:51:40 +01:00
Roland Scheidegger	7b89fcec41	llvmpipe: improve rasterization discard logic This unifies the explicit rasterization discard as well as the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.) We still need separate logic for only null ps - this is not the same as rasterization discard. But simplify the logic there and don't count primitives simply when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10. While here, also fix statistics for primitives if face culling is enabled. No piglit changes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-23 04:23:32 +02:00
Bas Nieuwenhuizen	047438287c	ac/surface/gfx6: Don't force a tile index for fmask. The bpe of the fmask often differs from the bpe of the main surface. On SI that means it has to get a different tile index. addrlib is capable of figuring this out itself, so just pass -1 instead to let it know that it is not preset. Fixes: `9bf3570fed` "ac/surface/gfx6: compute FMASK together with the color surface" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-23 02:23:03 +02:00
Jason Ekstrand	a347a5a12c	i965: Remove ring switching entirely Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:39 -07:00
Jason Ekstrand	b499b85b0f	i965/miptree: Move the access_raw call to the individual map functions The only function that doesn't need to call access_raw is map_blit. If it takes the blitter path, it will happen as part of intel_miptree_copy. If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle doing whatever resolves are needed. This should save us resolves in quite a few cases and will probably help performance a bit. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:37 -07:00
Jason Ekstrand	f566a1264c	i965: Remove support for the BLT ring We still support the blitter on gen4-5 but it's on the same ring as 3D. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:35 -07:00
Jason Ekstrand	33affda8bf	i965/miptree: Use blorp for blit maps on gen6+ Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:34 -07:00
Jason Ekstrand	0eedb0fca9	i965/miptree: Use blorp for validation tex copies on gen6+ It's faster than the blitter and can handle things like stencil properly so it doesn't require software fallbacks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:32 -07:00
Jason Ekstrand	80fc3896f3	i965: Delete the blitter path for CopyTexSubImage The blorp path (called first) can do anything the blitter path can do so it's just dead code. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:31 -07:00
Jason Ekstrand	8162256b01	i965: Don't fall back to the blitter in BlitFramebuffer On gen4-5, we try the blitter before we even try blorp. On newer platforms, blorp can do everything the blitter can so there's no point in even having the blitter fall-back path. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:29 -07:00
Jason Ekstrand	e596563b08	i965: Remove some unused includes of intel_blit.h Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:27 -07:00
Jason Ekstrand	a9499374a9	i965/blit: Delete intel_emit_linear_blit This function is no longer used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:25 -07:00
Jason Ekstrand	7fd962093f	i965: Use meta for pixel ops on gen6+ Using meta for anything is fairly awful and definitely has more CPU overhead. However, it also uses the 3D pipe and is therefore likely faster in terms of GPU time than the blitter. Also, the blitter code has so many early returns that it's probably not buying us that much. We may as well just use meta all the time instead of working overtime to find the tiny case where we can use the blitter. We keep gen4-5 using the old blit paths to avoid perturbing old hardware too much. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:20 -07:00
Kenneth Graunke	92f01fc5f9	i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin. We'd like to start using soft-pin to assign BO addresses up front, and never move them again. Our previous plan for dealing with 48-bit VF cache bugs was to relocate vertex buffers to the low 4GB, so we'd never have addresses that alias in the low 32 bits. But that requires moving buffers dynamically. This patch tracks the last seen BO address for each vertex/index buffer, and emits a VF cache invalidate if the high bits change. (Ideally, we won't hit this case very often.) This should work for the soft-pin case, but unfortunately won't work in the relocation case, as we don't actually know the addresses. So, we have to use both methods. v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple more explicitly (suggested by Scott). Mention "single batch" too (suggested by Chris). Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-22 10:02:28 -07:00
Kenneth Graunke	c7259259d4	i965: Introduce a "memory zone" concept on BO allocation. We're planning to start managing the PPGTT in userspace in the near future, rather than relying on the kernel to assign addresses. While most buffers can go anywhere, some need to be restricted to within 4GB of a base address. This commit adds a "memory zone" parameter to the BO allocation functions, which lets the caller specify which base address the BO will be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA. Eventually, I hope to create a 4GB memory zone corresponding to each state base address. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-05-22 10:01:09 -07:00
Jason Ekstrand	417b9e5770	intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0 Fixes: `d6cd14f213` "i965/fs: Define new shader opcode to..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-22 09:53:23 -07:00
Michel Dänzer	fe2edb25dd	dri3: Stricter SBC wraparound handling Prevents corrupting the upper 32 bits of draw->recv_sbc when draw->send_sbc resets to 0 (which currently happens when the window is unbound from a context and bound to one again), which in turn caused loader_dri3_swap_buffers_msc to calculate target_msc with corrupted upper 32 bits. This resulted in hangs with the Xorg modesetting driver as of xserver 1.20 (older versions and other drivers ignored the upper 32 bits of the target MSC, which is why this wasn't noticed earlier). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/106351 Tested-by: Mike Lothian <mike@fireburn.co.uk>	2018-05-22 17:59:53 +02:00
Samuel Pitoiset	75e919c045	radv: fix computation of user sgprs for 32-bit pointers With 32-bit pointers we only need one user SGPR per desc set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:29 +02:00
Samuel Pitoiset	c5536fc813	radv: drop user_sgpr_info::sgpr_count It's only used inside allocate_user_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:26 +02:00
Samuel Pitoiset	36a4d6d081	radv: add support for 32-bit pointers in user data SGPRs We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways. Vega10: Totals from affected shaders: SGPRS: 1008722 -> 1026710 (1.78 %) VGPRS: 706580 -> 707136 (0.08 %) Spilled SGPRs: 22555 -> 22209 (-1.53 %) Spilled VGPRs: 75 -> 75 (0.00 %) Code Size: 34819208 -> 35202140 (1.10 %) bytes Max Waves: 175423 -> 175086 (-0.19 %) Polaris10: Totals from affected shaders: SGPRS: 1029849 -> 1036517 (0.65 %) VGPRS: 709984 -> 708872 (-0.16 %) Spilled SGPRs: 22672 -> 22309 (-1.60 %) Spilled VGPRs: 82 -> 66 (-19.51 %) Scratch size: 76 -> 60 (-21.05 %) dwords per thread Code Size: 34915336 -> 35309752 (1.13 %) bytes Max Waves: 151221 -> 151677 (0.30 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:22 +02:00
Samuel Pitoiset	b654ef5808	radv: add set_loc_shader_ptr() helper This helper will help with switching to 32-bit GPU pointers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:20 +02:00
Samuel Pitoiset	14a7547c08	radv: allocate descriptor BOs in the 32-bit addr space Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:18 +02:00
Samuel Pitoiset	0d1406ad12	radv: allocate the upload BO in the 32-bit addr space Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:17 +02:00
Samuel Pitoiset	d8a61d3232	radv: set amdgpu-32bit-address-high-bits LLVM attribute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:15 +02:00
Samuel Pitoiset	fe2649d3ad	radv/winsys: allow to allocate BOs in the 32-bit addr space This introduces a new flag called RADEON_FLAG_32BIT. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:13 +02:00
Samuel Pitoiset	b60e0ee789	radv/winsys: request high address This is needed for 32-bit GPU pointers. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:09 +02:00
Anuj Phogat	0748383a60	i965/glk: Add l3 banks count for 2x6 configuration 2x6 configuration with pci-id 0x3185 has same number of banks (2) as 3x6 configuration (pci-id 0x3184). Reported-by: Clayton Craft <clayton.a.craft@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `eb23be1d97` "i965: Add and initialize l3_banks field for gen7+" Cc: Francisco Jerez <currojerez@riseup.net>	2018-05-21 16:43:26 -07:00
Vinson Lee	85f61197df	v3d: Include v3d_drm.h path. Fix build error. CC v3d_blit.lo In file included from v3d_blit.c:27:0: v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory #include "v3d_drm.h" ^~~~~~~~~~~ Fixes: `8a793d42f1` ("v3d: Switch the vc5 driver to using the finalized V3D UABI.") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-21 11:15:47 -07:00
Samuel Pitoiset	73df16dcee	radv: fix centroid interpolation It's legal to set the centroid and sample interpolation modes when MSAA is disabled. So, we have to initialize the centroid inputs because the hardware doesn't. This fixes rendering issues with DXVK and The Witness, World of Warcraft, Trackmania and probably more games. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390 CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-21 13:57:46 +02:00
Bas Nieuwenhuizen	f26b008e28	radv: Cleanup unused prime blit path. Since we have the common WSI code, we use vkCmdCopyImageToBuffer instead. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 10:33:41 +02:00
Bas Nieuwenhuizen	a63a0960e3	radv: Fix SRGB compute copies. SRGB stores are broken. We had compensation code in the resolve path but none in the copy path. Since we don't want any conversion and it does not matter for DCC, just make everything UNORM instead. This happened to cause wrong colors for the PRIME path, as that uses image->buffer copies which always use the compute path. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-21 10:33:41 +02:00
Tapani Pälli	63525ba730	android: enable VK_ANDROID_native_buffer Patch changes entrypoints generator to not skip this extension even though it is set as disabled in the xml. We also need compilation flag VK_USE_PLATFORM_ANDROID_KHR to be enabled. It looks like this extension got disabled in commit `69f447553c`. v2: just remove the whole 'supported' attrib check + remove vk_icd.h compilation fix (fix in VulkanHeaders instead) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 09:26:50 +03:00
Tapani Pälli	437acae704	vulkan: update vk_icd.h to current upstream Import from commit eb0c1fd on branch 'master' of https://github.com/KhronosGroup/Vulkan-Headers.git. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 09:26:50 +03:00
Dave Airlie	bfa74bb44d	virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range. The host side hasn't got support for this feature yet, so don't enable it unless we get the caps from the host. This makes the texture buffer range piglit tests skip now. Fixes: `fe0647df5a` (virgl: add offset alignment values to to v2 caps struct) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-05-21 12:44:55 +10:00
Timothy Arceri	2e6c987a85	mesa: stop hiding query parameters from OpenGL compat Just let the extension detection do its job as we will be adding compat profile support in future, also we want these to work with compat profile version overrides. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-21 09:39:03 +10:00
Christoph Haag	549e54270b	radv: fix VK_EXT_descriptor_indexing GetPhysicalDeviceProperties2KHR() was crashing because features was null Fixes: `0e10790558` "radv: Enable VK_EXT_descriptor_indexing." CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-20 13:36:07 +02:00
Bas Nieuwenhuizen	a1c87235a9	ac/surface: Only align linear power of two fmt textures. We're not sharing 32_32_32 formats between different GPUs, so we do not have to align for vega on pre-vega cards. Fixes: `e361970ed7` "radv: Add support for IMG_DATA_FORMAT_32_32_32." Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-20 11:57:59 +02:00
Bas Nieuwenhuizen	62e0e089d7	amd/addrlib: Use defines in autotools build. Otherwise stuff like NDEBUG would not be passed through. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-20 11:57:59 +02:00
Aaron Watry	cfe582f9dc	r600/compute: Mark several functions as static They're not used anywhere else, so keep them private Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-19 10:22:16 -05:00
Aaron Watry	d21e64c626	r600/compute: Remove unused compute_memory_pool functions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-19 10:21:57 -05:00
Roland Scheidegger	6f558fb0f7	draw: get rid of special logic to not emit null tris I've confirmed after `77554d220d` we no longer need this to pass some tests from another api (as we no longer generate the bogus extra null tris in the first place). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-19 02:49:58 +02:00
Dylan Baker	c86e9a5fe5	docs: Add sha sums for release	2018-05-18 16:44:50 -07:00
Dylan Baker	1d46852830	docs: Add release notes for 18.1.0	2018-05-18 16:44:43 -07:00
Alyssa Rosenzweig	5d85a0a55b	nir: Implement optional b2f->iand lowering This pass is required by the Midgard compiler; our instruction set uses NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction. Normally, this lowering pass would be implemented in a backend-specific algebraic pass, but this conflicts with the existing iand->b2f pass in nir_opt_algebraic.py, hanging the compiler. This patch thus makes the existing pass optional (default on -- all other backends should remain unaffected), adding an optional pass for lowering the opposite direction. v2: Defer lowering until late algebraic optimisations to allow optimising the b2f instruction itself. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-18 22:44:09 +02:00
Jan Vesely	8ed2cabd04	travis: Adapt to radeonsi dropping support for LLVM 4 meson Vulkan, Clover, and autotools Vulkan need to be switched to llvm 5 Fixes: `f9eb1ef870` Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-18 13:59:37 -04:00
Marek Olšák	3d64ed5785	radeonsi: skip ES output stores for undefined output components Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-18 13:38:07 -04:00
Nanley Chery	0ab25f05ab	i965: isl: Move the MCS gen7+ assertion into ISL This is useful for every user of ISL. Drop the comment along the way to match similar functions in ISL. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	f88caf2321	i965/miptree: Remove format assertion in alloc_aux intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the color or depth miptree before the miptree is assigned an aux_usage. alloc_aux switches on the aux_usage so don't assert that the format is valid. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	8007b2d78b	i965/miptree: Simplify the switch in supports_ccs Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	da98441fef	i965: Make get_ccs_surf succeed in alloc_aux Synchronize the requirements listed in isl_surf_get_ccs_surf with intel_miptree_supports_ccs by importing a restriction from ISL. Some implications: * We successfully create every aux_surf in alloc_aux * We only return false from alloc_aux if we run out of memory Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Brian Paul	42aee8f4f6	llvmpipe: fix check for a no-op shader The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders. Fix the code to check for num_instructions <= 1 instead. Fixes: `8fde9429c3` ("tgsi: fix incorrect tgsi_shader_info::num_tokens computation") Tested-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-18 09:09:41 -06:00
Samuel Pitoiset	03c4816093	radv: pass radv_nir_compiler_options directly to create_llvm_function() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-18 11:07:01 +02:00
Christian Gmeiner	2eb3f794d9	st/mesa: only define GLSL 1.4 for compat if driver supports it Currently GLSL 1.4 is defined for all gallium drivers even if only GLSL 1.2 is supported, as seen on etnaviv. v1 -> v2: - use _min(..) as suggested by Lucas Stach and Michel Dänzer Fixes: `4560aad780` ("mesa: add GLSLVersionCompat constant") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-18 10:46:24 +02:00
Dave Airlie	48e28ab961	vbo: remove MaxVertexAttribStride assert check. Some drivers (virgl) don't support GL4.4 or GLES3.1 yet, so never fill in this const. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-05-18 14:58:15 +10:00
Timothy Arceri	c0c69bd8dd	mesa: drop GL_EXT_polygon_offset support glPolygonOffset() has been part of the GL standard since 1.1. Also neither AMD nor Nvidia support this in their binary drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61761	2018-05-18 09:21:24 +10:00
Brian Paul	8fde9429c3	tgsi: fix incorrect tgsi_shader_info::num_tokens computation We were incrementing num_tokens in each loop iteration while parsing the shader. But each call to tgsi_parse_token() can consume more than one token (and often does). Instead, just call the tgsi_num_tokens() function. Luckily, this issue doesn't seem to affect any current users of this field (llvmpipe just checks for <= 1, for example). Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-17 15:02:05 -06:00
Samuel Pitoiset	fcba3934fc	radv: add radv_emit_shader_pointer() helper For future work (support for 32-bit GPU pointers). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 21:28:59 +02:00
Samuel Pitoiset	9b2c310a70	radv: add some helpers for cleaning up radv_get_preamble_cs() Because this function looks a bit ugly to me. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 21:28:57 +02:00
Marek Olšák	f9eb1ef870	amd: remove support for LLVM 4.0 It doesn't support GFX9. Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-17 14:54:41 -04:00
Juan A. Suarez Romero	11a0d5563f	docs: update calendar, add news and link release notes to 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-05-17 18:45:26 +00:00
Juan A. Suarez Romero	042e21976a	docs: add sha256 checksums for 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `69ef6e4a75`)	2018-05-17 18:40:53 +00:00
Juan A. Suarez Romero	bb7750e8da	docs: add release notes for 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3b49ab6219`)	2018-05-17 18:40:51 +00:00
Mathias Fröhlich	6fac626193	mesa: The glArrayElement api is independent of the current program. All the shader program dependent handling is done on the level of the gl_Context::Array._DrawVAO/_DrawVAOEnabledAttribs. So, skip array element invalidation on _NEW_PROGRAM. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:40 +02:00
Mathias Fröhlich	984cb4e512	mesa: Flag _NEW_ARRAY only if we are changing ctx->Array.VAO. For the VAO internal helper functions that may be called with a non current VAO, flag the _NEW_ARRAY state only if it is the current ctx->Array.VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Mathias Fröhlich	5c7e3a90ed	mesa: Remove flush_vertices argument from VAO methods. The flush_vertices argument is now unused, remove it. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Mathias Fröhlich	9c7be67968	mesa: Remove FLUSH_VERTICES from VAO state changes. Pending draw calls on immediate mode or display list calls do not depend on changes of the VAO state. So, remove calls to FLUSH_VERTICES and flag _NEW_ARRAY as appropriate. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Juan A. Suarez Romero	0a2c947556	docs: add 18.0.5 in the release calendar Mesa 18.1 series has not been released yet, so let's extend 18.0 lifetime. v2: Add missing closing TR tags (Eric Engestrom) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-17 19:01:19 +02:00
Alok Hota	936ce75285	swr/rast: Added FEClipRectangles event and also added some comments Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:14 -05:00
Alok Hota	a33d376133	swr/rast: Whitespace and tab-to-spaces changes Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:10 -05:00
Alok Hota	7970fcff25	swr/rast: fix VCVTPD2PS generation for AVX512 Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:06 -05:00
Alok Hota	a0dddac1cb	swr/rast: Rectlist support for GS Add rectlist as an option for GS. Needed to support some driver optimizations. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:01 -05:00
Alok Hota	7926d18fa5	swr/rast: Remove unneeded virtual from methods Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:52:21 -05:00
Stefan Schake	b0acc3a562	broadcom/vc4: Native fence fd support With the syncobj support in place, let's use it to implement the EGL_ANDROID_native_fence_sync extension. This mostly follows previous implementations in freedreno and etnaviv. v2: Drop the flags (Eric) Handle in_fence_fd already in job_submit (Eric) Drop extra vc4_fence_context_init (Eric) Dup fds with CLOEXEC (Eric) Mention exact extension name (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:30 +01:00
Stefan Schake	44036c354d	broadcom/vc4: Store job fence in syncobj This gives us access to the fence created for the render job. v2: Drop flag (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:28 +01:00
Stefan Schake	9ed05e2520	broadcom/vc4: Detect syncobj support We need to know if the kernel supports syncobj submission since otherwise all the DRM syncobj calls fail. v2: Use drmGetCap to detect syncobj support (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:26 +01:00
Stefan Schake	4fc0ebdff5	broadcom/vc4: Bump libdrm requirement Require a version of libdrm with syncobj support. v2: Don't require a libdrm_vc4, just bump core libdrm if vc4 enabled (by anholt) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:24 +01:00
Stefan Schake	580d1f4c60	drm-uapi: Update vc4 header with syncobj submit support v2: Synchronized with kernel v2 v3: Update for the finalized kernel ABI (pad2 field) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:21 +01:00
Stefan Schake	1ec01a911b	broadcom/vc4: Drop libdrm_vc4 requirement This was missed in the move back to the local uapi copy. libdrm_vc4 only seems to consist of headers that also exist in the Mesa tree. Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:12 +01:00
Eric Anholt	97894b1267	v3d: Add support for glSampleMask / glSampleCoverage.	2018-05-17 15:09:46 +01:00
Eric Anholt	9bbc3f8cf1	v3d: Enable NaN propagation in the VS and CS as well. Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.	2018-05-17 15:09:12 +01:00
Nanley Chery	edfb57c0a0	i965/blorp: Disable BLORP clear color updates With the previous patches, we now update the indirect clear color buffer every time the clear color changes. Avoid redundant updates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	02f5512fed	intel/blorp: Add a NO_UPDATE_CLEAR_COLOR batch flag Allow callers to handle updating the indirect clear color buffer themselves. This can reduce the number of clear color updates in the case where a caller performs multiple fast clears with the same clear color. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	f8ac11d69f	i965/blorp: Also skip the fast clear if the clear color differs If the aux state is CLEAR and clear color value has changed, only the surface state must be updated. The bit-pattern in the aux buffer is exactly the same. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	43616404be	i965/clear: Drop a stale comment in fast_clear_depth This comment made more sense when it was above the calls to intel_miptree_slice_set_needs_depth_resolve(). We stopped using these functions at commit `554f7d6d02` ("i965: Move depth to the new resolve functions"). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	82849fb6d5	i965: Update the indirect buffer in set_clear_color For depth buffers, we avoid fast-clearing if the aux_state is already CLEAR. We do the same for color buffers only if the clear color doesn't change. We require that the clear colors match because, in that case, we don't update the indirect clear color outside of BLORP. Update the indirect clear color for color buffers as well. We'll enable the same depth buffer optimization for color buffers in a later patch. Note that we're now actually updating the indirect clear color twice in the case where we use BLORP to perform the fast-clear. This is only temporary. In later patches, we'll prevent BLORP from performing the update. v2: Add more context to the commit message (Topi). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-17 07:06:41 -07:00
Nanley Chery	5b315f3ad1	i965/clear: Remove an early return in fast_clear_depth Reduce complexity and allow the next patch to delete some code. With this change, clear operations will still be skipped and setting the aux_state will cause no side-effects. Remove the associated comment which implies an early return. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6f609ca609	i965: Use set_clear_color for depth miptrees Reduce code duplication now and prevent it in the following commits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	92a0a87b6f	Revert "i965: Make the miptree clear color setter take a gl_color_union" This reverts commit `1d94aa1987`. The next patch will make depth miptrees use the clear color setter that was originally being used for color miptrees. Go back to using the isl_color_value parameter because it's the same type as the fast_clear_color field used by color and depth miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	bb18af82c3	i965/miptree: Unify aux buffer allocation There isn't much that changes between the aux allocation functions. Remove the duplicated code. v2: Inline the switch statement (Jason). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6c41a2ef3b	i965: Prepare to delete intel_miptree_alloc_ccs() We're going to delete intel_miptree_alloc_ccs() in the next commit. With that in mind, replace the use of this function in do_single_blorp_clear() with intel_miptree_alloc_aux() and move the delayed allocation logic to its callers. v2: Duplicate the delayed allocation comment (Topi Pohjolainen). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	beed9c4550	i965/miptree: Drop the mt param from alloc_aux_buffer Drop an unused parameter. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6b1836aabe	i965/miptree: Drop the alloc_flags param from alloc_aux_buffer We have enough information to determine the optimal flags internally. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	3dd7f600e0	i965/miptree: Drop the name param from alloc_aux_buffer A name of "aux-miptree" should be sufficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	58d99a21f1	i965/miptree: Initialize the indirect clear color to zero The indirect clear color isn't correctly tracked in intel_miptree::fast_clear_color. The initial value of ::fast_clear_color is zero, while that of the indirect clear color is undefined. Topi Pohjolainen discovered this issue with MCS buffers. This issue is apparent when fast-clearing an MCS buffer for the first time with glClearColor = {0.0,}. Although the indirect clear color is undefined, the initial aux state of the MCS is CLEAR and the tracked clear color is zero, so we avoid updating the indirect clear color with {0.0,}. Make the indirect clear color match the initial value of ::fast_clear_color. Note: although we only have to drop HiZ's BO_ALLOC_BUSY flag for gen10+, we also drop it pre-gen10 to keep things simple. We add this flag back for pre-gen10 in a later patch. v2: Add a note about dropping HiZ's BO_ALLOC_BUSY flag (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	b58675e93f	i965/miptree: Add and use a memset option in alloc_aux_buffer Add infrastructure for initializing the clear color BO. intel_miptree_init_mcs is no longer needed with this change. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	8a9491058d	i965/miptree: Zero-initialize CCS_D buffers Before this patch, the aux_state was actually AUX_INVALID because the BO was never defined. This was fine on single slice miptrees because we would fast-clear the resource right after creation. For multi-slice miptrees on SKL+ however, this results in undefined behavior when accessing a non-base slice. Here's a specific example: 1) Fast clear level 0 * Undefined CCS_D buffer allocated in "PASS_THROUGH" state. * Level 0 transitions to the CLEAR state. 2) Render to level 1 * Level 1 may have a 2-bit pattern of 2's. * Rendering with a 2 in the CCS is undefined. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	816f2dc67d	i965/miptree: Fix handling of uninitialized MCS buffers Before this patch, if we failed to initialize an MCS buffer, we'd end up in a state in which the miptree thinks it has an MCS buffer, but doesn't. We also leaked the clear_color_bo if it existed. With this patch, we now free the miptree aux buffer resources and let intel_miptree_alloc_mcs() know that the MCS buffer no longer exists. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Samuel Pitoiset	1fba2e10b3	radv: only declare the ESGS rings for pre GFX9 chips GFX9 uses LDS instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:20 +02:00
Samuel Pitoiset	d349d4bd24	radv: allow to print GPU info with RADV_DEBUG=info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:17 +02:00
Samuel Pitoiset	56d53ed1d6	radv: do not emit unnecessary ES output stores GFX9: Totals from affected shaders: SGPRS: 472 -> 464 (-1.69 %) VGPRS: 576 -> 584 (1.39 %) Code Size: 45432 -> 44324 (-2.44 %) bytes Max Waves: 40 -> 40 (0.00 %) VI: SGPRS: 720 -> 720 (0.00 %) VGPRS: 728 -> 728 (0.00 %) Code Size: 45348 -> 43992 (-2.99 %) bytes Max Waves: 120 -> 120 (0.00 %) This affects Rise of the Tomb Raider and the three Vulkan demos that use a geometry shader (geometryshader, deferredshadows and viewportarray). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:13 +02:00
Samuel Pitoiset	a6e44d1271	radv: do not emit unnecessary GS output stores Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:11 +02:00
Samuel Pitoiset	507402ada6	radv: only pass the global BO list at submit time if enabled That way the winsys might use a faster path when the global BO list is NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:27 +02:00
Samuel Pitoiset	6211799aff	radv: remove the radv_finishme() when compiling shaders Having an entrypoint different than "main" doesn't mean we have multiple shaders per module. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:24 +02:00
Samuel Pitoiset	1e86eaf7d8	radv: remove radv_device::llvm_supports_spill It's always true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:21 +02:00
Timothy Arceri	f71714022b	mesa: add glUniform*ui{v} support to display lists Fixes: `a017c7ecb7` "mesa: display list support for uint uniforms" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78097	2018-05-17 13:07:48 +10:00
Dieter Nützel	7f1dc93357	radeonsi: create .gitignore Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-05-16 21:48:17 -04:00
Dave Airlie	eba4cf797c	ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic Drop the use of the old intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-17 11:46:53 +10:00
Eric Anholt	b2e7c32703	v3d: Fix wiring filters to NEAREST for 32-bit texture returns. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104626	2018-05-16 21:19:07 +01:00
Eric Anholt	795488d2bf	v3d: Enable the driver by default. Now that we have a stabilized ABI and a fairly conformant driver, turn it on.	2018-05-16 21:19:07 +01:00
Eric Anholt	01ae6a9181	v3d: Rename driver functions from vc5 to v3d. This is the final step of the driver rename.	2018-05-16 21:19:07 +01:00
Eric Anholt	8c47ebbd23	v3d: Rename the driver files from "vc5" to "v3d".	2018-05-16 21:19:07 +01:00
Eric Anholt	c4c488a2ae	v3d: Rename the vc5_dri.so driver to v3d_dri.so. This allows the driver to load against the merged kernel DRM driver. In the process, rename most of the build system variables and gallium plumbing functions.	2018-05-16 21:19:07 +01:00
Eric Anholt	8a793d42f1	v3d: Switch the vc5 driver to using the finalized V3D UABI. In the process of merging to the kernel, I renamed the driver to the general product line's name (since we have both vc5 and vc6 supported already). Since the ABI is finalized, move the header to include/drm-uapi.	2018-05-16 21:19:07 +01:00
Charmaine Lee	33a86acd78	svga: fix incompatible bind flags at buffer validation time At buffer resource validation time, if the resource handle is not yet created and if the initial buffer bind flags and the tobind flags are incompatible, just use the tobind flags to create the resource handle. On the other hand, if the bind flags are compatible, we can combine the bind flags for the resource handle creation. Fixes piglit gl-3.1-buffer-bindings crash. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-16 13:04:16 -06:00
jenny.q.cao	1261b34cd5	mesa: cast the GLenum16 to GLint to avoid compile warning on android Cast the enum to GLint to avoid the compile warning: /src/mesa/main/get.c:3005:19: warning: comparison of constant -32768 with expression of type 'GLenum16' (aka 'unsigned short') is always false -Wtautological-constant-out-of-range-compare Tests: compilation without this warning Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-16 13:02:43 -06:00
Stuart Young	f806cc9eb6	etnaviv: Fix missing rnndb file in tarballs It seems that when the rnndb files for etnaviv were updated/included back in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and meson.build. This was all during the conversion to meson, so it appears to have slipped through the cracks. As such, this file has been missing from the official tarballs since inclusion in Mesa, so the git trees and tarballs differ. Found due to lintian errors in the Debian packages. Fixes: `f1e1c60ff6` ("etnaviv: Update from rnndb") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-16 19:36:10 +02:00
Matthias Groß	71892fbe19	gallium/hud: add frametime graph (v2) Thanks for your comment. This version has an additional boolean in the fps_info struct to distinguish between fps and frame time calculation. The struct is initialised in the respective install functions for this purpose. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-05-15 19:30:12 -04:00
Jan Vesely	f3521ce2c4	eg/compute: Use reference counting to handle compute memory pool. Use pipe_reference to release old RAT surfaces. RAT surface adds a reference to pool bo, so use reference counting for pool->bo as well. v2: Use the same pattern for both defrag paths Drop confusing comment CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-15 19:01:47 -04:00
Roland Scheidegger	e01af38d6f	gallivm: Use alloca_undef with array type instead of alloca_array Use a single allocation of array type instead of the old-style array allocation for the temp and immediate arrays. Probably only makes a difference if they aren't used indirectly (so, if we used them solely because there's too many temps or immediates). In this case the sroa and early-cse passes can sometimes do some optimizations which they otherwise cannot. (As a side note, for the temp reg array, we actually really should use one allocation per array id, not just one for everything.) Note that the instcombine pass would actually promote such allocations to single alloc of array type as well, but it's too late for some artificial shaders we've seen to help (we don't want to run instcombine at the beginning due to its cost, hence would need another sroa/cse pass after instcombine). sroa/early-cse help there because they can actually eliminate all of the huge shader, reducing it to a single const output (don't ask...). (Interestingly, instcombine also removes all the bitcasts we do on that allocation for single-value gathering, and in the end directly indexes into the single vector elements, which according to spec is only semi-valid, but this happens regardless. Another thing instcombine also does is use inbound GEPs, which is probably something we should do manually as well - for indirectly indexed reg files llvm may not be able to figure it out on its own, but we should be able to guarantee all pointers are always inbound. In any case, by the looks of it using single allocation with array type seems to be the right thing to do even for ordinary shaders.) No piglit change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-16 00:04:48 +02:00
Dieter Nützel	bd0b6b9f17	radv: add generated files to .gitignore(s) Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-15 22:53:55 +02:00
Samuel Pitoiset	6bde8c5608	spirv: fix visiting inner loops with same break/continue block We should stop walking through the CFG when the inner loop's break block ends up as the same block as the outer loop's continue block because we are already going to visit it. This fixes the following assertion which ends up by crashing in RADV or ANV: SPIR-V parsing FAILED: In file ../src/compiler/spirv/vtn_cfg.c:381 block->node.link.next == NULL 0 bytes into the SPIR-V binary This also fixes a crash with a camera shader from SteamVR. v2: make use of vtn_get_branch_type() and add an assertion Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504 CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-15 21:38:19 +02:00
Rob Clark	d89f58a6b8	mesa/st: handle vert_attrib_mask in nir case too Note, actually fixes `9987a072cb`, but the problems don't show up until `19a91841c3`. Fixes: `19a91841c3` st/mesa: Use Array._DrawVAO in st_atom_array.c. Fixes: `9987a072cb` st/mesa: Make the input_to_index array available. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-05-15 15:15:33 -04:00
Marek Olšák	3e27b377f2	cso: check count == 0 in cso_set_vertex_buffers The code didn't expect that, leading to crashes. Fixes: `86d63b53a2` "gallium: remove aux_vertex_buffer_slot code" Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-05-15 12:36:27 -04:00
Rob Clark	dace607245	vc5: use util_copy_framebuffer_state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-15 08:48:13 -04:00
Rob Clark	dae4c98dd7	vc4: use util_copy_framebuffer_state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-15 08:47:35 -04:00
Rob Clark	f897b67dc1	freedreno/a5xx: remove fd5_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	d48a2404a2	freedreno/a4xx: remove fd4_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	2c40f2ba32	freedreno/a3xx: remove fd3_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	273f7d8404	freedreno: fence should hold a ref to pipe Since the fence can outlive the context, and all it really needs to wait on a fence is the pipe, use the new fd_pipe reference counting to hold a ref to the pipe and drop the ctx pointer. This fixes a crash seen with (for example) glmark2: #0 fd_pipe_wait_timeout (pipe=0xbf48678b3cd7b32b, timestamp=0, timeout=18446744073709551615) at freedreno_pipe.c:101 #1 0x0000ffffbdf75914 in fd_fence_finish (pscreen=0x561110, ctx=0x0, fence=0xc55c10, timeout=18446744073709551615) at ../src/gallium/drivers/freedreno/freedreno_fence.c:96 #2 0x0000ffffbde154e4 in dri_flush (cPriv=0xb1ff80, dPriv=0x556660, flags=3, reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/gallium/state_trackers/dri/dri_drawable.c:569 #3 0x0000ffffbecd8b44 in loader_dri3_flush (draw=0x558a28, flags=3, throttle_reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/loader/loader_dri3_helper.c:656 #4 0x0000ffffbecbc36c in glx_dri3_flush_drawable (draw=0x558a28, flags=3) at ../src/glx/dri3_glx.c:132 #5 0x0000ffffbecd91e8 in loader_dri3_swap_buffers_msc (draw=0x558a28, target_msc=0, divisor=0, remainder=0, flush_flags=3, force_copy=false) at ../src/loader/loader_dri3_helper.c:827 #6 0x0000ffffbecbcfc4 in dri3_swap_buffers (pdraw=0x5589f0, target_msc=0, divisor=0, remainder=0, flush=1) at ../src/glx/dri3_glx.c:587 #7 0x0000ffffbec98218 in glXSwapBuffers (dpy=0x502bb0, drawable=2097154) at ../src/glx/glxcmds.c:840 #8 0x000000000040994c in CanvasGeneric::update (this=0xfffffffff400) at ../src/canvas-generic.cpp:114 #9 0x0000000000411594 in MainLoop::step (this=this@entry=0x5728f0) at ../src/main-loop.cpp:108 #10 0x0000000000409498 in do_benchmark (canvas=...) at ../src/main.cpp:117 #11 0x00000000004071b0 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.cpp:210 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	a8c0daa172	freedreno: batch cache doesn't hold a ref to batch The cache doesn't hold a (strong) reference to the batch. So we shouldn't be trying to drop a reference, as that leads to: #0 0x0000ffffbecb37a0 in raise () from /lib64/libc.so.6 #1 0x0000ffffbeca159c in abort () from /lib64/libc.so.6 #2 0x0000ffffbecacf48 in __assert_fail_base () from /lib64/libc.so.6 #3 0x0000ffffbecacfa8 in __assert_fail () from /lib64/libc.so.6 #4 0x0000ffffbd28def0 in pipe_reference_described (ptr=0x4f47130, reference=0x0, get_desc=0xffffbd2e0f08 <__fd_batch_describe>) at ../src/gallium/auxiliary/util/u_inlines.h:88 #5 0x0000ffffbd28e188 in fd_batch_reference_locked (ptr=0x4f40de0, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:258 #6 0x0000ffffbd28e9a8 in fd_bc_invalidate_resource (rsc=0x4f40ca0, destroy=true) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:244 #7 0x0000ffffbd293778 in fd_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/drivers/freedreno/freedreno_resource.c:644 #8 0x0000ffffbd922674 in u_transfer_helper_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/auxiliary/util/u_transfer_helper.c:144 #9 0x0000ffffbd29527c in pipe_resource_reference (ptr=0x4f455d8, tex=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:144 #10 0x0000ffffbd29548c in fd_surface_destroy (pctx=0x1012720, psurf=0x4f455d0) at ../src/gallium/drivers/freedreno/freedreno_surface.c:78 #11 0x0000ffffbd1f9c48 in pipe_surface_reference (ptr=0x4f471d0, surf=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:113 #12 0x0000ffffbd1f9ef4 in util_copy_framebuffer_state (dst=0x4f471c8, src=0x0) at ../src/gallium/auxiliary/util/u_framebuffer.c:114 #13 0x0000ffffbd2e0e30 in __fd_batch_destroy (batch=0x4f47130) at ../src/gallium/drivers/freedreno/freedreno_batch.c:225 #14 0x0000ffffbd28e1b0 in fd_batch_reference_locked (ptr=0xfffffffff010, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:262 #15 0x0000ffffbd28e6b0 in fd_bc_invalidate_context (ctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:190 #16 0x0000ffffbd2e2b6c in fd_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_context.c:139 #17 0x0000ffffbd2c3280 in fd5_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/a5xx/fd5_context.c:56 #18 0x0000ffffbd5b7a8c in st_destroy_context_priv (st=0xfd72f0, destroy_pipe=true) at ../src/mesa/state_tracker/st_context.c:281 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:26 -04:00
Eric Engestrom	37d44e2608	docs/meson: mark code/commands as <code> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:39 +01:00
Eric Engestrom	5829f616ec	docs/meson: replace plaintext url with a link Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:36 +01:00
Eric Engestrom	67c550708a	docs/meson: fix various html issues Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:34 +01:00
Eric Engestrom	dc2dc1fa30	docs/meson: fix various typos Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:28 +01:00
Eric Engestrom	6c5df78d8b	meson: fix copyright symbol Fixes: `bd68f1013c` "autotools, meson: add tileset.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:31:46 +01:00
Juan A. Suarez Romero	bd68f1013c	autotools, meson: add tileset.h Fixes: `4e52cb51b5` ("swr/rast: Thread locked tiles improvement") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:00:11 +02:00
Thomas Hellstrom	3d0b4979ee	st/xa: Bump minor Bump xa minor to signal that the underlying mesa version is suitable for dri3. This is a bit ugly since it doesn't relate to a specific xa interface change. Recently there have been a number of fixes in mesa that help enable dri3 without any significant regressions in automated testing and common desktop usage latency. However, the xf86-video-vmware driver has no other way to tell than inspecting the xa version. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-15 09:27:46 +02:00
Dave Airlie	9585e70206	virgl: enable vertex streams when glsl level is high enough. This enabled the vertex streams out when the host supports GL4.0.	2018-05-15 14:56:57 +10:00
Kai Wasserbäch	b691d9192c	opencl: autotools: Fix linking order for OpenCL target Otherwise the build fails with an undefined reference to clang::FrontendTimesIsEnabled. Bugzilla: https://bugs.freedesktop.org/106209 Cc: Jan Vesely <jan.vesely@rutgers.edu> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Acked-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: Aaron Watry <awatry@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-05-14 22:45:01 -04:00
Samuel Pitoiset	97b179570c	radv: reduce the number of parameters exported by the GS copy shader By using the geometry shader output usage mask. This improves all Vulkan demos that use a geometry shader (i.e. geometryshader, deferredshadows, viewportarray). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:23 +02:00
Samuel Pitoiset	560bd9eb67	radv: scan the geometry shader output usage mask For reducing the number of parameters that are exported by the GS copy shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:21 +02:00
Samuel Pitoiset	ea43d935ab	radv: run the shader info pass before emitting the GS copy shader For further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:19 +02:00
Samuel Pitoiset	7cbc6f2621	radv: check that layout isn't NULL in radv_nir_shader_info_pass() An upcoming patch will run the shader info pass on the geometry shader just before emitting the GS copy shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:17 +02:00
Jason Ekstrand	18f8200a99	intel/blorp: Use linear formats for CCS_E clear colors in copies It's clear that the original code meant to do this and there is even a 10-line comment explaining why. Originally, we had a simple function for packing the clear colors which was unaware of sRGB. However, in `a6b66a7b26`, when we started using ISL to do the packing, the wrong format was used. Fixes: `a6b66a7b26` "intel/blorp: Use ISL instead of bitcast_color..." Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-14 10:41:26 -07:00
Bas Nieuwenhuizen	f944a59996	radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega. The hardware always interprets the alpha as unsigned and fixing it in the shader is going to add unacceptable overheads. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:30 +02:00
Bas Nieuwenhuizen	3d4d388e39	radv: Fix up 2_10_10_10 alpha sign. Pre-Vega HW always interprets the alpha for this format as unsigned, so we have to implement a fixup to do the sign correctly for signed formats. v2: Improve indexing mess. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:20 +02:00
Bas Nieuwenhuizen	e361970ed7	radv: Add support for IMG_DATA_FORMAT_32_32_32. Basic sampling support for linear tiling. No CTS regressions, but it seems the blitting coverage is not very extensive. https://bugs.freedesktop.org/show_bug.cgi?id=106331 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:12 +02:00
Bas Nieuwenhuizen	dd102405de	radv: Translate logic ops. radeonsi could pass them through but the enum changed between Gallium and Vulkan, so we have to translate. In the process I made the register defines a bit more readable. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 16:49:06 +02:00
Bas Nieuwenhuizen	62f50df7b7	radv: Fix multiview queries. This moves the extra queries to after the main query ended, instead of doing it after the begin and hence doing nesting. We also emit only (view count - 1) extra queries, as the main query is already there for the first view. This fixes the CTS occasionally getting stuck in dEQP-VK.multiview.queries* waiting on results. Fixes: `32b4f3c38d` "radv/query: handle multiview queries properly. (v3)" CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 16:49:06 +02:00
Eric Engestrom	f0cdc39b13	meson: remove dependency antipattern `dep_valgrind != []` now (0.45) produces a warning that is quite explicit: WARNING: Trying to compare values of different types (DependencyHolder, list) using !=. The result of this is undefined and will become a hard error in a future Meson release. `dep_valgrind = []` used to be the recommended way to deal with a non-existent dependency, but such values don't work with `.found()`, so now the recommended way is to declare an impossible dependency, which null_dep does for us in Mesa. In short, we don't need and shouldn't check for `!= []` anywhere anymore. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-14 14:55:36 +01:00
Samuel Pitoiset	ece398277c	radv: remove useless check in radv_create_shaders() radv_can_dump_shader() already handles if module is NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:38:01 +02:00
Samuel Pitoiset	8ade3e4684	radv: allow to dump the GS copy shader with RADV_DEBUG="shaders" Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:38:00 +02:00
Samuel Pitoiset	553418af1e	radv: move {load,store}_var intrinsics scanning in different functions These are going to be crazy and we are probably going to add more scan stuff in the future. Also use switch cases instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:37:58 +02:00
jenny.q.cao	ff7521c9ba	android: change include "cutils/log.h" to "log/log.h" on Android API >=26 Starting with Android 8 (API version 26), including "cutils/log.h" produces a compile warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h" (-W#warnings). Include "log/log.h" on Android 8 and later major versions to avoid this warning. Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-14 08:08:31 +03:00
Roland Scheidegger	cf3fb42fb5	llvmpipe: Fix random number generation for unit tests We were never producing negative numbers for signed types. Also fix only producing half the valid range for uint32, and properly clamp signed values. Because this now also properly tests snorm with actually negative values, need to increase eps for such conversions. I believe these cannot actually be hit in ordinary operation (e.g. if a snorm texture is sampled and output to snorm RT, it will still go through snorm->float and float->snorm conversion), so don't bother to do anything to fix the bad accuracy (might be quite complex). Basically, the issue is for something like snorm16->snorm8 that in the end this will just use a 8 bit arithmetic right shift. But the math behind it says we should actually do a division by 32767 / 127, which is ~258, not 256. So the result can be one bit off (values have too large magnitude), and furthermore, the shift has incorrect rounding (always rounds down). For positive numbers, these errors have different direction, but for negative ones they have the same, hence for some values the error will be 2 bit in the end. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=106232	2018-05-14 03:14:00 +02:00
Dave Airlie	5978d54a09	radv: use compute path for multi-layer images. I don't think the hw resolve path can handle multi-layer images. This fixes all the dEQP-VK.renderpass.multisample_resolve.layers_* tests on my VI card. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:57:54 +10:00
Dave Airlie	98dbaa445a	radv: resolve all layers in compute resolve path. This path should iterate across all layers, I've some ideas for doing this in a single pass, but this is simpler for now. This passes the tests because we don't use the fragment path unless we have DCC, and we don't have DCC on layered images. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:57:27 +10:00
Dave Airlie	b16fc6cda1	radv/resolve: do fmask decompress on all layers. For a multi-layer subpass resolve we want to make sure we flush all the layers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:56:47 +10:00
Rhys Perry	8f6cbb8c7d	nvc0: fix setting of subpixel precision during conservative rasterization Fixes: `07dac3e040` ("nvc0: add conservative rasterization support") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-13 13:21:41 -04:00
Rhys Perry	c879011c72	anv,nir: add generated files to .gitignore(s) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-12 20:14:49 -07:00
Marek Olšák	86d63b53a2	gallium: remove aux_vertex_buffer_slot code The slot index is always 0, and is pretty unlikely to change in the future. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-12 21:08:09 -04:00
Timothy Arceri	ce188813bf	radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR linking optimisations and only run over the NIR optimisation loop once similar to the GLSLOptimizeConservatively constant used by some GL drivers. We need to run over the opts at least once to avoid errors in LLVM (e.g. dead vars it can't handle) and also to reduce the time spent compiling the IR in LLVM. With this change the Blacksmith Unity demos compilation times go from 329760 ms -> 299881 ms when using Wine and DXVK. V2: add bit to radv_pipeline_key Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246	2018-05-13 09:58:33 +10:00
Vinson Lee	26ddc4f9e1	scons: Add PROGRAM_NIR_FILES. Fix SCons build error. Linking build/linux-x86_64-debug/gallium/targets/libgl-xlib/libGL.so.1.5 ... build/linux-x86_64-debug/mesa/libmesa.a(st_program.os): In function `st_translate_prog_to_nir': src/mesa/state_tracker/st_program.c:392: undefined reference to `prog_to_nir' Fixes: `5c33e8c772` ("st/nir: use NIR for asm programs") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2018-05-12 00:50:05 -07:00
Timothy Arceri	5c33e8c772	st/nir: use NIR for asm programs Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-12 14:48:21 +10:00
Timothy Arceri	0b3e9564bd	st/nir: make st_nir_opts() available externally The following patch will make use of this for asm style programs. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-12 14:48:21 +10:00
Boyuan Zhang	0907d3ab9c	radeon/vce: add firmware support for ver 53 and up All vce firmwares with major version greater than or equal to 53 are supported Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-05-11 14:59:00 -04:00
Rob Clark	a7c81a7f67	etnaviv: remove pipe_fence_handle::ctx A fence can outlive the ctx it was created from (see glmark2). etnaviv doesn't actually need fence->ctx, so let's remove it before someone makes the mistake of assuming it is a valid pointer. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-11 18:42:13 +02:00
George Kyriazis	4e52cb51b5	swr/rast: Thread locked tiles improvement - Change tilemgr TILE_ID encoding to use Morton-order (Z-order). - Change locked tiles set to bitset. Makes clear, set, get much faster. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:26:35 -05:00
George Kyriazis	8238c791dc	swr/rast: Add Builder::GetVectorType() Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:25:47 -05:00
George Kyriazis	8cb55dae2e	swr/rast: Prepend the console output with a newline It can get jumbled with output from other threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:25:24 -05:00
George Kyriazis	db25fcfcde	swr/rast: Add ConcatLists() for concatenating lists Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:22:57 -05:00
George Kyriazis	dcaca3c7b3	swr/rast: Add constant initializer for uint64_t Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:22:17 -05:00
George Kyriazis	70f0a28b83	swr/rast: Use binner topology to assemble backend attributes Previously was using the draw topology, which may change if GS or Tess are active. Only affected attributes marked with constant interpolation, which limited the impact. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:21:52 -05:00
George Kyriazis	b3b0f0e0ec	swr/rast: Change formatting Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:21:22 -05:00
Ville Syrjälä	659910eda0	meson: Fix build for egl platform_x11 with dri3 platform_x11 with dri3 needs inc_loader. In file included from ../src/egl/drivers/dri2/platform_x11_dri3.c:35:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory In file included from ../src/egl/drivers/dri2/platform_x11.c:46:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory In file included from ../src/egl/drivers/dri2/egl_dri2.c:61:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2018-05-11 17:41:57 +03:00
Samuel Pitoiset	efc10949cc	radv: move ac_build_if_state on top of radv_nir_to_llvm.c These helpers will be needed for future work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-11 12:35:07 +02:00
Samuel Pitoiset	3a410f0afc	radv: minor cleanups in radv_fill_shader_variant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-11 12:35:05 +02:00
Jan Vesely	58272c1ad7	winsys/amdgpu: Destroy dev_hash table when the last winsys is removed. Fixes memory leak on module unload. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-10 23:23:50 -04:00
Marek Olšák	a2e9d9b4c1	ac/gpu_info: add has_read_registers_query Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:11 -04:00
Marek Olšák	9b1fdfc541	ac/gpu_info: add has_2d_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:10 -04:00
Marek Olšák	d26696283d	ac/gpu_info: add has_sparse_vm_mappings Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:08 -04:00
Marek Olšák	125adc92ad	ac/gpu_info: add has_unaligned_shader_loads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:07 -04:00
Marek Olšák	8b9694da4b	radeonsi: expose ARB_query_buffer_object on ancient kernels too It doesn't use indirect dispatches. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:04 -04:00
Marek Olšák	e9c08bc658	ac/gpu_info: add has_indirect_compute_dispatch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:03 -04:00
Marek Olšák	64265ac8d5	ac/gpu_info: add kernel_flushes_tc_l2_after_ib Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:01 -04:00
Marek Olšák	14c5a93bfa	ac/gpu_info: add has_format_bc1_through_bc7 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:00 -04:00
Marek Olšák	2bd2c173e8	ac/gpu_info: add has_eqaa_surface_allocator Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:58 -04:00
Marek Olšák	e720cb6135	radeonsi: clean up the reset status query implementation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:57 -04:00
Marek Olšák	3060f62340	ac/gpu_info: add has_bo_metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:56 -04:00
Marek Olšák	09f1bab483	ac/gpu_info: add si_TA_CS_BC_BASE_ADDR_allowed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:54 -04:00
Marek Olšák	8b58a14ef7	ac/gpu_info: add htile_cmask_support_1d_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:53 -04:00
Marek Olšák	b81149e258	ac/gpu_info: add kernel_flushes_hdp_before_ib Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:47 -04:00
Marek Olšák	a969f184cf	radeonsi: add an environment variable that forces EQAA for MSAA allocations This is for testing and experiments. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:37 -04:00
Marek Olšák	2309cedf44	radeonsi: set up EQAA image descriptors properly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:36 -04:00
Marek Olšák	7ac4ef097d	radeonsi: add EQAA SC,DB,CB register programming Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:34 -04:00
Marek Olšák	9d00580e75	radeonsi: support creating EQAA color textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:32 -04:00
Marek Olšák	912b0163dc	ac/surface: add EQAA support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:31 -04:00
Marek Olšák	ee31762ef5	radeonsi: use better sample locations for 8x EQAA Verified with the piglit MSAA accuracy test. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:32:57 -04:00
Marek Olšák	4b6df225f7	radeonsi: improve quality of 16 sample locations This results in better 16x and 8x quality when using these locations. Verified with the piglit MSAA accuracy test. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:29:02 -04:00
Marek Olšák	01fd543c82	radeonsi: use better sample locations for 4x MSAA Discovered by luck. Verified with the piglit MSAA accuracy test. It also shows that the worst case EQAA 16s4f results in very good 4x MSAA in the worst case. Nine might not like these positions, but they are prettier to the eye and GL doesn't care. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:28:12 -04:00
Marek Olšák	8d8b71ccfa	radeonsi: reorder sample locations as required by EQAA Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:27:46 -04:00
Marek Olšák	5769a5ec01	radeonsi: simplify si_get_sample_position Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	9f456b3a3c	radeonsi: simplify arrays of sample locations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	3d70b5beae	radeonsi: set DB_EQAA the same as Vulkan These never change, but they only affect EQAA, which isn't implemented. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	b5ed039325	radeonsi: remove CM_ prefixes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	656fd607be	radeonsi: don't update clear color registers if they don't change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	835095973d	radeonsi: remove r600_fmask_info radeon_surf contains almost everything. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	bdc3e410f7	ac/surface: unify common legacy and gfx9 fmask fields Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	9bf3570fed	ac/surface/gfx6: compute FMASK together with the color surface instead of invoking FMASK computation separately. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	276acda835	ac/surface/gfx9: fix a typo in CMASK RB/pipe alignment No change in behavior because it's always aligned. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	6841845b00	ac: set correct LLVM processor names for Raven & Vega12 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	6f7f10d285	ac: sort raster configs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	e7b82a9978	ac: remove 1 RB raster config for Iceland Iceland always reports 2 RBs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	cb0f5cddcc	ac: move the Fiji kernel workaround for raster config out of the switch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	ce954ac6f3	ac: enable both RBs on Kaveri This can result in 2x increase in performance on non-harvested Kaveris. v2: don't do it on radeon Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	597b9e8810	radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM Fixes: `6d19120da8` "radeonsi/gfx9: workaround for INTERP with indirect indexing" Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Jason Ekstrand	b784561c1a	intel/isl/storage: Don't lower most UNORM formats on gen11+ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-10 14:13:24 -07:00
Jason Ekstrand	399962e7c6	intel/isl: Several UNORM formats support typed writes on gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-10 14:12:55 -07:00
Brian Paul	e4211b36bb	mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT Since size can be 3, 4 or GL_BGRA we need to keep these glGet types as TYPE_INT, not TYPE_UBYTE. Fixes: `d07466fe18` ("mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106462 cc: mesa-stable@lists.freedesktop.org Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-05-10 09:49:40 -06:00
Andres Rodriguez	34e9e4023f	radv: disable DCC for shareable images on GFX9+ This seems to be broken at the moment for opengl interop. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 11:27:12 -04:00
Thomas Petazzoni	54bbe600ec	configure.ac: rework -latomic check The configure.ac logic added in commit `2ef7f23820` ("configure: check if -latomic is needed for __atomic_*") makes the assumption that if a 64-bit atomic intrinsic test program fails to link without -latomic, it is because we must use -latomic. Unfortunately, this is not completely correct: libatomic only appeared in gcc 4.8, and therefore gcc versions before that will not have libatomic, and therefore don't provide atomic intrinsics for all architectures. This issue was for example encountered on PowerPC with a gcc 4.7 toolchain, where the build fails with: powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic This commit aims at fixing that, by not assuming -latomic is available. The commit re-organizes the atomic intrinsics detection as follows: (1) Test if a program using 64-bit atomic intrinsics links properly, without -latomic. If this is the case, we have atomic intrinsics, and we're good to go. (2) If (1) has failed, then test to link the same program, but this time with -latomic in LDFLAGS. If this is the case, then we have atomic intrinsics, provided we link with -latomic. This has been tested in three situations: - On x86-64, where atomic intrinsics are all built-in, with no need for libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='' LIBATOMIC_LIBS='' This means: atomic intrinsics are available, and we don't need to link with libatomic. - On NIOS2, where atomic intrinsics are available, but some of them (64-bit ones) require using libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='' LIBATOMIC_LIBS='-latomic' This means: atomic intrinsics are available, and we need to link with libatomic. - On PowerPC with an old gcc 4.7 toolchain, where 32-bit atomic intrinsics are available, but not 64-bit atomic intrinsics, and there is no libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='#' This means that atomic intrinsics are not usable. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>	2018-05-10 08:13:57 -07:00
Brian Paul	d07466fe18	mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs The vertex array Size and Stride attributes are now ubyte and short, respectively. The glGet code needed to be updated to handle those types, but wasn't. Fixes the new piglit test gl-1.5-get-array-attribs. v2: fix inadvertent whitespace change, change COLOR_ARRAY_SIZE to UBYTE, misc fixes suggested by Justin Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106450 Fixes: `d5f42f96e1` ("mesa: shrink size of gl_array_attributes (v2)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-05-10 08:08:11 -06:00
Jan Vesely	45dfa6f4e7	winsys/radeon: Destroy fd_hash table when the last winsys is removed. Fixes memory leak on module unload. v2: Use util_hash_table helper function CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-10 05:12:48 -04:00
Jan Vesely	d146768d13	gallium/auxiliary: Add helper function to count the number of entries in hash table CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-10 05:12:43 -04:00
Samuel Pitoiset	0defc55547	radv: move handling nosisched option in a better place It's a per-application optimization, so it makes more sense to do that in radv_handle_per_app_options(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-10 10:57:41 +02:00
Grazvydas Ignotas	4fdce205dd	radv: assorted typo fixes Trivial. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 11:50:46 +03:00
Mathias Fröhlich	f660683027	mesa/vbo/tnl: Move gl_vertex_array related stuff to tnl. The only remaining users of gl_vertex_array are tnl based drivers. So move everything related to that into tnl and rename it accordingly. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	881d2fcafa	mesa: Remove Array._DrawArrays. Only tnl based drivers still use this array. So remove it from core mesa and use Array._DrawVAO instead. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	899476b6b1	i965: Remove the now unused gl_vertex_array. Was meant to be temporary in i965. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	0fabd55306	i965: Remove the gl_vertex_array indirection. For now store binding and attrib in brw_vertex_element. The i965 driver still provides lots of opportunity to make use of the unique binding information in the VAO which is currently not taken from the VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	172c9a908f	i965: Implement all_varyings_in_vbos in terms of Array._DrawVAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	79eb6ab7b6	st/mesa: Remove the now unused gl_vertex_array. Was meant to be temporary in gallium. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	4c77f0d065	st/mesa: Make feedback draw and rasterpos use _DrawVAO. Instead of playing with Array._DrawArrays, make the feedback draw path use Array._DrawVAO. Also st_RasterPos needs to use the VAO then. v2: Use helper methods to get the offset values for array and binding. Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	19a91841c3	st/mesa: Use Array._DrawVAO in st_atom_array.c. Finally make use of the binding information in the VAO when setting up arrays for draw. v2: Emit less relocations also for interleaved userspace arrays. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	9987a072cb	st/mesa: Make the input_to_index array available. The input_to_index array is already available internally when preparing vertex programs. Store the map in struct st_vertex_program. Also store the bitmask of mesa vertex processing inputs in struct st_vp_variant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	f24bf45210	st/mesa: Use _DrawVAO for edgeflag enabled check. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	d1698d4311	mesa: Compute effective buffer bindings in the vao. Compute VAO buffer binding information past the position/generic0 mapping. Scan for duplicate buffer bindings and collapse them into derived effective buffer binding index and effective attribute mask variables. Provide a set of helper functions to access the distilled information in the VAO. All of them prefixed with _mesa_draw_... to indicate that they are meant to query draw information. v2: Also group user space arrays containing interleaved arrays. Add _Eff*Offset to be copied on attribute and binding copy. Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Gert Wollny	fb4011ace9	virgl: Add support for passing GL_ANY_SAMPLES_PASSED_CONSERVATIVE This is needed for fixing CTS: dEQP-GLES3.functional.occlusion_query.conservative* Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-05-10 12:26:57 +10:00
Dave Airlie	ce027ac5c7	r600: fix constant buffer bounds. On r600/eg, an indirect access to a constant buffer uses a vertex fetch in the shader. However, apps expect certain behaviour on out-of-bounds accesses (even if illegal). If the constants were being uploaded as part of a larger upload buffer, we'd set the range of allowed access to a lot larger than required, so apps would get values back from other parts of the upload buffer instead of the expected out-of-bounds access. This fixes rendering bugs in Trine and Witcher 1, thanks to iive for nagging me effectively until I figured it out :-) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-10 02:14:32 +01:00
Jason Ekstrand	a8a740f272	i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL From the bspec docs for "Indirect State Pointers Disable": "At the completion of the post-sync operation associated with this pipe control packet, the indirect state pointers in the hardware are considered invalid" So the ISP disable is a post-sync type of operation which means that it should be combined with a CS stall. Without this, the simulator throws an error. Fixes: `766d801ca` "anv: emit pixel scoreboard stall before ISP disable" Fixes: `f536097f6` "i965: require pixel scoreboard stall prior to ISP disable" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-09 18:03:28 -07:00
Dave Airlie	56766b8515	radv: handle arrays in the fmask descriptor. This fixes the fmask descriptor generation to handle 2d ms arrays. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 10:42:49 +10:00
Matt Turner	0f959215c3	gallium/tests: Fix assignment of EXTRA_DIST Fixes: `6754c2e83d` ("autotools: Include new meson files")	2018-05-09 16:38:47 -07:00
Matt Turner	0097940223	configure.ac: Check for grep with AC_PROG_GREP Perhaps with a new version of autoconf, I began seeing: | checking the name lister (/usr/bin/nm -B) interface... ./configure: line 6973: External.some_variable: command not found | BSD nm This is because AC_PROG_NM expands to ... if $GREP 'External.some_variable' conftest.out > /dev/null; then lt_cv_nm_interface="MS dumpbin" fi ... I'm not sure if it's a bug in AC_PROG_NM that it doesn't call AC_PROG_GREP, but it's easy enough for us to do it.	2018-05-09 16:38:47 -07:00
Xiong, James	0ab266dc1b	main: fail texture_storage() call if the size is not okay Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 09:34:31 +10:00
Xiong, James	08c1444c95	main: return 0 length when the queried program object's not linked Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-10 09:34:19 +10:00
Kenneth Graunke	a83face48a	i965: Shut up unused variable warnings. These are only used in assertions.	2018-05-09 16:20:50 -07:00
Ross Burton	1755654d9f	src/intel/Makefile.vulkan.am: add missing MKDIR_GEN Out of tree builds can try to write into a directory that doesn't exist yet: \| Traceback (most recent call last): \| File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module> \| with open(args.out, 'w') as f: \| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json' \| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed Add missing MKDIR_GEN calls to solve this. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-09 16:08:52 -07:00
Rhys Perry	5ac16ed047	mesa: fix error handling in get_framebuffer_parameteriv CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-09 14:32:40 -07:00
Lionel Landwerlin	766d801ca3	anv: emit pixel scoreboard stall before ISP disable We want to make sure that all indirect state data has been loaded into the EUs before disabling the pointers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Fixes: `78c125af39` ("anv/gen10: Ignore push constant packets during context restore.")	2018-05-09 20:11:57 +01:00
Lionel Landwerlin	f536097f67	i965: require pixel scoreboard stall prior to ISP disable Invalidating the indirect state pointers might affect a previously scheduled & still running 3DPRIMITIVE (causing page fault). So stall on pixel scoreboard before that. v2: Fix compile issue :( v3: Stall on pixel scoreboard v4: Drop the post sync operation (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Fixes: `ca19ee33d7` ("i965/gen10: Ignore push constant packets during context restore.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243	2018-05-09 20:11:51 +01:00
Jason Ekstrand	561348caa1	intel/isl: Allow CCS_E on 1010102 formats On CNL and above, CCS_E supports 1010102 formats and R11G11B10F. We had shut them off during early enabling because blorp_copy couldn't handle them. Now it can handle 1010102 formats so we can turn them back on. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	ccb44b8a94	intel/blorp: Allow CCS copies of 1010102 formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	1978de66f7	intel/blorp: Add support for more format bitcasting nir_format_bitcast_uint_vec_unmasked can only be used to cast between formats with uniform channel sizes. In particular, it cannot handle 10_10_10_2 formats. By making use of the NIR helper for uint vector casts, we should now be able to bitcast between any two uint formats so long as their channels are in RGBA order (possibly with channels missing). In order to do this we need to rework the key a bit to pass the actual formats instead of just the number of bits in each. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	7998fe268e	intel/blorp: Use nir_format_bitcast_uint_vec_unmasked Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	047e68389f	nir/format_convert: Add code for bitcasting vectors This is a fairly direct port from blorp. The only real change is that the nir_format_convert version doesn't assume that everything is a vec4. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	a6b66a7b26	intel/blorp: Use ISL instead of bitcast_color_value_to_uint Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	09ced65420	intel/isl: Add format conversion code This adds helpers to ISL to convert an isl_color_value to and from binary data encoded with a given isl_format. The conversion is done using ISL's built-in format introspection so it's fairly slow as format conversions go but it should be fine for a single pixel value. In particular, we can use this to convert clear colors. As a side-effect, we now rely on the sRGB helpers in libmesautil so we need to tweak the build system a bit. All prior uses of src/util in ISL were header-only. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8152c60e01	intel/isl/format: Get rid of the ALPHA colorspace Alpha-only formats are just linear. There's no need to specially deliminate them as being in their own colorspace. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8ab73790ef	intel/isl/format: Add field locations informations to channel_layout Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	96598fbc02	intel/isl/format: Add a column for channel order to the table Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	d08d6a3da8	i965/blorp: Remove a pile of blorp_blit restrictions Previously, blorp could only blit into something that was renderable. Thanks to recent additions to blorp, it can now blit into basically anything so long as it isn't compressed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	465d8566cd	i965/blorp: Allow blorp blits for 16x MSAA BLORP has supported 16x MSAA for quite a while now, we just never bothered to enable it for CopyTexSubImage. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	09eede9c9d	anv: Allow blitting to/from any supported format Now that blorp handles all the cases, why not? The only real change we have to make is to stop using anv_swizzle_for_render() in blorp_blit because it doesn't work for B4G4R4A4 and blorp now natively handles that. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8ce31c9cc5	intel/blorp: Support the RGB workaround on more formats Previously we only supported UINT formats because that's what blorp_copy required. If we want to use it in blorp_blit, however, we need to support everything. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	4e26e3dea9	intel/blorp: Silently convert RGBX destination formats to RGBA Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	08cd834996	intel/isl: Add some helpers for working with RGBX formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	804856fa57	intel/blorp: Handle more exotic destination formats This commit adds support for the following formats as destination formats even though the hardware does not support rendering to them: - ISL_FORMAT_R24_UNORM_X8_TYPELESS - ISL_FORMAT_A4B4G4R4_UNORM - ISL_FORMAT_L8_UNORM_SRGB - ISL_FORMAT_R9G9B9E5_SHAREDEXP This is done by using a different format and emitting shader code to fake it the rest of the way. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	9e492bb92e	intel/blorp: Include nir_format_convert.h in blorp_blit.c nir_mask_shift_or is now defined in nir_format_convert.h so we can delete the copy in blorp_blit.c. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	9981709d8f	nir/format_convert: Add a function to pack RGB9_E5 formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	4e337b42f9	nir/format_convert: Add pack/unpack for R11F_G11F_B10F Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	98156b0019	nir/format_convert: Add linear <-> sRGB helpers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	2fdd966e3d	nir: Add the start of a format conversion helper header Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	906c32ce87	intel/blorp: Add swizzle support for all hardware This commit makes blorp capable of swizzling anything even on hardware that doesn't support texture swizzle. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	1ef4f5aff1	intel/isl: Add a helper for inverting swizzles Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	242f6f7492	intel/isl: Add a helper for composing swizzles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	dad67cc245	intel/isl: Add an isl_swizzle_supports_rendering helper This helper encodes more details, specifically about Haswell, than the previous asserts in isl_surface_state.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	23d703de1f	i965/surface_state: Use an identity swizzle pre-Haswell Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	293b8de161	blorp: Handle the RGB workaround more like other workarounds The previous version was sort-of strapped on in that it just adjusted the blit rectangle and trusted in the fact that we would use texelFetch and round to the nearest integer to ensure that the component positions matched. This new version, while slightly more complicated, is more accurate because all three components end up with exactly the same dst_pos and so they will get interpolated and sampled at the same texture coordinate. This makes the workaround suitable for using with scaled blits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Lionel Landwerlin	3853f1c6f4	i965: silence unused variable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2dc29e095f` ("i965: Don't leak blorp on Gen4-5.") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-05-09 18:12:10 +01:00
Lionel Landwerlin	11d36c373a	intel: devinfo: silence coverity warning It's just not possible to have a device with no subslices. CID: 1433511 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-05-09 15:21:01 +01:00
Michel Dänzer	6f81e07ecb	dri3: Only update number of back buffers in loader_dri3_get_buffers And only free no longer needed back buffers there as well. We want to stick to the same back buffer throughout a frame, otherwise we can run into various issues. Bugzilla: https://bugs.freedesktop.org/105906 Bugzilla: https://bugs.freedesktop.org/106399 Fixes: `3160cb86aa` "egl/x11: Re-allocate buffers if format is suboptimal" Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-05-09 15:40:41 +02:00
Samuel Iglesias Gonsálvez	2cf64fdb46	anv: ignore pColorBlendState if all color attachments of the subpass are unused According to Vulkan spec: "pColorBlendState is a pointer to an instance of the VkPipelineColorBlendStateCreateInfo structure, and is ignored if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments." Fixes tests from CL#2505: dEQP-VK.renderpass.*.simple.color_unused_omit_blend_state v2: - Check that blend is not NULL before usage. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-09 07:01:10 +02:00
Timothy Arceri	e7a7b712fe	mesa: remove hard-coded OpenGL 3.2 compat limit Just let validate_context_version() do it instead. This fixes MESA_GL_VERSION_OVERRIDE for compat, it will also allow us to enable new compat versions on a per-driver basis in future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:43 +10:00
Timothy Arceri	4560aad780	mesa: add GLSLVersionCompat constant This allows drivers to define what version of GLSL they support in compat. This will be needed in order to support compat 3.2 without breaking drivers that won't support it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:36 +10:00
Timothy Arceri	be3ee9d141	mesa: dont call _mesa_override_glsl_version() in _mesa_init_constants() All drivers that support GLSL will later set their default GLSL versions overriding this override call. They currently all call _mesa_override_glsl_version() again later in order to support overrides. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:29 +10:00
Timothy Arceri	2a621acc8d	mesa: dont set GLSLVersion in _mesa_init_constants() Just leave it as 0 and let the drivers set it (as they already do) to avoid redundantly initialising it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:22 +10:00
Jan Vesely	0783399d79	pipe-loader: Free driver_name in error path CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 21:35:07 -04:00
Brian Paul	901db25d5b	glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug Change the size of the bitset from 128 bits to 96. This works around an apparent GCC 5.4 bug in which bad SSE code is generated, leading to a crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654). This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/ varying-struct-basic-gs-fs.shader_test Bugzilla:https://bugs.freedesktop.org/show_bug.cgi?id=105497 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Charmaine Lee <charmainel@vmware.com> Tested-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-08 19:06:09 -06:00
Kenneth Graunke	20f06bc72b	i965: Dump validation list on INTEL_DEBUG=bat,submit. This is really useful when debugging any sort of buffer management issues, so just printing it during INTEL_DEBUG=bat,submit seems reasonable. With bat, we're already spamming so much output that it doesn't really hurt. With submit, it's still easy to grep for the older information, and the new information is nice too. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-08 10:08:16 -07:00
Jason Ekstrand	06d3841882	i965/miptree: Remove redundant fields from intel_miptree_aux_buffer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:46 -07:00
Jason Ekstrand	4f4779b367	i965: Simplify brw_emit_depthbuffer and brw_emit_depth_stencil_hiz Now that we're using ISL, a good chunk of brw_emit_depthstencil is pointless checks which ISL will do for us anyway. Since we only have one manual depth buffer emit function, move the useful bits into it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:45 -07:00
Jason Ekstrand	96f01501d7	i965: Move brw_emit_depth_stencil_hiz higher up in the file Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:45 -07:00
Jason Ekstrand	bdbb527a65	i965: Use ISL for emitting depth/stencil/hiz state on gen6+ We leave gen4-5 alone because the ISL code hasn't really been well- tested on gen4-5 or with combined depth-stencil because we don't use BLORP for depth operations on gen4-5. Also, the gen4-5 code has to deal with intratile offsets for LOD hacks and ISL doesn't handle those yet. We could make ISL handle gen4-5 capable or we could just not bother. Among other things, this should make future platform enabling easier because it means we don't have to update multiple (or hand-rolled!) depth stencil emit paths. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:44 -07:00
Jason Ekstrand	ccd3dce3c0	i965: Use the brw_depthbuffer atom on all gens The only reason why we had two atoms was that the one we used for gen7+ depended on _NEW_DEPTH and _NEW_STENCIL as well as _NEW_BUFFERS. Since this is no longer true, we can combine them into one atom. We do add a dependence on BRW_NEW_AUX_STATE but that should never get set on gen4-5 so adding it is a no-op for those platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:44 -07:00
Jason Ekstrand	514bb6f41e	i965: Always set depth/stencil write enables on gen7+ The hardware will AND these fields with the corresponding fields in DEPTH_STENCIL_STATE so there's no real reason to toggle them on and off based on state bits. This removes our reliance on the _NEW_DEPTH and _NEW_STENCIL state bits and better matches what ISL does. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:43 -07:00
Jason Ekstrand	c4d00da7b7	i965: Re-order depth/stencil/hiz/clear packets to match ISL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:42 -07:00
Jason Ekstrand	6fc3404911	i965: Re-emit depth/stencil/hiz on BRW_NEW_AUX_STATE Certain things can change the aux usage or fast clear color of a depth surface and we want to re-emit if that happens. For instance, if you do a fast depth clear of an already clear depth surface, we will just set the clear color and not do anything else. In that case, we could fail to re-emit 3DSTATE_CLEAR_PARAMS and not get the new fast-clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:23:55 -07:00
Lionel Landwerlin	3cdf1bf97d	intel: devinfo: fix assertion on devices with odd number of EUs I forgot to change the assert in the second helper function in a previous change. This hit the assert() on a Broadwell platform with 1 slice, 3 subslices but all EUs disabled in subslice 1 & 2. Fixes: `c1900f5b0f` ("intel: devinfo: add helper functions to fill fusing masks values") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 15:15:54 +01:00
Bas Nieuwenhuizen	b17cfb08a3	vulkan/wsi: Only use LINEAR modifier for prime if supported. This was setting the LINEAR modifier if neither the X server nor the driver supported modifiers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106180 Fixes: `c80c08e226` "vulkan/wsi/x11: Add support for DRI3 v1.2" CC: 18.1 <mesa-stable@lists.freedesktop.org> Tested-by: Abel Garcia Dorta <mercuriete@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-08 15:47:16 +02:00
Jan Vesely	a9e4be9212	eg/compute: Drop reference to kernel_param bo in destructor CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 09:02:38 -04:00
Jan Vesely	a1e8fcce3e	r600: Cleanup constant buffers on context destruction CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 09:02:30 -04:00
Alejandro Piñeiro	b6648798cf	mesa/formatquery: remove online compression check on is_resource_supported is_resource_supported returns if the combination of target/internalformat is supported in at least one operation. Online compression is only mandatory for glTexImage2D. Some formats don't support online compression, but can be used in any case, with glCompressed*D methods. Without this commit, ETC2 internalformats were returning FALSE, even for the drivers supporting it. So any other query (like TEXTURE_COMPRESSED) was returning FALSE/NONE instead of the proper value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 08:19:38 +02:00
Kenneth Graunke	e6fb8196ce	intel/genxml: Assert that genxml field start and ends are sane. Chris recently fixed a bunch of genxml end < start bugs, as well as booleans that are wider than a bit. These are way too easy to write, so asserting that the fields are sane is a good plan. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	f83fd929b7	intel/genxml: Fix some more fake booleans in genxml. None of these are actually booleans. Tile Parameter is a tiling mode enum. Display pipes take plane numbers. Predicate Enable has some operations (and the default value of 6 was particularly bogus). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	33906eeaca	intel/genxml: Make assert in gen_pack_header print a message. Python's assert can take both a condition and a string, which will cause it to print the string if the assertion trips. (You can't use parens as that creates a tuple.) Doing "condition and string" works in C, but doesn't have the desired effect in Python. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	2dc29e095f	i965: Don't leak blorp on Gen4-5. We used to only initialize BLORP on Gen6+. When we added it on Gen4-5, we forgot to destroy it unconditionally. Fixes: `752d7af77a` (i965: Add blorp support for gen4-5) Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-07 23:05:59 -07:00
Matt Turner	ed5af94373	nir: Transform discard_if(true) into discard Noticed while reviewing Tim Arceri's NIR inlining series. Without his series: instructions in affected programs: 16 -> 14 (-12.50%) helped: 2 With his series: instructions in affected programs: 196 -> 174 (-11.22%) helped: 22 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 13:50:23 -07:00
Jan Vesely	ea1fff4416	eg/compute: Drop reference on code_bo in destructor. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-07 15:04:03 -04:00
Nicolas Boichat	54ba73ef10	configure.ac/meson.build: Fix -latomic test When compiling with LLVM 6.0 on x86 (32-bit) for Android, the test fails to detect that -latomic is actually required, as the atomic call is inlined. In the code itself (src/util/disk_cache.c), we see this pattern: p_atomic_add(cache->size, - (uint64_t)size); where cache->size is an uint64_t *, and results in the following link time error without -latomic: src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8' Fix the configure/meson test to replicate this pattern, which then correctly realizes the need for -latomic. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>	2018-05-07 10:14:53 -07:00
Scott D Phillips	8b519075ea	anv: remove unused field anv_queue::pool The last use of the field was removed in 2015's ("48a87f4ba06 anv/queue: Get rid of the serial") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 09:03:46 -07:00
Kenneth Graunke	0b1cfd01ff	i965: Set initial kflags on BO creation. This simplifies kflag initialization, by creating a bufmgr-wide setting for initial kflags, and just applying it whenever we create a new BO. This also properly allows 48-bit addresses for imported BOs (via prime or flink), which I had missed in my earlier 48-bit support series. This will be useful when adding softpin support, as we'll want to add EXEC_OBJECT_PINNED to initial_kflags as well. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-05-07 08:47:21 -07:00
Juan A. Suarez Romero	7ee54fc33d	docs: update calendar, add news and link release notes to 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-05-07 11:25:54 +00:00
Juan A. Suarez Romero	78e103da8b	docs: add sha256 checksums for 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ae12c5e990`)	2018-05-07 11:19:36 +00:00
Juan A. Suarez Romero	6c06d4e17b	docs: add sha256 checksums for 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `6dc2658fd6`)	2018-05-07 11:19:34 +00:00
Chris Wilson	cf440d85db	intel/genxml: Fix a few invalid field widths A couple of typos found by inspecting field.end - field.start, revealed a few wide integers declared as bool and some that ended before they started. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-07 11:34:13 +01:00
Vinson Lee	cd5319a64f	swr/rast: Fix include for createInstructionCombiningPass with llvm-7.0. Fix build error after llvm-7.0.0svn r330669 ("InstCombine: Fix layering by not including Scalar.h in InstCombine"). CXX rasterizer/jitter/libmesaswr_la-blend_jit.lo rasterizer/jitter/blend_jit.cpp:816:20: error: use of undeclared identifier 'createInstructionCombiningPass'; did you mean 'createInstructionSimplifierPass'? passes.add(createInstructionCombiningPass()); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ createInstructionSimplifierPass Suggested-by: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-05 13:20:53 -07:00
Jan Vesely	2f1ad72ac1	clover: Add explicit virtual destructor to argument class It is needed to destroy the v vector in scalar_argument Fixes memory leaks on parameter set/bind. v2: Drop redundant scalar_argument destructor Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-05-05 13:17:08 -04:00
Iago Toral Quiroga	e4c667b9e8	anv/device: expose shaderInt16 support in gen8+ This rolls back the revert of this patch introduced with commit `7cf284f18e`. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:41:14 +02:00
Iago Toral Quiroga	5a12bdac09	i965/compiler: handle conversion to smaller type in the lowering pass for that This rolls back the revert of this same patch introduced in commit `7b9c15628a`. And also squashes the following patch to prevent a piglit regression caused by this change: intel/compiler: Fix lower_conversions for 8-bit types. Author: Jose Maria Casanova Crespo <jmcasanova@igalia.com> For 8-bit types the execution type is word. A byte raw MOV has 16-bit execution type and 8-bit destination and it shouldn't be considered a conversion case. So there is no need to change alignment and enter in lower_conversions for these instructions. Fixes a regression in the piglit test "glsl-fs-shader-stencil-export" that is introduced with this patch from the Vulkan shaderInt16 series: 'i965/compiler: handle conversion to smaller type in the lowering pass for that'. The problem is caused because there is already a case in the driver that injects Byte instructions like this: mov(8) g127<1>UB g2<32,8,4>UB And the aforementioned pass was not accounting for the special handling of the execution size of Byte instructions. This patch fixes this. v2: (Jason Ekstrand) - Simplify is_byte_raw_mov, include reference to PRM and not consider B <-> UB conversions as raw movs. v3: (Matt Turner) - Indentation style fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:41:02 +02:00
Iago Toral Quiroga	a75f967388	intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms These are subject to the general restriction that anything that is converted to 64-bit needs to be aligned to 64-bit. We had this already in place for 32-bit to 64-bit conversions, so this patch generalizes the implementation to take effect on any conversion to 64-bit from a source smaller than 64-bit. Fixes assembly validation errors in the following CTS tests in BSW: dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_int64 dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint16_to_uint64 dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_uint64 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:26:37 +02:00
Caio Marcelo de Oliveira Filho	9d1ff2261c	intel/genxml: recognize 0x, 0o and 0b when setting default value Remove the need of converting values that are documented in hexadecimal. This patch would allow writing <field name="3D Command Sub Opcode" ... default="0x1B"/> instead of <field name="3D Command Sub Opcode" ... default="27"/> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-04 23:58:10 +01:00
Ian Romanick	9a10a2fd5f	r200: Enable NV_fog_distance With the previous fixes in place, it appears to just work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:29:30 -07:00
Ian Romanick	9d0bf720ed	i965: Enable NV_fog_distance With the previous fixes in place, it appears to just work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:29:28 -07:00
Ian Romanick	df80ffa4aa	ffvertex: Don't try to read output registers in fog calculation Gallium drivers use _mesa_remove_output_reads() via st_program to lower output reads away. It seems better to just generate the right thing in the first place. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:50 -07:00
Ian Romanick	f2db3be620	mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV) Found by inspection, so I made a piglit test too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:44 -07:00
Ian Romanick	d350276b03	mesa: Silence an unused parameter warning main/framebuffer.c: In function ‘update_color_draw_buffers’: main/framebuffer.c:629:46: warning: unused parameter ‘ctx’ [-Wunused-parameter] update_color_draw_buffers(struct gl_context *ctx, struct gl_framebuffer *fb) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:40 -07:00
Gert Wollny	e695a35f40	mesa/main/readpix: Correct handling of packed floating point values Make sure that clamping in the pixel transfer operations is enabled/disabled for packed floating point values just like it is done for single normal and half precision floating point values. This fixes a series of CTS tests with virgl that use r11f_g11f_b10f buffers as target, and where virglrenderer reads these surfaces back using the format GL_UNSIGNED_INT_10F_11F_11F_REV. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-04 10:47:46 -07:00
Scott D Phillips	5c075b0855	util/set: add a set_clear function Clear a set back to the state of having zero entries. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-04 10:13:33 -07:00
Tapani Pälli	affe63b1da	egl: add EGL_BAD_MATCH error case for surfaceless and android Just like is done for other backends when suitable config is not found (added in `fd4eba4929`). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-05-04 14:04:03 +03:00
Nicolai Hähnle	c0acb596f4	amd/common: use llvm.amdgcn.wqm for explicit derivatives To comply with an upcoming change in LLVM, see https://reviews.llvm.org/D46051 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-04 11:02:48 +02:00
Rhys Perry	b30949a9c2	nv50/ir: fix printing of pixld Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-03 22:57:46 -04:00
Drew Davenport	4373dd3215	st/va: Support YUV formats in vaCreateSurfaces Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2018-05-03 15:48:35 -07:00
Mark Janes	7cf284f18e	Revert "anv/device: expose shaderInt16 support in gen8+" This reverts commit `0ba0ac815e`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-03 15:26:59 -07:00
Mark Janes	7b9c15628a	Revert "i965/compiler: handle conversion to smaller type in the lowering pass for that" This reverts commit `96b5153790`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-03 15:26:59 -07:00
Vinson Lee	589622a2fe	swr/rast: Fix WriteBitcodeToFile usage with llvm-7.0. Fix build error after llvm-7.0svn r325155 ("Pass a reference to a module to the bitcode writer."). CXX rasterizer/jitter/libmesaswr_la-JitManager.lo rasterizer/jitter/JitManager.cpp:548:30: error: reference to type 'const llvm::Module' could not bind to an lvalue of type 'const llvm::Module *' llvm::WriteBitcodeToFile(M, bitcodeStream); ^ Suggested-by: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-03 14:06:09 -07:00
Deepak Rawat	9a21c96126	egl/x11: Send invalidate to driver on copy_region path in swap_buffer Similar to the swap_available path, send invalidate to the driver because egl/X11 is not watching for the server's invalidate events. The dri2_copy_region path is triggered when the server supports DRI2 version minor 1. Tested with piglit egl tests for regression. V2: Move invalidate from dri2_copy_region to swap_buffer common. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Deepak Rawat <drawat@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2018-05-03 13:55:58 +02:00
Juan A. Suarez Romero	fd4eba4929	egl: check if colorspace/surface type is supported According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering Surfaces"), if config does not support the colorspace or alpha format attributes specified in attrib_list (as defined for eglCreateWindowSurface), an EGL_BAD_MATCH error is generated. This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still not merged, https://android-review.googlesource.com/c/platform/external/deqp/+/667322), which is crashing when trying to create a windows surface with RGB888 configuration and sRGB colorspace. v2: Handle the fix in other backends (Tapani) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-03 12:26:12 +02:00
Iago Toral Quiroga	0ba0ac815e	anv/device: expose shaderInt16 support in gen8+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	002cb6f2b3	anv/pipeline: support SpvCapabilityInt16 in gen8+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	f07c05576f	compiler/spirv: add implementation to check for SpvCapabilityInt16 support Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	dd41630d9a	intel/compiler: implement 16-bit pack/unpack opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	1dacb56279	compiler/spirv: implement 16-bit bitcasts Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	2d648e5ba3	compiler/lower_64bit_packing: rename the pass to be more generic It can do 32-bit packing too now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	d2564af842	nir/lower_64bit_packing: extend the pass to handle packing from / to 16-bit. With 16-bit support we can now do 32-bit packing, a follow-up patch will rename the pass to something more generic. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	c9653cc14c	nir: add opcodes for 16-bit packing and unpacking Notice that we don't need 'split' versions of the 64-bit to / from 16-bit opcodes which we require during pack lowering to implement these operations. This is because these operations can be expressed as a collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit operations, so we don't need new opcodes specifically for them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	6318808a05	intel/compiler: fix 16-bit comparisons NIR assumes that booleans are always 32-bit, but Intel hardware produces 16-bit booleans for 16-bit comparisons. This means that we need to convert the 16-bit result to 32-bit. In the future we want to add an optimization pass to clean this up and hopefully remove the conversions. v2 (Jason): use the type of the source for the temporary and use brw_reg_type_from_bit_size for the conversion to 32-bit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	b11e9425df	intel/compiler: lower some 16-bit integer operations to 32-bit These are not supported in hardware for 16-bit integers. We do the lowering pass after the optimization loop to ensure that we lower ALU operations injected by algebraic optimizations too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	b9a3d8c23e	compiler/nir: add a lowering pass to convert the bit size of ALU operations Not all bit-sizes may be supported natively in hardware for all operations. This pass allows drivers to lower such operations to a bit-size that is actually supported and then converts the result back to the original bit-size. Compiler backends control which operations and which bit-sizes require the lowering through a callback function. v2: generalize this pass and make it available in NIR core (Rob, Jason) v3: remove some temporaries and reduce nesting in instruction loop using a continue statement (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	f575277f7e	intel/compiler: support negate and abs of half float immediates Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	f0e6dacee5	intel/compiler: fix brw_imm_w for negative 16-bit integers 16-bit immediates need to replicate the 16-bit immediate value in both words of the 32-bit value. This needs to be careful to avoid sign-extension, which the previous implementation was not handling properly. For example, with the previous implementation, storing the value -3 would generate imm.d = 0xfffffffd due to signed integer sign extension, which is not correct. Instead, we should cast to uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd. We only had a couple of cases hitting this path in the driver until now, one with value -1, which would work since all bits are one in this case, and another with value -2 in brw_clip_tri(), which would hit the aforementioned issue (this case only affects gen4 although we are not aware of whether this was causing an actual bug somewhere). v2: Make explicit uint32_t casting for left shift (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>	2018-05-03 11:40:25 +02:00
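The sign-extension problem described in f0e6dacee5 can be reproduced in isolation. The following is a minimal sketch, not the Mesa code: `replicate_w` and `replicate_w_sign_extended` are hypothetical helper names, and the second function only illustrates the failure mode the commit message describes, assuming the value was widened before the two words were combined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa's brw_imm_w(): replicate a signed 16-bit
 * value into both words of a 32-bit immediate. */
static uint32_t replicate_w(int16_t w)
{
    /* Cast to uint16_t first so that -3 becomes 0xfffd, not 0xfffffffd. */
    uint32_t lo = (uint16_t)w;
    return lo | (lo << 16);
}

/* Illustration of the failure mode only (an assumption, not the old Mesa
 * code): the value is sign-extended to 32 bits before being combined. */
static uint32_t replicate_w_sign_extended(int16_t w)
{
    return (uint32_t)w | ((uint32_t)w << 16);
}

/* Usage: the values quoted in the commit message. */
static void example_usage(void)
{
    assert(replicate_w(-3) == 0xfffdfffdu);               /* correct */
    assert(replicate_w_sign_extended(-3) == 0xfffffffdu); /* wrong   */
    assert(replicate_w(-1) == 0xffffffffu);               /* -1 works either way */
}
```

The value -1 hides the bug because all bits are set in both forms, which matches the observation in the commit message that only the -2 case was affected.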
Jose Maria Casanova Crespo	2a76f03c90	intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate From Intel Skylake PRM, vol 07, "Immediate" section (page 768): "For a word, unsigned word, or half-float immediate data, software must replicate the same 16-bit immediate value to both the lower word and the high word of the 32-bit immediate field in a GEN instruction." This fixes the int16/uint16 negate and abs immediates that weren't taking into account the replication in lower and upper words. v2: Integer cases are different to Float cases. (Jason Ekstrand) Included reference to PRM (Jose Maria Casanova) v3: Make explicit uint32_t casting for left shift (Jason Ekstrand) Split half float implementation. (Jason Ekstrand) Fix brw_abs_immediate (Jose Maria Casanova) Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	e5fc3c0717	intel/compiler: implement nir_instr_type_load_const for 16-bit constants Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	939501c8ed	intel/compiler: implement conversions from 16-bit int/float to bool Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	d5a419176f	intel/compiler: implement conversion between float/int 16-bit types Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	96b5153790	i965/compiler: handle conversion to smaller type in the lowering pass for that The lowering pass was specialized to act on 64-bit to 32-bit conversions only, but the implementation is valid for other cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	5361a87ee7	intel/compiler: fix isign for 16-bit integers We need to use 16-bit constants with 16-bit instructions, otherwise we get the following validation error: "Destination stride must be equal to the ratio of the sizes of the execution data type to the destination type" Because the execution data type is 4B due to the 32-bit integer constant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Chris Wilson	b5e266765a	i965: Always try to create a logical context Always enable use of HW logical contexts to preserve GPU state between batches when the kernel supports such constructs, continuing to enforce the required support for gen6+. At runtime, this effectively removes the BRW_NEW_CONTEXT flag (and the upload of invariant state) from the start of every batch for any kernel supporting contexts. So long as the older atoms are correctly listening to the right flag (NEW_CONTEXT rather than NEW_BATCH) this should eliminate a few redundant state uploads for the older platforms. No piglits were harmed on ctg and ilk, both with and without logical contexts. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-03 01:39:33 -07:00
Neil Roberts	e17d0ccbbd	spirv: Apply OriginUpperLeft to FragCoord This behaviour was changed in `1e5b09f42f`. The commit message for that says it is just a “tidy up” so my assumption is that the behaviour change was a mistake. It’s a little hard to decipher looking at the diff, but the previous code before that patch was: if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition) nir_var->data.origin_upper_left = b->origin_upper_left; if (builtin == SpvBuiltInFragCoord) nir_var->data.pixel_center_integer = b->pixel_center_integer; After the patch the code was: case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; /* fallthrough */ case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; break; Before the patch origin_upper_left affected both builtins and pixel_center_integer only affected FragCoord. After the patch origin_upper_left only affects SamplePosition and pixel_center_integer affects both variables. This patch tries to restore the previous behaviour by changing the code to: case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; /* fallthrough */ case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; break; This change will be important for ARB_gl_spirv which is meant to support OriginLowerLeft. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Fixes: `1e5b09f42f` "spirv: Tidy some repeated if checks..."	2018-05-03 10:08:42 +02:00
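The case ordering restored by e17d0ccbbd can be checked with a small standalone model. This is a sketch under assumptions: the enum, the struct and `apply_builtin` are hypothetical stand-ins for the SPIR-V builtin IDs and `nir_var->data`, not the actual vtn code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SPIR-V builtin IDs and nir_var->data. */
enum builtin { BUILTIN_FRAG_COORD, BUILTIN_SAMPLE_POSITION, BUILTIN_OTHER };

struct var_data {
    bool origin_upper_left;
    bool pixel_center_integer;
};

/* Same case order as the restored code: FragCoord sets both fields,
 * SamplePosition sets only origin_upper_left. */
static void apply_builtin(enum builtin b, bool origin_upper_left,
                          bool pixel_center_integer, struct var_data *d)
{
    switch (b) {
    case BUILTIN_FRAG_COORD:
        d->pixel_center_integer = pixel_center_integer;
        /* fallthrough */
    case BUILTIN_SAMPLE_POSITION:
        d->origin_upper_left = origin_upper_left;
        break;
    default:
        break;
    }
}

/* Usage: FragCoord picks up both settings, SamplePosition only the origin. */
static void example_usage(void)
{
    struct var_data frag = { false, false };
    struct var_data samp = { false, false };

    apply_builtin(BUILTIN_FRAG_COORD, true, true, &frag);
    apply_builtin(BUILTIN_SAMPLE_POSITION, true, true, &samp);

    assert(frag.origin_upper_left && frag.pixel_center_integer);
    assert(samp.origin_upper_left && !samp.pixel_center_integer);
}
```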
Samuel Iglesias Gonsálvez	b291a3a4a3	spirv: convert some operands for bitwise shift and bitwise ops to uint32 SPIR-V allows the shift, offset and count operands of shift and bitfield opcodes to have a bit-size other than 32 bits, but the corresponding NIR opcodes require 32-bit operands. As agreed on the mailing list, this patch adds a conversion to 32 bits to fix this. For more info, see: https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html v2: - src_bit_size will have zero value for variable bit-size operands (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 07:07:24 +02:00
Timothy Arceri	58c05ede96	mesa: enable geom shaders in OpenGL 3.2 Compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-03 12:08:21 +10:00
Bas Nieuwenhuizen	ffa15861ef	radv: UseEnumerateInstanceVersion for the default version. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen	467c562a29	radv: Don't check the incoming apiVersion on CreateInstance. This fixes dEQP-VK.api.device_init.create_instance_invalid_api_version CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen	9267ff9883	radv: Allow vkEnumerateInstanceVersion ProcAddr without instance. Apparently, somewhere between 1.1.70 and 1.1.73 the loader started depending on this. The loader then creates a 1.0 instance, which gets into a funny situation because we have a 1.1 device. No idea how to do line wrapping in Mako though, my random guesses did not work. CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 21:57:08 +02:00
Lionel Landwerlin	336decd67e	intel: aubinator: add an option to limit the number of decoded VBO lines Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 19:46:47 +01:00
Lionel Landwerlin	000452aebc	intel: decoder: limit to the number decoded lines from VBO By default we set no limit, but the debug batch decoder in i965 sets it to 100. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 19:46:47 +01:00
Jason Ekstrand	bd35345e85	anv: Advertise variableMultisampleRate Initially, I didn't understand this feature. Turns out that all it means is that you can switch multisample rates in the middle of a zero-attachment subpass. We've been able to do this since forever. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-02 10:59:03 -07:00
Rob Clark	28e410f6a5	nir: add missing dependency in meson.build nir_builder_opcodes.h also depends on nir_intrinsics.py for generating the system-value builders. Reported-by: Christoph Haag <haagch@frickel.club> Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 13:57:51 -04:00
Matthew Nicholls	97d57ef917	radv: fix multisample image copies Previously before `fb077b0728`, the LOD parameter was being used in place of the sample index, which would only copy the first sample to all samples in the destination image. After that multisample image copies wouldn't copy anything from my observations. This fixes some copy_and_blit CTS tests. v3.1: - set lod to 0 for nir_txf_ms (Samuel) v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel) - updated commit description (Samuel) Fix this properly by copying each sample in a separate radv_CmdDraw and using a pipeline with the correct rasterizationSamples for the destination image. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 19:32:00 +02:00
Kenneth Graunke	169d8e011a	intel: Fix 3DSTATE_CONSTANT buffer decoding. First, this was iterating over the 3DSTATE_CONSTANT_* instruction but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure. Secondly, the fields have been called Buffer[0] and Read Length[0], for a while now, and we were not handling the subscripts correctly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 10:09:28 -07:00
Lionel Landwerlin	cf1d587879	intel: fix aubinator include Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7c22c150c4` ("intel: Move batch decoder/disassembler from tools/ to common/") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:54:29 +01:00
Kenneth Graunke	0ab423388c	i965: Reuse batch decoder infrastructure rather than open coding it. With the new callback, Jason's newer batch decoder infrastructure should be able to do just as well as the old open coded INTEL_DEBUG=bat handling, with much less code. If there are any limitations, we'd like to improve the common code rather than doing one-off hacks here. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	bf91b81a0b	intel: Give the batch decoder a callback to ask about state size. Given an arbitrary batch, we don't always know what the size of certain things are, such as how many entries are in a binding table. But it's easy for the driver to track that information, so with a simple callback we can calculate this correctly for INTEL_DEBUG=bat. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	7c22c150c4	intel: Move batch decoder/disassembler from tools/ to common/ Making these part of libintel_common allows us to use them in the DRI driver. The standalone tool binaries already link against the common library, too, so it's no harder for them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	5c04971831	i965: Allocate shadow batches to explicitly be the BO size. This unfortunately makes it malloc/realloc on every new batch, rather than once at startup. But it ensures that the shadow buffer's size will absolutely match the BO size. Otherwise, as we tune BATCH_SZ/STATE_SZ or bufmgr cache bucket sizes, we may get a BO size that's rounded up, and fail to allocate the shadow buffer large enough. This doesn't fix any bugs today, as BATCH_SZ/STATE_SZ are the size of a cache bucket, but it's better to be safe than sorry. Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:26:55 -07:00
Lionel Landwerlin	ec5df73803	intel: batch-decoder: iterate VERTEX_BUFFER_STATE fields The gen_field_iterator only iterates the fields of a given gen_group. If we want to iterate the fields of another gen_group contained as field, we need to do it manually. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:11:28 +01:00
Lionel Landwerlin	acbce2ac57	intel: decoder: fix starting dword of struct fields Struct fields might span several dwords, but iter_dword is incremented up to the last dword of the current field before we print out the struct's fields. We can't use iter_dword for computing the offset into the pointer of data to decode. v2: Fix displayed offset number (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:11:28 +01:00
Lionel Landwerlin	467430ddcc	intel: decoder: document when fields should be used Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
Lionel Landwerlin	4f128f7850	intel: decoder: identify groups with fixed length <register> & <struct> elements always have fixed length. The get_length() method implies that we're dealing with an instruction in which the length is encoded into the variable data but the field iterator uses it without checking what kind of gen_group it is dealing with. Let's make get_length() report the correct length regardless of the gen_group (register, struct or instruction). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
Lionel Landwerlin	3c416a50d8	intel: decoder: make the field iterator use more natural while (iter_next()) { ... } instead of do { ... } while (iter_next()); Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
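The loop shape adopted in 3c416a50d8 can be illustrated with a generic iterator. This is a sketch, assuming an iterator whose `iter_next()` both advances and reports whether an element is available. `struct field_iter` and its members are hypothetical and do not mirror `gen_field_iterator`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical iterator over an array of field values. */
struct field_iter {
    const int *fields;
    size_t count;
    size_t index;
    int current;
};

/* Advance to the next field; return false once the fields are exhausted. */
static bool iter_next(struct field_iter *it)
{
    if (it->index >= it->count)
        return false;
    it->current = it->fields[it->index++];
    return true;
}

/* while (iter_next()) { ... } never runs the body on an empty group,
 * which a do { ... } while (iter_next()) loop would do once. */
static int sum_fields(const int *fields, size_t count)
{
    struct field_iter it = { fields, count, 0, 0 };
    int sum = 0;

    while (iter_next(&it))
        sum += it.current;
    return sum;
}
```

With this shape the iterator does not need to be pre-positioned on a valid first element before the loop starts.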
Vlad Golovkin	967aabca06	nv50: Extract needed value bits without shifting them before calling bitcount This can save one instruction since bitcount doesn't care about specific bits' positions. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-05-02 15:12:48 +02:00
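The observation behind 967aabca06 is that a population count depends only on how many bits are set, not on where they are. A minimal sketch follows. The 8-bit field at bit 8 and all names are assumptions chosen for illustration, not the nv50 register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout: an 8-bit field starting at bit 8. */
#define FIELD_SHIFT 8
#define FIELD_MASK  0xffu

/* Portable population count (clears the lowest set bit each iteration). */
static unsigned bitcount32(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Shift the field down, mask it, then count: two ALU operations. */
static unsigned count_with_shift(uint32_t v)
{
    return bitcount32((v >> FIELD_SHIFT) & FIELD_MASK);
}

/* Mask the field in place and count: the shift is not needed. */
static unsigned count_without_shift(uint32_t v)
{
    return bitcount32(v & (FIELD_MASK << FIELD_SHIFT));
}

/* Usage: both forms agree; 0xab has five bits set. */
static void example_usage(void)
{
    assert(count_with_shift(0x1234abcdu) == 5);
    assert(count_without_shift(0x1234abcdu) == 5);
}
```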
Antia Puentes	3a1df14a7b	intel: activate the gl_BaseVertex lowering Surplus code related to the basevertex is removed. The Vertex Elements contain now: * VE 1: <firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is_indexed_draw, 0, 0> Also fixes unreachable message. Fixes OpenGL CTS tests: * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters Fixes Piglit tests: * arb_shader_draw_parameters-drawid-indirect baseinstance * arb_shader_draw_parameters-basevertex Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678	2018-05-02 11:24:46 +02:00
Antia Puentes	0fb204fac1	compiler/nir: Add conditional lowering for gl_BaseVertex Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:24:31 +02:00
Antia Puentes	0cbf29fa55	intel: emit is_indexed_draw in the same VE than gl_DrawID The Vertex Elements are now: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is-indexed-draw, 0, 0> VE1 is kept as it was before; VE2 additionally contains the new system value. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:23:34 +02:00
Antia Puentes	6ba9088d9c	intel/compiler: Add uses_is_indexed_draw flag Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:20:48 +02:00
Antia Puentes	9e6b886cf2	compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics This VS system value contains if the draw command used to start the rendering was an indexed draw command or a non-indexed one (~0/0 respectively). Useful to calculate the gl_BaseVertex as: (SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:20:40 +02:00
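The masking trick described in 9e6b886cf2 works because the system value is all ones for indexed draws and zero otherwise. A minimal sketch in plain C, assuming 32-bit values; `base_vertex` is a hypothetical function name and not part of NIR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the lowering:
 * gl_BaseVertex = is_indexed_draw & first_vertex, where is_indexed_draw
 * is ~0 for indexed draws and 0 for non-indexed draws. */
static int32_t base_vertex(uint32_t is_indexed_draw, int32_t first_vertex)
{
    return (int32_t)(is_indexed_draw & (uint32_t)first_vertex);
}

/* Usage: indexed draws keep the value, non-indexed draws read zero. */
static void example_usage(void)
{
    assert(base_vertex(~0u, 7) == 7);
    assert(base_vertex(0u, 7) == 0);
}
```

The AND avoids a branch or select: the same shader code handles both kinds of draw.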
Samuel Pitoiset	0737c1e3a6	radv: enable out-of-order rasterization by default As the implementation is conservative, we can now enable it by default. It can be disabled with RADV_DEBUG=nooutoforder. Don't expect much more than 1% of improvements, but the gain seems consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 10:33:24 +02:00
Samuel Pitoiset	1d766b0196	radv: only disable out-of-order rast for perfect occlusion queries Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 10:33:22 +02:00
Kenneth Graunke	1122fb2d98	i965: Drop unused gen5 sampler default color struct. Trivial.	2018-05-01 23:09:25 -07:00
Kenneth Graunke	9f6082f6c7	i965: Make brw_vs_outputs_written static. Drop a prototype. Trivial.	2018-05-01 23:09:16 -07:00
Nanley Chery	3e56e4642f	i965/tex_image: Avoid the ASTC LDR workaround on gen9lp Both the internal documentation and the results of testing this in the CI suggest that this is unnecessary. Add the fixes tag because this reduces an internal benchmark's startup time by about 17 seconds (reported by Eero). Fixes: `710b1d2e66` "i965/tex_image: Flush certain subnormal ASTC channel values" Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-01 16:47:39 -07:00
Eric Anholt	800be7f277	freedreno: Fix ir3_cmdline.c build. Fixes: `6487e7a30c` ("nir: move GL specific passes to src/compiler/glsl") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-05-01 16:38:37 -07:00
Jason Ekstrand	d216ffc604	anv: Allow lookup of vkEnumerateInstanceVersion without an instance Fixes: `cbab2d1da5` Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-01 14:45:51 -07:00
Jason Ekstrand	d5a0787f03	anv: Don't advertise Float64 or Int64 on HW without 64-bit types Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-01 14:45:50 -07:00
Samuel Pitoiset	d8db5986ce	radv: compute the number of subpass attachments correctly Only count color attachments twice if resolves are used, also account for the depth stencil attachment if present. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-01 22:18:03 +02:00
Dave Airlie	e66f64c285	radv: set fmask_surf_index on fmask surfaces. This is needed for gfx9 and later for all fmask surface index. (Mentioned by Marek on irc) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 06:01:42 +10:00
Brian Paul	f298ed93d9	gallium/i915: fix PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE typo Fixes: `fffe5e2d14` ("gallium: add initial support for conservative rasterization") Trivial.	2018-05-01 09:52:22 -06:00
Rhys Perry	07dac3e040	nvc0: add conservative rasterization support Subpixel precision bias, dilation and the post-snap mode are supported on GM200 and newer. The pre-snap mode is supported for triangle primitives on GP100. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-30 21:13:53 -06:00
Rhys Perry	97f5f399ef	st/mesa: add support for nvidia conservative rasterization extensions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-30 21:13:53 -06:00
Rhys Perry	fffe5e2d14	gallium: add initial support for conservative rasterization Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-30 21:13:53 -06:00
Rhys Perry	4580617509	mesa: add support for nvidia conservative rasterization extensions Although the specs are written against compatibility GL 4.3 and allow core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-30 21:13:53 -06:00
Brian Paul	31ab0427a7	glsl/tests: add GLSL_TYPE_UINT8, GLSL_TYPE_INT8 cases to switch statements To silence warnings about unhandled switch values. Untested otherwise. v2: move the INT/UINT8 cases after the INT/UINT16 cases, per Eric. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-30 21:13:53 -06:00
Brian Paul	efec712d51	tgsi: use enums instead of unsigned in ureg code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-30 21:13:53 -06:00
Timothy Arceri	6487e7a30c	nir: move GL specific passes to src/compiler/glsl With this we should have no passes in src/compiler/nir with any dependencies on headers from core GL Mesa. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-05-01 12:39:33 +10:00
Andres Rodriguez	f56e22e496	radv/winsys: fix leaking resources from bo's imported by fd A bo's ref_count was not being initialized when imported from an fd. Therefore, we would fail to free the resource during VkFreeMemory(). This patch fixes applications like hifi VR in threaded mode, which perform frequent imports/releases of IPC shared memory. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-30 18:20:30 -04:00
Scott D Phillips	2a08ae3c7c	i965/tiled_memcpy: ytiled_to_linear a cache line at a time Similar to the transformation applied to linear_to_ytiled, also align each readback from the ytiled source to a cacheline (i.e. transfer a whole cacheline from the source before moving on to the next column). This will allow us to utilize movntdqa (_mm_stream_si128) in a subsequent patch to obtain near WB readback performance when accessing the uncached ytiled memory, an order of magnitude improvement. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-30 15:18:36 -07:00
Chris Wilson	682bdaa658	i965: Record mipmap resolver for unmapping When mapping a region of the mipmap_tree, record which complementary method to use to unmap it afterwards. By doing so we can avoid duplicating the decision tree used when mapping and thereby eliminate trivial errors that can be introduced if the two if-chains become out of sync. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	5367295e1a	i965: Move unmap_depthstencil before map_depthstencil Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	ab2825c898	i965: Move unmap_etc before map_etc Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	9e7e88049f	i965: Move unmap_s8 before map_s8 Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	b3ad6f5ca6	i965: Move unmap_movntdqa before map_movntdqa Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	f348d07a62	i965: Move unmap_blit before map_blit Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	359624142d	i965: Move unmap_gtt before map_gtt Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Dave Airlie	8d3529872c	ac/nir: expand 64-bit vec3 loads to fix shuffling. If loading 64-bit vec3 values, a 4 component load would be followed by a 2 component load and the resulting shuffle would fail as it requires 2 4 components. This just expands the second results vector out to 4 components. This fixes 100 CTS tests: dEQP-VK.spirv_assembly.type.vec3.64 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-01 05:58:14 +10:00
Kenneth Graunke	bde12f75e1	i965: Don't stomp initial kflags for program cache. We want to flag EXEC_OBJECT_CAPTURE, but we ought to preserve any existing kflags. Today, there are none (as the program cache doesn't support 48-bit addressing), but once we start using softpin, we'll need to preserve EXEC_OBJECT_PINNED. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-30 11:34:19 -07:00
Kenneth Graunke	0cc98522f9	i965: Let batchbuffers be placed anywhere in the 48-bit address space. We were trying to mark batch buffers with EXEC_OBJECT_CAPTURE, and accidentally stomped EXEC_OBJECT_SUPPORTS_48B_ADDRESS in the process. There's no reason to restrict batch buffers to the lower 4GB. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-30 11:34:19 -07:00
Scott D Phillips	8ffc6ee251	intel: fix check for 48b ppgtt support The previous logic of supports_48b_addresses wasn't actually checking if i915.ko was running with full_48bit_ppgtt. The ENOENT it was checking for was actually coming from the invalid context id provided in the test execbuffer. There is no path in the kernel driver where the presence of EXEC_OBJECT_SUPPORTS_48B_ADDRESS leads to an error. Instead, check the default context's GTT_SIZE param for a value greater than 4 GiB. v2 (Ken): Fix in i965 as well. v3: Check GTT_SIZE instead of HAS_ALIASING_PPGTT (Chris Wilson) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 11:34:19 -07:00
Leo Liu	1c5f4f4e17	st/omx/enc: fix blit setup for YUV LoadImage The blit here involves scaling since it's copying from I8 format to R8G8 format. Half of the source will be filtered out with PIPE_TEX_FILTER_NEAREST, and it looks like the GPU always uses the second half as the source. Currently we use "1" as the start point of x for R, which shifts the U component one source pixel to the right. So "-1" should be the start point for the U component. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-30 11:55:36 -04:00
Juan A. Suarez Romero	4d449c94e4	autotools, meson: bump up required VA version Due to the new VP9 config we use, VA API 0.39 is required. Fixes: `413c5ca372` ("travis: update libva required version") CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-30 13:59:37 +02:00
Juan A. Suarez Romero	96ed3714fc	docs: update calendar, add news and link release notes to 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-28 17:01:48 +00:00
Juan A. Suarez Romero	8f1159bf9a	docs: add sha256 checksums for 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `b3eed3ad03`)	2018-04-28 16:58:39 +00:00
Juan A. Suarez Romero	14f85260de	docs: add release notes for 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `d38da7bd2d`)	2018-04-28 16:58:36 +00:00
Marek Olšák	8b7358fe43	radeonsi: increase the number of compiler threads depending on the CPU The compiler queue was limited to 3 threads, so shader-db running on a 16-thread CPU would have a bottleneck on the 3-thread queue. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	3f0eaaf6d9	radeonsi: avoid a crash in gallivm_dispose_target_library_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	e75fc8d033	radeonsi: move data_layout into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	797d673c9a	radeonsi: move passmgr into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	c1823ff661	radeonsi: move target_library_info into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	5a94f15aa7	radeonsi: use si_compiler::triple in si_llvm_optimize_module Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	43f0a10051	radeonsi: add triple into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	87eb597758	radeonsi: add struct si_compiler containing LLVMTargetMachineRef It will contain more variables. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	788d66553a	radeonsi: rename r600_texture::resource to buffer r600_resource could be renamed to si_buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6fadfc01c6	radeonsi: use r600_resource() typecast helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	3160ee876a	radeonsi: remove unused atom parameter from si_atom::emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	de344209ad	radeonsi: inline 2 trivial state structures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	e395475096	radeonsi: remove function si_init_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	ccebcba893	radeonsi: remove si_atom::id Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	639b673fc3	radeonsi: don't use an indirect table for state atoms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9054799b39	radeonsi: rename r600_atom -> si_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	a8abbbb172	radeonsi: remove r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6d19120da8	radeonsi/gfx9: workaround for INTERP with indirect indexing and clean up the conditions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-27 17:56:04 -04:00
Marek Olšák	2d69b485f5	radeonsi: rewrite DCC format compatibility checking code It might be better to use a slow compressed clear when clearing to 1. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	c732d069b3	radeonsi: implement DCC fast clear swizzle constraints more accurately Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear value of 1. This significantly changes the DCC fast clear code, and fixes fast clear for RGB formats without alpha. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9ef423f720	radeonsi: rename variables and document stuff around DCC fast clear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	1cc2e0cc6b	radeonsi: fully enable 2x DCC MSAA for array and non-array textures The clear code is exactly the same as for 1 sample buffers - just clear the whole thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	ca33d961a4	radeonsi: enable fast color clear for level 0 of mipmapped textures on <= VI GFX9 is more complicated and needs a compute shader that we should just copy from amdvlk. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	174e11c3f5	ac/surface: handle DCC subresource fast clear restriction on VI v2: require the previous level to be clearable for determining whether the last unaligned level is clearable Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
George Kyriazis	838f15650e	swr/rast: No need to export GetSimdValidIndicesGfx Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7caeee3432	swr/rast: Small editorial changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	f276517ebf	swr/rast: Use new processor detection mechanism Use specific avx512 selection mechanism based on avx512er bit instead of getHostCPUName(). LLVM 6.0.0 has a bug that reports wrong string for KNL (fixed in 6.0.1). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	8ace547e8d	swr/rast: Output rasterizer dir to console since it's process specific Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	c328c5d0f4	swr/rast: Add TranslateGfxAddress for shader Also add GFX_MEM_CLIENT_SHADER Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	edc41f73b8	swr/rast: jit PRINT improvements. Sign-extend integer types to 32bit when specifying "%d" and add new %u which zero-extends to 32bit. Improves printing of sub 32bit integer types (i1 specifically). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	5d403178e6	swr/rast: Fix regressions. Bump jit cache revision number to force recompile. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	577af2bed4	swr/rast: Cleanup old cruft. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	aeab9db50a	swr/rast: Package events.proto with core output However only if the file exists in DEBUG_OUTPUT_DIR. The expectation is that AR rasterizerLauncher will start placing it there when launching a workload (which is in a subsequent checkin) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	b97bb0ea6d	swr/rast: Fix init in EventHandlerWorkerStats Make sure we initialize variables. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	9a72d4c03e	swr/rast: Fix return type of VCVTPS2PH. expecting <8xi16> return. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	3f008c5505	swr/rast: WIP Translation handling Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7986519d50	swr/rast: Use different handling for stream masks Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	6b1c852ebc	swr/rast: Silence warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	e6daa62a48	swr/rast: Add support for TexelMask evaluation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	cec1b52cac	swr/rast: Internal core change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7b343a215e	swr/rast: Fix x86 lowering 64-bit float handling - 64-bit cvt-to-float needs to be explicitly handled - gathers need the right parameter types to work with doubles Fixes draw-vertices piglit tests Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	fa4ab7910e	swr/rast: Add some SIMD_T utility functors VecEqual and VecHash Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	18c9cb85d1	swr/rast: Fix wrong type allocation ALLOCA pointer elements, not pointers. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	1cdbce8805	swr: touch generated files to update timestamp previous change in generators necessitates this change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	9ceeb671a3	swr/rast: Fix byte offset for non-indexed draws for the case when USE_SIMD16_SHADERS == FALSE Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
Marek Olšák	7083ac7290	util/u_queue: fix a deadlock in util_queue_finish Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 13:28:17 -04:00
Dylan Baker	7772de5283	meson: fix race condition revealed by using 0.44 Previously there was a special target that blocked for the generation of anv_entrypoints.h, with meson 0.44 we don't need this, we can use a new language feature instead. The problem is that previously that blocking target would hide a race condition for the generation of another header, anv_extensions.h. Now the build sometimes fails when anv_extensions.h is not generated in time. v2: - clarify the race condition in the commit message (Emil) CC: Mark Janes <mark.a.janes@intel.com> Fixes: `92550d9b16` ("meson: remove workaround for custom target creating .h and .c files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-27 10:24:51 -07:00
Dylan Baker	0c23bd76d1	bin: force git show to use default pretty setting I have my pretty default set to short, which breaks this script. v2: - Fix both places that don't define a --pretty (Emil) cc: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-27 10:19:55 -07:00
Tapani Pälli	b3ad4b6971	mesa: add TBO support for GL_EXT_texture_norm16 Earlier plumbing missed interaction with texture buffer objects. Fixes: `7f467d4f73` "mesa: GL_EXT_texture_norm16 extension plumbing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-27 14:34:43 +03:00
Samuel Pitoiset	d38425ce87	ac: fix texture query LOD for 1D textures on GFX9 1D textures are allocated as 2D which means we only need one coordinate for texture query LOD. Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 11:15:35 +02:00
Christian Gmeiner	3e69127939	etnaviv: remove not needed includes Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-04-27 09:04:56 +02:00
Christian Gmeiner	2ba587aac7	etnaviv: remove redundant include Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-04-27 09:04:53 +02:00
Timothy Arceri	79b0556f29	glsl: replace some asserts with unreachable when processing the ast Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-27 10:18:47 +10:00
Timothy Arceri	410f901bee	mesa: drop the buffer mode param from the DrawBuffer driver function No drivers used it. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-27 10:09:10 +10:00
Anuj Phogat	b695a7bd8e	anv/icl: Enable Vulkan on Ice Lake This patch enables the Vulkan driver on Ice Lake h/w with added warning about preliminary support. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-26 16:31:27 -07:00
Caio Marcelo de Oliveira Filho	c9bdc7f7e2	anv: enable VK_EXT_shader_viewport_index_layer Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 15:32:05 -07:00
Jason Ekstrand	3db93f9128	anv/allocator: Don't shrink either end of the block pool Previously, we only tried to ensure that we didn't shrink either end below what was already handed out. However, due to the way we handle relocations with block pools, we can't shrink the back end at all. It's probably best to not shrink in either direction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147 Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-04-26 13:17:14 -07:00
Eric Anholt	76ee9edcb4	broadcom/vc5: Add support for centroid varyings. It would be nice to share the flags packet emit logic with flat shade flags, but I couldn't come up with a good way while still using our pack macros. We need to refactor this to shader record setup at compile time, anyway. Fixes ext_framebuffer_multisample-interpolation * centroid-*	2018-04-26 11:30:22 -07:00
Eric Anholt	e2f3317801	broadcom/vc5: Add an assert about GFXH-1559. Our TF outputs always start at 6 or 7 currently, so we don't hit the broken 8 case. Let's make sure that doesn't change somehow.	2018-04-26 11:30:22 -07:00
Eric Anholt	77b4f30bae	broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements. We don't use ldunifa yet, but we will eventually for UBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	089c32eefd	broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements. We don't use TMUWT yet, but we will once we do SSBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	57ceb95c84	broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x). This should help with intermittent GPU hangs in tests switching formats while rendering small frames. Unfortunately, it didn't help with the tests I'm having trouble with.	2018-04-26 11:30:22 -07:00
Eric Anholt	dc4cb04ee5	broadcom/vc5: Add QPU validation for register writes after thrend. The next shader gets to start writing the register file during these slots, so make sure we don't stomp over them. The only case of hitting this that I could imagine would be dead writes.	2018-04-26 11:30:22 -07:00
Eric Anholt	8adf813f83	st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type. GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an unsized internalformat, and it should be non-color-renderable. fbobject.c's implementation of the check for color-renderable checks that the texture has a 2101010 mesa format, so make sure that we have chosen a 2101010 format so that the check can do what it was meant to. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-26 11:30:22 -07:00
Charmaine Lee	8aef7fccb7	st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage With this patch, _ElementSize is initialized along with the rest of the vertex array attributes in new_draw_rasterpos_stage(). This fixes a crash in st_pipe_vertex_format() when running topogun-1.06-orc-84k-resize trace file with VMware svga driver. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-26 10:29:02 -07:00
Drew Davenport	e923e8151d	st/va: Fix typos s/attibute/attribute/ s/suface/surface/ v2: rebased(Leo) Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Drew Davenport	893808006a	st/va: Fix potential buffer overread VASurfaceAttribExternalBuffers.pitches is indexed by plane. Current implementation only supports single plane layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Boyuan Zhang	deba56accf	radeon/vcn: fix mpeg4 msg buffer settings The previous bit-field assignments are incorrect and cause certain mpeg4 decodes to fail due to wrong flag values. This patch fixes these assignments. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Ian Romanick	bf5e0276b6	radeon: Drop broken front_buffer_reading/drawing optimization Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-26 09:38:51 -04:00
Ian Romanick	0b3231966f	radeon: Use _mesa_is_front_buffer_drawing Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-26 09:38:51 -04:00
Samuel Pitoiset	d7ffe3b384	radv: set ac_surf_info::num_channels correctly num_channels was introduced in "ac/surface: don't set the display flag for obviously unsupported cases". Based on RadeonSI. Fixes: `e29facff31` ("ac/surface: don't set the display flag for obviously unsupported cases (v2)") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 15:34:14 +02:00
Samuel Pitoiset	a6fbefa67b	radv: fix DCC enablement since partial MSAA implementation dcc_msaa_allowed is always false on GFX9+ and only true on VI if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled in some situations where it should not have been. This is likely going to fix a performance regression. Fixes: `2f63b3dd09` ("radv: enable DCC for MSAA 2x textures on VI under an option") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 15:34:11 +02:00
Karol Herbst	227b1af866	nir/opt_constant_folding: fix folding of 8 and 16 bit ints Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Karol Herbst	14943add44	nir: print 8 and 16 bit constants correctly Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Karol Herbst	543a8c66a7	nir: support converting to 8-bit integers in nir_type_conversion_op Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Neil Roberts	c4ab1bdcc9	spirv: Don’t check for NaN for most OpFOrd* comparisons For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware should probably already return false if one of the operands is NaN so we don’t need to have an explicit check for it. This seems to at least work on Intel hardware. This should reduce the number of instructions generated for the most common comparisons. For what it’s worth, the original code to handle this was added in `e062eb6415`. The commit message for that says that it was to fix some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t handle NaNs this patch shouldn’t affect those tests. At any rate they have since been moved out of the mustpass list. Incidentally those tests fail on the nvidia proprietary driver so it doesn’t seem like handling NaNs correctly is a priority. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 10:08:14 +02:00
Matt Atwood	3ba5a646e5	Intel: Add a Kaby Lake PCI ID v2: Branding changed Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-04-25 13:31:55 -07:00
Eric Anholt	069c409f43	gallium/util: Fix incorrect refcounting of separate stencil. The driver may have a reference on the separate stencil buffer for some reason (like an unflushed job using it), so we can't directly free the resource and should instead just decrement the refcount that we own. Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8 on vc5. Fixes: `e94eb5e600` ("gallium/util: add u_transfer_helper") Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-04-25 12:14:33 -07:00
Eric Anholt	0d4ce00d70	broadcom/vc5: Fix reloads of separate stencil buffers. Like for stores, we need to emit a separate load_general packet.	2018-04-25 09:21:54 -07:00
Eric Anholt	9f3f4284c0	broadcom/vc5: Fix cpp of MSAA surfaces on 4.x. The internal-type-bpp path is for surfaces that get stored in the raw TLB format. For 4.x, we're storing MSAA as just 2x width/height at the original format.	2018-04-25 09:21:54 -07:00
Eric Anholt	ac207acb97	broadcom/vc5: Implement stencil blits using RGBA. Fixes piglit fbo-depthstencil blit default_fb	2018-04-25 09:21:54 -07:00
Eric Anholt	503716fa86	broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.	2018-04-25 09:21:54 -07:00
Eric Anholt	5710532e9e	broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x. For single-sample we have to always program SAMPLE_0, but for multisample we want to store all the samples.	2018-04-25 09:21:54 -07:00
Juan A. Suarez Romero	413c5ca372	travis: update libva required version Commit `fa328456e8` added VP9 config support, but this needs a newer libva version, 1.7.0 or above. Fixes: `fa328456e8` ("st/va: add VP9 config to enable profile2") CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-25 16:09:20 +02:00
Tapani Pälli	7f467d4f73	mesa: GL_EXT_texture_norm16 extension plumbing Patch enables use of short and unsigned short data for texture uploads, rendering and reading of framebuffers within the restrictions specified in GL_EXT_texture_norm16 spec. Patch also enables those 16bit format layout qualifiers listed in GL_NV_image_formats that depend on EXT_texture_norm16. v2: expose extension with dummy_true fix layout qualifier map changes (Ilia Mirkin) v3: use _mesa_has_EXT_texture_norm16, other fixes and cleanup (Ilia Mirkin) v4: fix rest of the issues found Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-25 14:26:20 +03:00
Jordan Justen	b0c5774027	meson: Fix with_intel_vk and with_amd_vk variables Fixes: `5608d0a2ce` "meson: use array type options" Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-24 23:12:42 -07:00
Roland Scheidegger	77554d220d	draw: fix different sign logic when clipping The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when the sign is the same but both numbers are sufficiently small (if the product is smaller than 2^-128). This could apparently lead to emitting a sufficient amount of additional bogus vertices to overflow the allocated array for them, hitting an assertion (still safe with release builds since we just aborted clipping after the assertion in this case - I'm however unsure if this is now really no longer possible, so that code stays). Not sure if the additional vertices could cause other grief, I didn't see anything wrong even when hitting the assertion. Essentially, both +-0 are treated as positive (the vertex is considered to be inside the clip volume for this plane), so integrate the logic determining different sign into the branch there. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-25 04:50:20 +02:00
Roland Scheidegger	98578df27b	draw: simplify clip null tri logic Simplifies the logic when to emit null tris (albeit the reasons why we have to do this remain unclear). This is strictly just logic simplification, the behavior doesn't change at all. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-25 04:50:20 +02:00
Ilia Mirkin	c17ddcb4b4	nvc0/ir: all short immediates are sign-extended, adjust LIMM test Some analysis suggests that all short immediates are sign-extended. The insnCanLoad logic already accounted for this, but we could still pick the wrong form when emitting actual instructions that support both short and long immediates (with the long form usually having additional restrictions that insnCanLoad should be aware of). This also reverses a bunch of commits that had previously "worked around" this issue in various emitters: `9c63224540`: gm107/ir: make use of ADD32I for all immediates `83a4f28dc2`: gm107/ir: make use of LOP32I for all immediates `b84c97587b`: gm107/ir: make use of IMUL32I for all immediates `d30768025a`: gk110/ir: make use of IMUL32I for all immediates as well as the original import for UMUL in the nvc0 emitter. Reported-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-04-24 21:37:44 -04:00
Boyan Ding	6695f9d5c5	mesa: call DrawBufferAllocate driver hook in update_framebuffer for window-system FB When draw buffers are changed on a bound framebuffer, the DrawBufferAllocate() hook should be called. However, it is missing in update_framebuffer with a window-system framebuffer, in which the FB's draw buffer state should match context state, potentially resulting in a change. Note: This is needed because gallium delays creating the front buffer; i965 works fine without this change. V2 (Timothy Arceri): - Rebased on merged/simplified DrawBuffer driver function - Move DrawBuffer call outside fb->ColorDrawBuffer[0] != ctx->Color.DrawBuffer[0] check to make piglit pass. v3 (Timothy Arceri): - Call new DrawBufferAllocate() driver function. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v2) Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116	2018-04-25 09:08:26 +10:00
Timothy Arceri	6ca09f3a60	st/mesa: add new driver function DrawBufferAllocate Unlike some of the classic drivers, the st was only using DrawBuffer() to allocate some buffers on-demand. Creating a separate function will allow us to call it from update_framebuffer() in the following patch without regressing some of the older classic drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-25 09:08:26 +10:00
Timothy Arceri	2554b8cb00	mesa: some C99 tidy ups for framebuffer.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-04-25 09:08:26 +10:00
Dylan Baker	1d01b52d76	meson: Fix no-rtti in llvm detection Because I clearly wasn't thinking and clearly didn't do a good job testing. Sigh Fixes: `c5a97d658e` ("meson: fix builds against LLVM built without rtti") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-24 15:26:51 -07:00
Dylan Baker	be0a2cfc65	meson: use new warning function Instead of emulating it with message. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	5608d0a2ce	meson: use array type options This option type is nice since it involves less converting strings into lists, and because it validates the values that are provided. v2: - Set with_any_vk to true if any vulkan driver is built (Eric) Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	c5a97d658e	meson: fix builds against LLVM built without rtti Building without rtti is fraught with peril, but it's something that autotools supports so we need to support it too. Since we've moved to version 0.44 as a whole, we can use the meson functionality for accessing random llvm-config options to check for rtti and add -fno-rtti to all C++ code accordingly. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-24 14:08:15 -07:00
Dylan Baker	595021bf1a	meson: remove dummy_cpp meson has gotten pretty smart about tracking C and C++ dependencies (internal and external), and using the right linker. This wasn't always the case and we created empty c++ files to force the use of the c++ linker. We don't need that any more. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	db90c8627c	meson: allow empty sources when using link_whole meson used to get grumpy if the sources list was empty, even when using --whole-archive (link_whole). In more recent versions that's not true, so remove the workaround. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	92550d9b16	meson: remove workaround for custom target creating .h and .c files In more modern versions of meson a custom_target returns an index-able object. This allows us to create accurate dependency models for targets that rely only on the header and not on the code from anv_entrypoints. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	5a670d08c0	meson: raise required version to 0.44.1 We have already required 0.44 for building clover and swr, so it was already partially required. This just makes it required across the board instead of just for clover and swr. There is a bug in 0.44 which makes it impossible to build mesa in some configurations, so require 0.44.1 which fixes this. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	1546f76a39	meson: fix graw-xlib after auxiliary consolidation This one's completely my fault, I didn't do good enough testing after rebasing and this got missed. Fixes: `d28c246501` ("meson: build graw tests") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	c73abb4f82	meson: only build mesa_st tests when build-tests is true Since we have an option to turn test building on and off, we should honor that. Fixes: `34cb4d0ebc` ("meson: build tests for gallium mesa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	aaab624245	meson: don't build classic mesa tests without dri_drivers Since mesa_classic is build-on-demand the tests will create a demand and add a bunch of extra compilation. Fixes: `43a6e84927` ("meson: build mesa test.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Nanley Chery	0e8b16e0a2	i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL The paths which sample with the clear color are now using a getter which performs the sRGB decode needed to enable this fast clear. This path can be exercised by fast-clearing a texture, then performing an operation which requires sRGB decoding. Test coverage for this feature is provided with the following tests: * Shader texture calls: - spec@ext_texture_srgb@tex-srgb * Shader texelfetch calls: - spec@arb_framebuffer_srgb@fbo-fast-clear - spec@arb_framebuffer_srgb@msaa-fast-clear * Blending: - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend * Blitting: - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	129ad66dd5	i965/miptree: Extend the sRGB-blending WA to future platforms The blending issue seems to be present on CNL as well. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	7ea013c6d3	i965: Add and use a getter for the clear color It returns both the inline clear color and a clear address which points to the indirect clear color buffer (or NULL if unused/non-existent). This getter allows CNL to sample from fast-cleared sRGB textures correctly by doing the needed sRGB-decode on the clear color (inline) and making the indirect clear color buffer unused. v2 (Rafael): * Have a more detailed commit message. * Add a comment on the sRGB conversion process. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Jason Ekstrand	b55077a8bc	util/srgb: Add a float sRGB -> linear helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	cd5ce363e3	i965/wm_surface_state: Use the clear address if clear_bo is non-NULL We want to add and use a getter that turns off the indirect path by returning zero for the clear color bo and offset. v2: Fix usage of "clear address" in commit message (Jason). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	af4e9295fe	i965: Add and use a single miptree aux_buf field We want to add and use a function that accesses the auxiliary buffer's clear_color_bo and doesn't care if it has an MCS or HiZ buffer specifically. v2 (Jason Ekstrand): * Drop intel_miptree_get_aux_buffer(). * Mention CCS in the aux_buf field. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	5503b65103	i965: Add and use a getter for the miptree aux buffer Make the next patch easier to read by eliminating most of the would-be duplicate field accesses now. v2: Update the HiZ comment instead of deleting it (Rafael). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-04-24 13:41:14 -07:00
Karol Herbst	e4f675dc42	gm107/ir/lib: fix sched in div u32 builtin Imad needs to set a read barrier. With significantly big work groups I was getting wrong results for div u32. Turns out the issue was with the sched opcodes. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 22:31:59 +02:00
Ian Romanick	0d5ce25c1c	intel/compiler: Add scheduler deps for instructions that implicitly read g0 Otherwise the scheduler can move the writes after the reads. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: Clayton A Craft <clayton.a.craft@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-04-24 14:31:21 -04:00
Ian Romanick	cd32a4e5f4	intel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler methods src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::count_reads_remaining(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:764:72: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::count_reads_remaining(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::setup_liveness(cfg_t)’: src/intel/compiler/brw_schedule_instructions.cpp:769:51: warning: unused parameter ‘cfg’ [-Wunused-parameter] vec4_instruction_scheduler::setup_liveness(cfg_t cfg) ^~~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::update_register_pressure(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:774:75: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::update_register_pressure(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:779:80: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::issue_time(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:1550:61: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction_scheduler::issue_time(backend_instruction inst) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Ian Romanick	bdb15c2344	intel/compiler: Silence unused parameter warning in compile_cs_to_nir src/intel/compiler/brw_fs.cpp: In function ‘nir_shader* compile_cs_to_nir(const brw_compiler, void, const brw_cs_prog_key, brw_cs_prog_data, const nir_shader, unsigned int)’: src/intel/compiler/brw_fs.cpp:7205:44: warning: unused parameter ‘prog_data’ [-Wunused-parameter] struct brw_cs_prog_data prog_data, ^~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Ian Romanick	d84b2ed1d7	intel/compiler: Silence unused parameter warnings in generate_foo methods Since all of the fs_generator::generate_foo methods take a fs_inst * as the first parameter, just remove the name to quiet the compiler. src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_barrier(fs_inst, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:743:41: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_barrier(fs_inst inst, struct brw_reg src) ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_discard_jump(fs_inst)’: src/intel/compiler/brw_fs_generator.cpp:1326:46: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_discard_jump(fs_inst inst) ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_pack_half_2x16_split(fs_inst, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:1675:54: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_pack_half_2x16_split(fs_inst inst, ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_shader_time_add(fs_inst, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:1743:49: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_shader_time_add(fs_inst inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_set_simd4x2_header_gen9(brw_codegen, brw::vec4_instruction, brw_reg)’: src/intel/compiler/brw_vec4_generator.cpp:1412:52: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_mov_indirect(brw_codegen, brw::vec4_instruction, brw_reg, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_vec4_generator.cpp:1430:41: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp:1432:63: warning: unused parameter ‘length’ [-Wunused-parameter] struct brw_reg indirect, struct brw_reg length) ^~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Eric Anholt	3d21fc193e	broadcom/vc5: Set up internal_format for imported resources. Without this, we'd assertion fail in u_transfer_helper when mapping an imported resource.	2018-04-24 10:37:29 -07:00
Eric Anholt	f08f477a93	broadcom/vc5: Assert that created BOs have offset != 0. The kernel shouldn't return a bo at NULL, and the HW special-cases NULL address values for things like OQs.	2018-04-24 10:37:29 -07:00
Eric Anholt	482f2e24b5	broadcom/vc5: Don't allocate simulator BOs at offset 0. The kernel won't return us BOs at offset 0 (because things like OQs wouldn't work there), so we shouldn't in the simulator either.	2018-04-24 10:37:29 -07:00
Eric Anholt	82cdb801fd	broadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl. Otherwise we'd crash immediately upon importing a BO through EGL interfaces.	2018-04-24 10:37:29 -07:00
Eric Anholt	3cdd055ed2	broadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear. We don't have any kernel metadata about BO tiling, so this probably is all we should do for the moment.	2018-04-24 10:37:29 -07:00
Tapani Pälli	c2e159d050	i965: expose MESA_FORMAT_R8G8B8A8_SRGB visual Exposing the visual makes following dEQP tests pass on Android: dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-24 14:55:18 +03:00
Tapani Pälli	fa4d4d97f3	dri: Add __DRI_IMAGE_FORMAT_SABGR8 Add format definition and required plumbing to create images. Note that there is no match to drm_fourcc definition, just like with existing __DRI_IMAGE_FOURCC_SARGB8888. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-24 14:55:18 +03:00
Marek Olšák	4559aefb5c	Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable" This reverts commit `dab02dea34`. It causes crashes of qtcreator and firefox. Fixes: `dab02de` "st/dri: Fix dangling pointer to a destroyed dri_drawable" Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-24 00:00:20 -04:00
Roland Scheidegger	e8e1d287a3	gallivm: dump bitcode before optimization If we dump the bitcode for off-line debug purposes, we really want the pre-optimized bitcode, otherwise it's useless in identifying problems with IR optimization (if you have a shader which takes an hour to do IR optimization, it's also nice you don't have to wait that hour...). Also, print out the function passes for opt which correspond to what was used for jit compilation (and also the opt level for codegen). Using opt/llc this way should then pretty much mimic what was done for jit. (When specifying something like -time-passes -debug-pass=[Structure|Arguments] (for either opt or llc) that also gives very useful information in which passes all the time was spent, and which passes are really run along with the order - llvm will add passes due to dependencies on its own, and of course -O2 for llc comes with a ~100 pass list.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	e89cf59c27	gallivm: (trivial) do division by 1000 with int64 Conversion to int can otherwise overflow if compile times are over ~71min. (Yes this can happen...) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	45b8f620a5	gallivm: remove LICM pass LICM is simply too expensive, even though it presumably can help quite a bit in some cases. It was definitely cheaper in llvm 3.3, though as far as I can tell with llvm 3.3 it failed to do anything in most cases. early-cse also actually seems to cause licm to be able to move things when it previously couldn't, which causes noticeable compile time increases. There's more loop passes in llvm, but I'm not sure which ones are helpful, and I couldn't find anything which would roughly do what the old licm in llvm 3.3 did, so ditch it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	8b9ab674b9	gallivm: add early cse pass This pass is quite cheap, and can simplify the IR quite a bit for our generated IR. In particular on a variety of shaders I've found the time saved by other passes due to the simplified IR more than makes up for the cost of this pass, and on top of that the end result is actually better. The only downside I've found is this enables the LICM pass to move some things out of the main shader loop (in the case I've seen, instanced vertex fetch (which is constant within the jit shader) plus the derived instructions in the shader) which it couldn't do before for some reason. This would actually be desirable but can increase compile time considerably (licm seems to have considerable cost when it actually can move things out of loops, due to alias analysis). But blaming early cse for this seems inappropriate. (Note that the first two sroa / earlycse passes are similar to what a standard llvm opt -O1/-O2 pipeline would do, albeit this has some more passes even before but I don't think they'd do much for us.) It also in particular helps some crazy shader used for driver verification (don't ask...) a lot (about factor of 6 faster in compile time) (due to simplifying the ir before LICM is run). While here, also move licm behind simplifycfg. For some shaders there seems to be very significant compile time gains (we've seen a factor of 10000 albeit that was a really crazy shader you'd certainly never see in a real app), because LICM is quite expensive and there's cases where running simplifycfg (along with sroa and early-cse) before licm reduces IR complexity significantly. (I'm not entirely sure if it would make sense to also run it afterwards.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Vlad Golovkin	1ff1dc1c63	glsl/glcpp: Handle hex constants with 0X prefix GLSL 4.6 spec describes hex constant as: hexadecimal-constant: 0x hexadecimal-digit 0X hexadecimal-digit hexadecimal-constant hexadecimal-digit Right now if you have a shader with the following structure: #if 0X1 // or any hex number with the 0X prefix // some code #endif the code between #if and #endif gets removed because the checking is performed only for "0x" prefix which results in strtoll being called with the base 8 and after encountering the 'X' char the strtoll returns 0. Letting strtoll detect the base makes this limitation go away and also makes code easier to read. From the strtoll Linux man page: "If base is zero or 16, the string may then include a "0x" prefix, and the number will be read in base 16; otherwise, a zero base is taken as 10 (decimal) unless the next character is '0', in which case it is taken as 8 (octal)." This matches the behaviour in the GLSL spec. This patch also adds a test for uppercase hex prefix. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-24 09:55:05 +10:00
Timothy Arceri	295f57e09a	mesa: rename api_validate.{c,h} -> draw_validate.{c,h} Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422	2018-04-24 09:23:30 +10:00
Dave Airlie	a90c9f33cf	ac/radv/radeonsi: refactor harvest config register getters. This refactors the code out to share it between radv and radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 09:08:34 +10:00
Dave Airlie	8e4d54505a	radv: only set raster_config_1 outside the index registers. This follows what radeonsi does. Ported from radeonsi: radeonsi: emit PA_SC_RASTER_CONFIG_1 only once Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 09:08:34 +10:00
Dave Airlie	f77caa7411	ac/radv/radeonsi: refactor max simd waves into common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:08:33 +10:00
Dave Airlie	899df55ee0	ac/radv/radeonsi: refactor raster_config default values getters. This just makes this common code between the two drivers. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:07:51 +10:00
Dave Airlie	8de7ff91be	radeonsi: use common gs_table_depth code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:43 +10:00
Dave Airlie	9afe9c0fe2	radv: use common gs_table_depth code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:43 +10:00
Dave Airlie	5e2ef28390	ac/info: move gs table depth to common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:38 +10:00
Dave Airlie	b25f6cde89	radeonsi: don't runtime check gs table info We can just unreachable here, this aligns with radv code, makes it easier to move to common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:29 +10:00
Dave Airlie	40783a7fa3	radv/gfx9: don't use gs_table_depth on gfx9. Missed this on initial radeonsi port, we shouldn't use this value on gfx9, but also in gfx8 only for when we have a geom shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-04-24 09:04:42 +10:00
Jason Ekstrand	de1f22d595	i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_* They are send messages and this makes size_read() and mlen agree. For both of these opcodes, the payload is just a dummy so mlen == 1 and this should decrease register pressure a bit. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable@lists.freedesktop.org	2018-04-23 14:04:42 -07:00
Samuel Pitoiset	d136a5fad9	ac: fix the number of coordinates for ac_image_get_lod and arrays This fixes crashes for the following CTS: dEQP-VK.glsl.texture_functions.query.texturequerylod.* Cubemaps are the same as 2D arrays. Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 21:48:38 +02:00
Lionel Landwerlin	2964e16e51	i965: perf: enable GPA query statistics The combination of GPA/MDAPI components expects a particular name & layout for their pipeline statistics query. v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel) v3: Add curly braces (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	2e3025c817	i965: perf: add support for raw queries The INTEL_performance_query extension provides a list of queries that a user can select to monitor a particular workload. Each query reports different sets of counters (roughly looking at different parts of the hardware, i.e. caches/fixed functions/etc...). Each query has an associated configuration that we need to program into the hardware before using the query. Up to now, we provided predefined queries. This change allows the user to build their own query (and associated configuration) externally, and have the i965 driver use that configuration through a new query named: Intel_Raw_Hardware_Counters_Set_0_Query When this query is selected, the i965 driver will report raw counters deltas (meaning their values need to be interpreted by the user, as opposed to existing queries that provide human readable values). This change is also useful for debug purposes for building new pre-defined queries and verifying the underlying numbers make sense before writing equations for user readable output. This change's purpose is also to enable GPA. GPA uses a library called MDAPI that processes raw counter data. MDAPI expects raw data to have a certain layout (per generation which is a bit unfortunate...). This change also embeds the expected data layouts. v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel) v3: Don't assert on cherryview for gen7... (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	c61d445a5a	i965: perf: read slice/unslice frequencies from OA reports v2: Add comment breaking down where the frequency values come from (Ken) v3: More documentation (Ken/Lionel) Adjust clock ratio multiplier to reflect the divider's behavior (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	43fcb72d2c	i965: perf: snapshot RPSTAT register This register contains the current/previous frequency of the GT, it's one of the values GPA would like to have as part of their queries. v2: Don't use this register on baytrail/cherryview (Ken) Use GET_FIELD() macro (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	d71b442416	i965: perf: extract utility functions We would like to reuse a number of the functions and structures in another file in a future commit. We also move the previous content of brw_performance_query.h into brw_performance_query_metrics.h to be included by generated metrics files. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Samuel Pitoiset	e37e643589	ac: teach get_ac_sampler_dim() about subpass attachments Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 19:10:56 +02:00
Samuel Pitoiset	84fef802fb	ac/nir: add missing round_slice for 1D arrays This fixes a bunch of CTS fails with 1D arrays: dEQP-VK.glsl.texture_functions.texture.sampler1darray_ Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 19:10:52 +02:00
Dylan Baker	10e4290524	bin/install_megadrivers: rename a few variables to make things clearer Originally the "each" variable was just a part of the "drivers" variable. It's not anymore so it's a bit ambiguous. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-23 09:57:35 -07:00
Dylan Baker	ae3f45c11e	bin/install_megadrivers: fix DESTDIR and -D*-path This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when those paths are absolute. Currently due to the way python's os.path.join handles absolute paths these will ignore DESTDIR, which is bad. This fixes them to be relative to DESTDIR if that is set. Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-23 09:57:35 -07:00
Dylan Baker	dbf5b772b3	compiler/glsl: close fd's in glcpp_test.py I would have thought falling out of scope would allow the gc to collect these, but apparently it doesn't, and this hits an fd limit on macos. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133 Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2018-04-23 09:55:17 -07:00
Bas Nieuwenhuizen	0e945fdf23	nir: Do not use progress for unreachable code in return lowering. We seem to use progress for two cases: 1) When we lowered some returns. 2) When we remove unreachable code. If just case 2 happens we assert as state->return_flag has not been allocated yet, but we are still trying to insert all predicates based on it. This splits the concerns. We only use progress internally for case 1 and then keep track of 2 in a separate variable to indicate progress in the return value of the pass. This is slightly better than transforming the assert into if (!state->return_flag) return, as the solution in this patch avoids inserting predicates even if some other part of the might need them. Fixes: `6e22ad6edc` "nir: return early when lowering a return at the end of a function" CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-23 16:55:15 +02:00
Józef Kucia	8328c64eb1	radv: advertise 8 bits of subpixel precision for viewports This is what radeonsi does. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-23 11:16:11 +02:00
Johan Klokkhammer Helsing	dab02dea34	st/dri: Fix dangling pointer to a destroyed dri_drawable If an EGLSurface is created, made current and destroyed, and then a second EGLSurface is created, the second malloc in driCreateNewDrawable may return the same pointer address the first surface's drawable had. Consequently, when dri_make_current later tries to determine if it should update the texture_stamp it compares the surface's drawable pointer against the drawable in the last call to dri_make_current and assumes it's the same surface (which it isn't). When texture_stamp is left unset, then dri_st_framebuffer_validate thinks it has already called update_drawable_info for that drawable, leaving it unvalidated and this is when bad things start to happen. In my case it manifested itself by the width and height of the surface being unset. This is fixed by setting the pointer to NULL before freeing the surface. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126 Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-23 04:25:40 -04:00
Ilia Mirkin	5428066f5e	nv50/ir: make a copy of tex src if it's referenced multiple times For nv50 we coalesce the srcs and defs into a single node. As such, we can end up with impossible constraints if the source is referenced after the tex operation (which, due to the coalescing of values, will have overwritten it). This logic already exists for inserting moves for MERGE/UNION sources. It's the exact same idea here, so leverage that code, which also includes a few optimizations around not extending live ranges unnecessarily. Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-22 23:03:16 -04:00
Lepton Wu	6c5abb68c7	virgl: disable virgl when no 3D for virtio gpu. If users are running mesa under an old version of qemu or have turned off GL at runtime, the virtio gpu driver doesn't actually work. Add a detection here so mesa can fall back to software rendering. v2: - move detection from loader to virgl (Ilia, Emil) Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-04-23 12:35:29 +10:00
Dave Airlie	a8420e2530	radv: mark const structs as extern in header file to avoid lto damage The copr repo from che was using LTO and he reported radv broke recently with it. When testing with lto builds here I noticed that we weren't seeing any instance extensions reported. It appears LTO was treating the const without extern as an empty struct, this is possibly a gcc bug, but we can work around it just by marking these with extern. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-04-23 05:55:22 +10:00
Dylan Baker	f8c4716854	Bump version after 18.1 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-22 09:35:56 -07:00
Ilia Mirkin	3f1cad48b8	gallium/tests/trivial: fix viewport depth transform These were getting mapped off into outer space, which would cause nv50 and nvc0 to clip the primitives (as depth_clip was enabled). These drivers are configured to clip everything outside the [0, 1] range, even though the hardware supports other view settings. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-04-21 23:31:48 -04:00
Ilia Mirkin	fe8b6d7e1f	trace: allow image resource to be null Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-21 23:29:39 -04:00
Karol Herbst	63572091b5	nv50/ir/ra: prefer def == src2 for fma with immediates on nvc0 This helps with the PostRALoadPropagation pass moving long immediates into FMA/MAD instructions. changes in shader-db: total instructions in shared programs : 5894114 -> 5886074 (-0.14%) total gprs used in shared programs : 666558 -> 666563 (0.00%) total shared used in shared programs : 520416 -> 520416 (0.00%) total local used in shared programs : 53524 -> 53524 (0.00%) total bytes used in shared programs : 54006744 -> 53932472 (-0.14%) local shared gpr inst bytes helped 0 0 2 4192 4192 hurt 0 0 7 9 9 Signed-off-by: Karol Herbst <karolherbst@gmail.com> [imirkin: minor edits to separate nv50 and nvc0+ cases] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-21 10:53:59 -04:00
Rhys Perry	cc35b76e99	docs/features: mark GL_ARB_post_depth_coverage as DONE for nvc0 This was done a while ago but never marked on features.txt. Note that this is only supported on GM200+. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-21 10:02:55 -04:00
Dylan Baker	6754c2e83d	autotools: Include new meson files Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:56 -07:00
Dylan Baker	5c8e2501a6	autotools: Add passes.h to sources so it will be included in the tarball This was introduced in commit `8f848ada8a` but not added to the sources list, which is necessary for it to be included in release tarballs. Fixes: `8f848ada8a` ("swr/rast: Start refactoring of builder/packetizer.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:54 -07:00
Dylan Baker	cfd7d2ba0d	autotools: include include/vulkan headers This is needed to provide vk_android_native_buffer.h for vk_enum_to_str. v2: - remove accidentally included changes Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:49 -07:00
Rhys Perry	a0e57432b7	nvc0: fix line width on GM20x+ This has the side-effect of fixing polygon-offset piglit test failures. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-20 20:43:59 -04:00
Nanley Chery	7b20329107	i965/miptree: Delete an unused function We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once that happens, this function no longer makes sense. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-20 17:14:37 -07:00
Nanley Chery	010abacc95	i965/miptree: Don't leak the clear_color_bo Free the clear_color_bo in addition to freeing the intel_miptree_aux_buffer which holds the reference to it. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-20 17:14:37 -07:00
Jason Ekstrand	9d2ef3c9ec	i965/blorp: Do the gen11 BTI flush Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-20 16:30:14 -07:00
Jason Ekstrand	185630c6bc	anv/blorp: Do the gen11 BTI flush Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-20 16:30:14 -07:00
Lucas Stach	52e93e309f	etnaviv: fix texture_format_needs_swiz memcmp returns 0 when both swizzles are the same, which means we don't need any hardware swizzling. texture_format_needs_swiz should return true when the return value of the memcmp is non-zero. Fixes: `751ae6afbe` ("etnaviv: add support for swizzled texture formats") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2018-04-20 18:54:10 +02:00
Samuel Pitoiset	8f13975713	ac/nir: fix image dimension for subpass attachments For subpass attachments we need one more coordinate with the layer, so make them array types. This fixes a bunch of CTS fails with RADV. Fixes: `24fb3e6aa1` ("ac/nir: use ac_build_image_opcode for image intrinsics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 18:44:51 +02:00
Bas Nieuwenhuizen	e1df849c3c	radv: Mark GTT memory as device local for APUs. Otherwise a lot of games complain about not having enough memory, and it is sort of local so this seems reasonable to me. CC: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-20 18:16:16 +02:00
Samuel Pitoiset	fedd0a4215	radv/winsys: allow to submit up to 4 IBs for chips without chaining The SI family doesn't support chaining which means the maximum size in dwords per CS is limited. When that limit was reached we failed to submit the CS and the application crashed. This patch allows to submit up to 4 IBs which is currently the limit, but recent amdgpu supports more than that. Please note that we can reach the limit of 4 IBs per submit but currently we can't improve that. The only solution is to upgrade libdrm. That will be improved later but for now this should fix crashes on SI or when using RADV_DEBUG=noibs. Fixes: `36cb5508e8` ("radv/winsys: Fail early on overgrown cs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 18:12:26 +02:00
Stefan Schake	ff904978a1	gallium/util: Android backtrace support We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries. The officially sanctioned way of obtaining backtraces is through the Android own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only. Signed-off-by: Stefan Schake <stschake@gmail.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-20 18:49:49 +03:00
Stefan Schake	2abd4f4b49	gallium/util: Don't stub u_debug_stack on Android The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation. Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-20 18:49:37 +03:00
Samuel Pitoiset	dd069e9b41	ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex This fixes a ton of CTS crashes. Fixes: `c366f422f0` ("nir: Offset vertex_id by first_vertex instead of base_vertex") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 17:07:38 +02:00
Samuel Pitoiset	b21a4efb55	radv/winsys: allow local BOs on APUs Ported from RadeonSI. Local BOs ignore BO priorities, and we don't need those on APUs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:24 +02:00
Samuel Pitoiset	5c1233ed62	radv: use a global BO list only for VK_EXT_descriptor_indexing Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keep two paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:18 +02:00
Samuel Pitoiset	7bd5367546	Revert "radv: Don't store buffer references in the descriptor set." In order to reduce a performance regression introduced by `4b13fe55a4` ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit `ab6cadd3ec`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:13 +02:00
Jose Maria Casanova Crespo	eb96bd57c7	i965/fs: retype offset_reg to UD at load_ssbo All operations with offset_reg at do_vector_read are done with UD type. So copy propagation was not working through the generated MOVs: mov(8) vgrf9:UD, vgrf7:D This change allows removing the MOV generated for reading the first components for 16-bit and 64-bit ssbo reads with non-constant offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-20 13:30:12 +02:00
Nicolai Hähnle	24fb3e6aa1	ac/nir: use ac_build_image_opcode for image intrinsics So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:30:07 +02:00
Nicolai Hähnle	74063431f1	radeonsi: generate image load/store/atomic ops using ac_build_image_opcode In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:29:57 +02:00
Nicolai Hähnle	625dcbbc45	amd/common: pass address components individually to ac_build_image_intrinsic This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:52 +02:00
Nicolai Hähnle	f931583828	amd/common: pass new enum ac_image_dim to ac_build_image_opcode This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:40 +02:00
Nicolai Hähnle	9cb52d470a	radeonsi/nir: fix crash in test involving the sample mask Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:50 +02:00
Nicolai Hähnle	552bc37c6f	radeonsi/nir: set FS properties only when scanning a fragment shader Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:47 +02:00
Nicolai Hähnle	a807a9b215	ac/nir: fix atomic compare-and-swap The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:40 +02:00
Nicolai Hähnle	e788b987d8	radeonsi: fix error paths of si_texture_transfer_map trans is zero-initialized, but trans->resource is setup immediately so needs to be dereferenced. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:33 +02:00
Nicolai Hähnle	68ee1d5796	glsl: prevent spurious Valgrind errors when serializing NIR It looks as if the structure fields array is fully initialized below, but in fact at least gcc in debug builds will not actually overwrite the unused bits of bit fields. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:23 +02:00
Aaron Watry	354b12681b	clover: Fix host access validation for sub-buffer creation From CL 1.2 Section 5.2.1: CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY . Fixes CL 1.2 CTS test/api get_buffer_info v2: Correct host_access_flags check (Francisco) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-04-19 20:57:37 -05:00
Neil Roberts	c366f422f0	nir: Offset vertex_id by first_vertex instead of base_vertex base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Rob Clark <robdclark@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-19 15:57:45 -07:00
Neil Roberts	c4f30a9100	spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used for SpvBuiltInBaseVertex. Suggested by Jason. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-19 15:57:45 -07:00
Antia Puentes	c32e1035cb	intel: Handle firstvertex in an identical way to BaseVertex Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <Draw ID, 0, 0, 0> v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.	2018-04-19 15:57:45 -07:00
Neil Roberts	0c8395e15d	intel/compiler: Add a uses_firstvertex flag Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-19 15:57:45 -07:00
Antia Puentes	5ff848df7b	compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics This VS system value will contain the value passed as <basevertex> for indexed draw calls or the value passed as <first> for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to <first>. v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be generated. Reformat commit message to 72 columns. Reviewed-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-19 15:57:45 -07:00
Mike Lothian	051fddb4a9	meson: Build st_tests_common with gtest Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131 Fixes: `34cb4d0ebc` ("meson: build tests for gallium mesa state tracker") Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-19 09:04:51 -07:00
Bas Nieuwenhuizen	dffdef6737	radv: Add Vega M support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen	d1ce31d36c	radv: Add bound checking workaround for dynamic buffers. I have seen a few applications and games do the dynamic buffer bounds incorrectly, this make it easier to work around, e.g. for debugging. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:13:25 +02:00
Thomas Hellstrom	e0c08183fb	svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY extension to query whether an sRGB format is supported. That extension will query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than PIPE_BIND_DISPLAY_TARGET which is used when building the configs. We only return the correct value for PIPE_BIND_DISPLAY_TARGET. The inconsistency causes EGL to crash at surface initialization if sRGB is not supported. Fix this by supporting both bind flags. Testing done: piglit egl_gl_colorspace srgb Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-19 13:42:51 +02:00
Mike Lothian	79487c427e	swr: Fix include for createPromoteMemoryToRegisterPass Include llvm/Transforms/Utils.h with the newest LLVM 7 v2: Include with " " rather than < > (Vinson Lee) v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis) Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-04-19 00:39:04 -07:00
Samuel Pitoiset	2f63b3dd09	radv: enable DCC for MSAA 2x textures on VI under an option This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there are some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:55 +02:00
Samuel Pitoiset	dc3d39771f	radv: decompress DCC for multisampled source images before resolving Multisampled source images (ie. color attachments) can be now DCC compressed, so the driver needs to perform a DCC decompression pass before resolving Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:52 +02:00
Samuel Pitoiset	1aefb62f1e	radv: add a workaround for fast clears with DCC and MSAA textures This should be fixed at some point in order to improve performance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:50 +02:00
Samuel Pitoiset	373fa0b599	radv: allocate CMASK for DCC fast clear with MSAA CMASK is required because it should be cleared to 0xCCCCCCCC for MSAA textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:48 +02:00
Samuel Pitoiset	255506c4e0	radv: implement fast color clear for DCC with MSAA When DCC is enabled with MSAA textures, CMASK should be cleared to 0xCCCCCCCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:45 +02:00
Samuel Pitoiset	796b6f4aab	radv: make sure to sync after resolving using the compute path This fixes some random CTS failures: dEQP-VK.renderpass.multisample.*. Performing a fast-clear eliminate is still useless, but it seems that we need to sync. Found while running CTS with RADV_DEBUG=zerovram. Fixes: `56a171a499` ("radv: don't fast-clear eliminate after resolving a subpass with compute") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:55 +02:00
Samuel Pitoiset	4a698660ae	radv: dump the SHA1 of SPIRV in the hang report Might be useful for debugging purposes, especially when we want to replace a shader on the fly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen	0e10790558	radv: Enable VK_EXT_descriptor_indexing. This adds everything except non-uniform indexing, which needs a bit more work and testing. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	5f7ebb5206	spirv: Add support for runtime descriptor array cap. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	c48feaf2d1	spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	b5e04e9217	radv: Support allocating variable size descriptor sets. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	78c54acbe8	radv: Add support for variable descriptor set layouts. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	082c11e8a5	radv: Fix GetDescriptorSetLayoutSupport. The continue means we do alignment differently than during creation, making the buffer smaller than expected. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	d02bbde1a8	radv: Use sorted bindings for set layout creation. Previously we did not care about having the set storage in order, but for variable descriptor count we want the highest binding at the end of the storage. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	ab6cadd3ec	radv: Don't store buffer references in the descriptor set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	4b13fe55a4	radv: Keep a global BO list for VkMemory. With update after bind we can't attach bo's to the command buffer from the descriptor set anymore, so we have to have a global BO list. I am somewhat surprised this works really well even though we have implicit synchronization in the WSI based on the bo list associations and with the new behavior every command buffer is associated with every swapchain image. But I could not find slowdowns in games because of it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	22d6b89e39	spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Kenneth Graunke	da25ae92be	i965: Fix shadow batches to be the same size as the real BO. brw_bo_alloc may round up our allocation size to the next bucket size. In this case, we would malloc a shadow buffer that was the original intended size, but use bo->size (the larger size) for all of our checks. This could cause us to run off the end of the shadow buffer. v2: Actually use the new BO size (caught by Lionel) Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c7dcee58b5` (i965: Avoid problems from referencing orphaned BOs after growing.)	2018-04-18 13:55:08 -07:00
Marek Olšák	7bd24d951a	glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract This fixes some piglits. Cc: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 15:34:52 -04:00
Leo Liu	90de03708f	radeon/vce: disable vce dual pipe on VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:35 -04:00
Marek Olšák	c6f1d36019	radeonsi: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:33 -04:00
Marek Olšák	d6a66bc8db	amd/addrlib: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:32 -04:00
Marek Olšák	d15fb766aa	radeonsi/gfx9: fix a hang with an empty first IB This packet causes the no-op IB detection to fail, so the IB is always submitted. Also fix the no-op IB detection by moving the begin call. Cc: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:42:06 -04:00
Dylan Baker	d28c246501	meson: build graw tests This only enables the null and xlib target, so no windows support yet. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	34cb4d0ebc	meson: build tests for gallium mesa state tracker v2: - Fix typo Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	de01018293	meson: build gallium unit tests v2: - gate unit tests on swrast being enabled (Eric A) v3: - rebase on libtrace being merged with gallium auxiliary Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v2)	2018-04-18 09:03:57 -07:00
Dylan Baker	4c794c7834	meson: Build gallium trivial tests Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	7fee8fed16	meson: Remove TODO about mesa/main tests They're already done. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	5d16c86add	meson: enable glcpp test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	db8cd8e367	glcpp/tests: Convert shell scripts to a python script This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that accepts arguments for each line ending type. This should allow for better reporting to users. v2: - Use $PYTHON2 to be consistent with other tests in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	8cb96c4031	glsl/tests: Remove unused compare_ir.py script Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	877d250ea1	meson: enable optimization-test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	97c28cb082	glsl/tests: Convert optimization-test.sh to pure python This patch converts optimization-test.sh to python, in this process it removes external shell dependencies including diff. It replaces the python script that generates shell scripts with a python library that generates test cases and runs them using subprocess. v2: - use $PYTHON2 to be consistent with other tests in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	ad9c2f2018	meson: run glsl compiler warnings test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	3b52d29227	glsl/tests: reimplement warnings-test in python This reimplements the test in python with a shell script wrapper that allows autotools to continue to run the test without realizing that anything has changed. Using python has two advantages, first it's portable so this test can be run on windows as well as Linux since it just requires python, no more diff, pwd or sh. It's also no longer tied to autotools implementation details, like the environment variables $srcdir and $abs_builddir, though the autotools shell wrapper still uses those, which makes it possible to run the test in meson. v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
George Kyriazis	12a002a3a1	swr/rast: Fix VGATHERPD lowering Also Implement VHSUBPS in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	99fe90722d	swr/rast: Replace x86 VMOVMSK with llvm-only implementation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	0899122c03	swr/rast: Optimize late/bindless JIT of samplers Add per-worker thread private data to all shader calls Add per-worker sampler cache and jit context Add late LoadTexel JIT support Add per-worker-thread Sampler / LoadTexel JIT Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	ec7154abc0	swr/rast: Implement VROUND intrinsic in x86 lowering pass Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	bb02da3c1b	swr/rast: Refactor to improve code sharing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	94ca1c018f	swr/rast: minimize codegen redundant work Move filtering of redundant codegen operations into gen scripts themselves Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	7f34860125	swr/rast: double-pump in x86 lowering pass Add support for double-pumping a smaller SIMD width intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	96ad8f5a23	swr/rast: Fix 64bit float loads in x86 lowering pass Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	1ffbbbee97	swr/rast: Add shader stats infrastructure (WIP) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a81c625cb7	swr/rast: Type-check TemplateArgUnroller Allows direct use of enum values in conversion to template args. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	2966ee1028	swr/rast: Add vgather to x86 lowering pass. Add support for generic VGATHERPD intrinsic in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	e4929b5d26	swr/rast: fix comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	670a99c233	swr/rast: add cvt instructions in x86 lowering pass Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	aa482014e5	swr/rast: Fix alloca usage in jitter Fix issue where temporary allocas were getting hoisted to function entry unnecessarily. We now explicitly mark temporary allocas and skip hoisting during the hoist pass. Should reduce stack usage. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	81371a5909	swr/rast: Change gfx pointers to gfxptr_t Changing type to gfxptr for indices and related changes to fetch and mem builder code. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	71239478d3	swr/rast: Fix byte offset for non-indexed draws Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c57b594317	swr/rast: Add support for setting optimization level for JIT compilation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	4f0df5e2f7	swr/rast: Adding translate call to builder_gfx_mem. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f135f54b18	swr/rast: Fix codegen for typedef types Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c5d7b37fe7	swr: add x86 lowering pass to fragment shader Needed because some FP paths (namely stipple) use gather intrinsics that now need to be lowered to x86. v2: fix typo in commit message Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9161c40d14	swr/rast: Enable generalized fetch jit Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some work needed to remove some simd8 double pumping for 16-wide target. Also removed unused non-gather load vertices path. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d73082b98b	swr/rast: Add builder_gfx_mem.{h|cpp} Abstract usage scenarios for memory accesses into builder_gfx_mem. Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer types for use by builder_mem. v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	1eb72673fc	swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86. Some more work to do before we can support simultaneous 8-wide and 16-wide and remove the VGATHERPS_16 version. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b15fb78df5	swr/rast: Cleanup of JitManager convenience types Small cleanup. Remove convenience types from JitManager and standardize on the Builder's convenience types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d68694016c	swr/rast: Lower PERMD and PERMPS to x86. Add support for providing an emulation callback function for arch/width combinations that don't map cleanly to an x86 intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	8f848ada8a	swr/rast: Start refactoring of builder/packetizer. Move x86 intrinsic lowering to a separate pass. Builder now instantiates generic intrinsics for features not supported by llvm. The separate x86 lowering pass is responsible for lowering to valid x86 for the target SIMD architecture. Currently it's a port of existing code to get it up and running quickly. Will eventually support optimized x86 for AVX, AVX2 and AVX512. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	ffc0aeb4ec	swr/rast: Simplify #define usage in gen source file Removed preprocessor defines from structures passed to LLVM jitted code. The python scripts do not understand the preprocessor defines and ignore them. So for fields that are compiled out due to a preprocessor define, the LLVM script accounts for them anyway because it doesn't know what the defines are set to. The sanitize defines for open source are fine in that they're safely used. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f36026ce2e	swr/rast: Move CallPrint() to a separate file Needed work for jit code debug. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	67c8bb4db7	swr/rast: Fix name mangling for LLVM pow intrinsic Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	7a5054aa1c	swr/rast: Add some archrast counters Hook up archrast counters for shader stats: instructions executed. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f52a501716	swr/rast: Code cleanup Removing some code that doesn't seem to do anything meaningful. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	093c1aee88	swr/rast: Add "Num Instructions Executed" stats intrinsic. Added a SWR_SHADER_STATS structure which is passed to each shader. The stats pass will instrument the shader to populate this. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	5fbee5e4ef	swr/rast: Add MEM_ADD helper function to Builder. mem[offset] += value This function will be heavily used by all stats intrinsics. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9103119cb3	swr/rast: Permute work for simd16 Fix slow permutes in PA tri lists under SIMD16 emulation on AVX Added missing permute (interlane, immediate) to SIMDLIB Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	4c69823d15	swr/rast: WIP builder rewrite (2) Finish up the remaining explicit intrinsic uses. At this point all explicit Intrinsic::getDeclaration() usage has been replaced with auto generated macros generated with gen_llvm_ir_macros.py. Going forward, make sure to only use the intrinsics here, adding new ones as needed. Next step is to remove all references to x86 intrinsics to keep the builder target-independent. Any x86 lowering will be handled by a separate pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c2163dc56a	swr/rast: Add autogen of helper llvm intrinsics. Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent. Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm macros for stacksave, stackrestore, popcnt. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	6427315e43	swr/rast: WIP builder rewrite. Start removing avx2 macros for functionality that exists in llvm. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a16f8e0554	swr/rast: LLVM 6 fix for getting masked gather intrinsic (also compatible with LLVM 4) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a92cc09c7a	swr/rast: Changes to allow jitter to compile with LLVM5 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	0f6fef9632	swr/rast: Add some archrast stats Add stats for degenerate and backfacing primitive counts Wire archrast stats for alpha blend and alpha test. pass value to jitter, upon return have archrast event increment a value Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b488028854	swr/rast: Silence some unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	e84bfec4ab	swr/rast: Add debug type info for i128 Help support debug info in 16 wide shaders. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a3edcfe1fb	swr/rast: Use blend context struct to pass params Stuff parameters into a blend context struct before passing down through the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	be6cf0fd7c	swr/rast: Introduce JIT_MEM_CLIENT Add assert for correct usage of memory accesses v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d34edffe48	swr/rast: Add some instructions to jitter VPHADDD, PMAXUD, PMINUD Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero	4aa03581b5	docs: update calendar, add news and link release notes to 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero	ad51d8871e	docs: add sha256 checksums for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `a1c421c638`)	2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero	76cadaa1de	docs: add release notes for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `8bd719e3fa`)	2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero	193d615917	docs: update calendar, add news and link release notes to 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero	6372227209	docs: add sha256 checksums for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cf0864dc63`)	2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero	6a1261bd09	docs: add release notes for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `6d88ea9dd4`)	2018-04-18 09:40:42 +00:00
Dylan Baker	b9ad5282ba	Revert "meson: add wrap for libdrm" This reverts commit `6217eedc9b`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:55 -07:00
Dylan Baker	efcbcfa7c8	Revert "Add subprojects directory and git ignore" This reverts commit `21e2e73f71`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)	5cf752b18b	meson: Version libMesaOpenCL like autotools does This is for parity with autotools. It names the library libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink. opencl_version now matches configure.ac's OPENCL_VERSION. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)	5bb98cfd92	meson: Add library versions to swr drivers This is for parity with autotools. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Dylan Baker	6217eedc9b	meson: add wrap for libdrm Currently this requires libdrm from git, since the version reported by meson is wrong.	2018-04-17 13:46:15 -07:00
Dylan Baker	21e2e73f71	Add subprojects directory and git ignore For meson wraps.	2018-04-17 13:46:15 -07:00
Samuel Pitoiset	893e19efb7	radv: fix scissor computation when using half-pixel viewport offset 'scale[i]' can be non-integer. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074 Fixes: `0f3de89a56` ("radv: Use the guard band.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-17 22:12:14 +02:00
Neil Roberts	608d70bc02	spirv: Accept doubles in FaceForward, Reflect and Refract The SPIR-V spec doesn’t specify a size requirement for these and the equivalent functions in the GLSL spec have explicit alternatives for doubles. Refract is a little bit more complicated due to the fact that the final argument is always supposed to be a scalar 32- or 16-bit float regardless of the other operands. However in practice it seems there is a bug in glslang that makes it convert the argument to 64-bit if you actually try to pass it a 32-bit value while the other arguments are 64-bit. This adds an optional conversion of the final argument in order to support any type. These have been tested against the automatically generated tests of glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch which tests it with quite a large range of combinations. The issue with glslang has been filed here: https://github.com/KhronosGroup/glslang/issues/1279 v2: Convert the eta operand of Refract from any size in order to make it eventually cope with 16-bit floats. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:11 +02:00
Neil Roberts	6e499572b9	spirv: Add a 64-bit implementation of OpIsInf The only change necessary is to change the type of the constant used to compare against. This has been tested against the arb_gpu_shader_fp64/execution/ fs-isinf-dvec tests using the ARB_gl_spirv branch. v2: Use nir_imm_floatN_t for the constant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:06 +02:00
Neil Roberts	696f4abcbc	spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins There is an existing macro that is used to choose between either a float or a double immediate constant based on the bit size of the first operand to the builtin. This is now changed to use the new nir_imm_floatN_t helper function to reduce the number of places that make this decision. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:03 +02:00
Neil Roberts	e7b2c125c3	nir/builder: Add a nir_imm_floatN_t helper This lets you easily build float immediates just given the bit size. If we have this single place here to handle this then it will be easier to add support for 16-bit floats later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:57:36 +02:00
Timothy Arceri	6e22ad6edc	nir: return early when lowering a return at the end of a function Otherwise we create unused conditional return flags and things get unnecessarily ugly fast when lowering nested functions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 14:17:56 +10:00
Timothy Arceri	d3cafc18fc	mesa: merge the driver functions DrawBuffers and DrawBuffer The extra params were unused by the drivers that used DrawBuffers. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-17 14:17:48 +10:00
Marc Dietrich	268d8f244b	glsl: fix gcc 8 parenthesis warning fixes warnings like this: [184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'. In file included from ../src/mesa/main/mtypes.h:48, from ../src/compiler/glsl_types.h:149, from ../src/compiler/glsl/lower_jumps.cpp:59: ../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list)': ../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses] for (__type (__inst) = (__type *)(__list)->head_sentinel.next; \ ^ ../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list' foreach_in_list(ir_instruction, node, list) { ^~~~~~~~~~~~~~~ Signed-off-by: Marc Dietrich <marvin24@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-17 11:53:59 +10:00
Rob Clark	2a55344e7d	compiler: int8/uint8 fixes A couple spots were missed for handling of the new INT8/UINT8 base type. Also de-duplicate get_base_type().. get_scalar_type() had nearly the same switch statement, with the exception that anything with base_type that was not scalar would return error_type. So just handle that one special case in get_scalar_type(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-16 20:41:18 -04:00
Marek Olšák	60299e9abe	radeonsi: don't emit partial flushes for internal CS flushes only Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	692f550740	winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE There is a kernel patch that adds the new flag. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	1b3199d14d	radeonsi: implement mechanism for IBs without partial flushes at the end (v6) (This patch doesn't enable the behavior. It will be enabled in a later commit.) Draw calls from multiple IBs can be executed in parallel. v2: do emit partial flushes on SI v3: invalidate all shader caches at the beginning of IBs v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed, only do this for flushes invoked internally v5: empty IBs should wait for idle if the flush requires it v6: split the commit If we artificially limit the number of draw calls per IB to 5, we'll get a lot more IBs, leading to a lot more partial flushes. Let's see how the removal of partial flushes changes GPU utilization in that scenario: With partial flushes (time busy): CP: 99% SPI: 86% CB: 73% Without partial flushes (time busy): CP: 99% SPI: 93% CB: 81% Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Erico Nunes	d19b488339	nir: fix ir_binop_gequal glsl_to_nir conversion ir_binop_gequal needs to be converted to nir_op_sge when native integers are not supported in the driver. Otherwise it becomes no different than ir_binop_less after the conversion. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-16 07:59:25 -07:00
Jason Ekstrand	72ab499c9f	anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-16 07:59:25 -07:00
Jason Ekstrand	35ef0f767e	vulkan: Update the XML and headers to 1.1.73 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-16 07:59:25 -07:00
Samuel Pitoiset	62510846b6	radv: clean up radv_decompress_resolve_subpass_src() To handle the source color image transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:05 +02:00
Samuel Pitoiset	56a171a499	radv: don't fast-clear eliminate after resolving a subpass with compute That looks useless, and I think radv_handle_image_transition() will do a fast-clear eliminate because it's called after the resolve. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:02 +02:00
Samuel Pitoiset	7e84d69861	radv: handle CMASK/FMASK transitions only if DCC is disabled DCC implies a fast-clear eliminate, so I think this sounds reasonable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:59 +02:00
Samuel Pitoiset	584d1f2711	radv: merge radv_handle_{dcc,cmask}_image_transition() functions Into radv_handle_color_image_transition(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:56 +02:00
Samuel Pitoiset	d5812b900b	radv: add radv_init_color_image_metadata() helper In order to separate initialization from decompression. In the future, that will allow us to init DCC/FMASK/CMASK in one shot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:54 +02:00
Samuel Pitoiset	fde7b90ecf	radv: make radv_initialise_cmask() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:51 +02:00
Samuel Pitoiset	790f6e4718	radv: clean up radv_handle_image_transition() a bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:49 +02:00
Samuel Pitoiset	6967d32beb	radv: add radv_handle_color_image_transition() helper To handle CMASK, FMASK and DCC transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:45 +02:00
Samuel Pitoiset	c6b1f1c97a	radv: handle DCC image transitions before CMASK/FMASK transitions Mostly because DCC implies a fast-clear eliminate and we should be able to skip some DCC decompressions by setting a predicate like for CMASK and FMASK. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:42 +02:00
Samuel Pitoiset	79c87a45b6	radv: disable prediction only if it has been enabled When decompressing DCC we don't enable it, so it's useless to disable it. This reduces the number of prediction packets sent to the GPU when performing color decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen	b0e3a9b19f	ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too. No clue how I missed those ... Fixes: `4503ff760c` "ac/nir: Add workaround for GFX9 buffer views." CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 11:55:48 +02:00
Brian Paul	6a519a157b	gallium/osmesa: link with winsock2 library on Windows To fix the MSVC build. The build broke because we started to compile the ddebug code on Windows after the mtypes.h changes. Building ddebug caused us to also use the u_network.c code for the first time. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	201c08c463	gallium/util: put (void) in a few function signatures To match the header file. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	65d1040435	ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build Don't include Unix headers or use Unix functions when building with MSVC. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	6d41edbf8a	mesa: protect #include of unistd.h with _MSC_VER check unistd.h is unix only. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	bf67fec235	mesa: remove unused 'i' in dimensions_error_check() Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-13 19:06:55 -06:00
Marek Olšák	976db661ff	radeonsi: restore si_emit_cache_flush call at the end of IBs Fixes: `918b798668` "radeonsi: make sure CP DMA is idle at the end of IBs"	2018-04-13 20:05:53 -04:00
Daniel Schürmann	f2c6a55061	radv: enable subgroup capabilities Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	4b0616e533	ac: handle subgroup intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	d5f7ebda3e	ac: add LLVM build functions for subgroup intrinsics Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:09 +02:00
Daniel Schürmann	d19f20e793	ac: make ballot and umsb capable of 64bit inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	79701b414c	nir: lower 64bit subgroup shuffle intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	fd5b0e0a64	nir/spirv: Fix warning and add missing breaks. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	54937d820d	nir: use ballot_bit_size when lowering ballot_bitfield_extract Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	4d802df3aa	nir: subgroups instructions for 64bit ballot sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Brian Paul	1098c18af3	glsl: #undef THIS macro to fix MSVC build THIS is a macro in one of the MSVC header files. It's also a token in the GLSL lexer. This causes a compilation failure with MSVC. This issue seems to be newly exposed after the recent mtypes.h removal patches. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:53:12 -06:00
Brian Paul	5dc7233f44	glsl: rename 'interface' var to 'iface' to fix MSVC build The recent mtypes.h removal patches seem to have exposed an MSVC issue where 'interface' is defined as a macro in an MSVC header file. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:53:08 -06:00
Brian Paul	73f1e33d34	mesa: remove snprintf macro in imports.h to fix MSVC build snprintf is a macro in the MSVC stdio.h header and we needed to include that header before imports.h where we also defined an snprintf macro. Otherwise, the MSVC build would fail. The recent mtypes.h removal patches seem to have exposed this issue. This patch simply removes our snprintf macro and replaces one use of it in teximage.c with _mesa_snprintf(). There are other calls to snprintf() in DRI drivers, but none of them are built on Windows. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:52:57 -06:00
Lionel Landwerlin	0a6547014f	anv: fix number of planes for depth & stencil We're not counting correctly with depth & stencil images. Additionally we need to move an assert that is meant just for color attachments. v2: Move an assert() (Reported by Craig) Change aspect mask checks (Francesco) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a62a979335` ("anv: enable multiple planes per image/imageView") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-13 11:44:53 -07:00
Marek Olšák	6ff0c6f4eb	gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times which also simplifies the build scripts.	2018-04-13 14:08:14 -04:00
Marek Olšák	918b798668	radeonsi: make sure CP DMA is idle at the end of IBs	2018-04-13 14:07:20 -04:00
Marek Olšák	b6ad7075b9	gallium/hud: add a simple HUD view that only draws text Add this prefix to the env var: "simple," For example: GALLIUM_HUD=simple,fps The X coordinates are the same, but the Y coordinates are different, because there is only text. '+' happens to behave the same as "\n". ',' happens to behave the same as "\n\n".	2018-04-13 14:07:20 -04:00
Dylan Baker	506671594a	mesa: Include unistd.h in program_lexer Which was previously provided implicitly by mtypes.h CC: Marek Olšák <marek.olsak@amd.com> CC: Mark Janes <mark.a.janes@intel.com> Fixes: `43d66c8c2d` ("mesa: include mtypes.h less") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-13 11:03:37 -07:00
Marek Olšák	9a1363427e	radeonsi: always prefetch later shaders after the draw packet so that the draw is started as soon as possible. v2: only prefetch the API VS and VBO descriptors Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	e4b7974ec7	radeonsi: emit shader pointers before cache flushes & waits This code was written with the constant engine in mind. We can simplify it now. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	82799c5035	radeonsi/gfx9: don't use the workaround for gather4 + stencil it doesn't seem to be needed. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	1372ccfe6f	radeonsi: disable TC-compat HTILE on Tonga and Iceland Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	afe0bd2c55	radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled just pass the flag that indicates it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	29a09e1d38	radeonsi: don't flush HTILE if there is no HTILE clear Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	5fb31a1734	radeonsi: merge 2 identical if statements in si_clear and other cleanups Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	8a28679987	radeonsi: don't do GFX-specific texture decompression for compute Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	307bccc6df	radeonsi: simplify generating the renderer string HAVE_LLVM > 0 is a tautology. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	a3b785be4d	winsys/amdgpu: allow local BOs on APUs Local BOs ignore BO priorities, and we don't need those on APUs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Juan A. Suarez Romero	b37b35a5d2	getteximage: assume texture image is empty for non defined levels Current code is returning an INVALID_OPERATION when trying to use GetTextureImage() on a level that has not been explicitly defined. That is, we define a mipmapped Texture2D with 3 levels, and try to use GetTextureImage() for the 4th level, and INVALID_OPERATION is returned. Nevertheless, such a case is not listed as an error in OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), where all the error cases for this function are defined. So it seems this is a valid operation. On the other hand, in section 8.22 ("Texture State and Proxy State") it states: "Each initial texture image is null. It has zero width, height, and depth, internal format RGBA, or R8 for buffer textures, component sizes set to zero and component types set to NONE, the compressed flag set to FALSE, a zero compressed size, and the bound buffer object name is zero." We can assume that we are reading this initialized empty image when calling GetTextureImage() with a non defined level. With this assumption, we will reach one of the other error cases defined for the functions. In the end this means that we would end up returning INVALID_VALUE to the caller. This fixes arb_get_texture_sub_image piglit tests. v2: just return INVALID_VALUE if there is no defined level (Iago) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:37 +02:00
Juan A. Suarez Romero	8d411eb6b3	gettextureimage: verify cube map is complete According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), relative to errors for GetTexImage, GetTextureImage, and GetnTexImage: "An INVALID_OPERATION error is generated by GetTextureImage if the effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and the texture object is not cube complete or cube array complete, respectively." This fixes arb_get_texture_sub_image piglit tests. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:27 +02:00
Juan A. Suarez Romero	42891dbaa1	gettextsubimage: verify zoffset and depth are correct According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), relative to errors for GetTextureSubImage() function: "An INVALID_VALUE error is generated if the effective target is TEXTURE_1D and either yoffset is not zero, or height is not one. An INVALID_VALUE error is generated if the effective target is TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and either zoffset is not zero, or depth is not one." The commit fixes the check for height and depth. This fixes arb_get_texture_sub_image piglit tests. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:27 +02:00
Timothy Arceri	a63e69f5f0	mesa: free debug messages when destroying the debug state Fixes: `04a8baad37` "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281	2018-04-13 22:20:48 +10:00
Timothy Arceri	c500ab2735	mesa: fix x86 builds Fixes: `43d66c8c2d` "mesa: include mtypes.h less"	2018-04-13 22:13:46 +10:00
Marek Olšák	e961824ba8	Fix make check	2018-04-12 20:03:13 -04:00
Marek Olšák	6d6b1b3890	Fix scons build	2018-04-12 19:55:01 -04:00
Marek Olšák	43d66c8c2d	mesa: include mtypes.h less - remove mtypes.h from most header files - add main/menums.h for often used definitions - remove main/core.h v2: fix radv build Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:30 -04:00
Marek Olšák	57f4268da4	mesa: include dispatch.h less Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:28 -04:00
Bas Nieuwenhuizen	6ff98dbf7c	radv: Implement VK_EXT_vertex_attribute_divisor. Pretty straightforward, just pass the divisors through the shader key and then do an LLVM divide. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen	7eff8d7d35	ac/surface: Allow S swizzle for displayable surfaces. For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts S swizzles. At the same time addrlib prefers D swizzles if allowed, so we can just allow S swizzles as fallback. Fixes: `b64b712558` "ac/surface/gfx9: request desired micro tile mode explicitly" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-12 21:24:55 +02:00
Eric Anholt	7bc77dbb00	broadcom/vc5: Fix a stray '`' in a comment.	2018-04-12 11:20:50 -07:00
Eric Anholt	b225cdcecc	broadcom/vc5: Update the UABI for in/out syncobjs This is the ABI I'm hoping to stabilize for merging the driver. seqnos are eliminated, which allows for the GPU scheduler to task-switch between DRM fds even after submission to the kernel. In/out sync objects are introduced, to allow the Android fencing extension (not yet implemented, but should be trivial), and to also allow the driver to tell the kernel to not start a bin until a previous render is complete.	2018-04-12 11:20:50 -07:00
Eric Anholt	d9c525ed22	broadcom/vc5: Drop the finished_seqno optimization. With the DRM scheduler changes, I'm about to remove all seqnos from the UABI.	2018-04-12 11:20:50 -07:00
Eric Anholt	aedfd8ede4	broadcom/vc5: Drop the throttling code. Since I'll be using the DRM scheduler, we won't run into the problem of a runaway client starving other clients of GPU time.	2018-04-12 11:20:50 -07:00
Eric Anholt	dd9c476165	broadcom/vc5: Move flush_last_load into load_general, like for stores. This should avoid mistakes with not flushing as we change the series of loads. Already, it fixes a hopefully unreachable case where we were emitting just the TILE_COORDINATES and not the dummy store that needs to go with it.	2018-04-12 11:20:50 -07:00
Eric Anholt	6a21a582fb	broadcom/vc5: Rename read_but_not_cleared to loads_pending. This is a more obvious name for what the variable means, and matches what it's called for stores.	2018-04-12 11:20:50 -07:00
Eric Anholt	b946218c48	broadcom/vc5: Refactor the implicit coords/stores_pending logic. Since I just fixed a bug due to forgetting to do these right, do it once in the helper func.	2018-04-12 11:20:50 -07:00
Eric Anholt	ec60559f97	broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores. Fixes a simulator assertion failure in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8	2018-04-12 11:20:50 -07:00
Eric Anholt	8f2999120d	broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores. This was dying in the simulator on GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit. We'll need to do basically the same thing as Z32F/S8 does in the MSAA Z24S8 case.	2018-04-12 11:20:50 -07:00
Eric Anholt	7553cbfc9d	broadcom/vc5: Fix MSAA depth/stencil size setup. The v3dX(get_internal_type_bpp_for_output_format)() call only handles color output formats (which overlap in enum numbers with depth output formats), so for depth we just need to take the normal cpp times the number of samples.	2018-04-12 11:20:50 -07:00
Leo Liu	fa328456e8	st/va: add VP9 config to enable profile2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	dac0024b58	radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	f1277dabbc	radeon/vcn: add VP9 profile2 support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	e8724bd1e3	vl: add VP9 profile2 support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	d9a31341ec	st/va: add VP9 config to enable profile0 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	ef52ba8aa0	st/va: parse VP9 uncompressed frame header To get some of UVD required parameters. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	bf0f5fe929	st/va: add slice parameter handling for VP9 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	05176fe65e	st/va: add picture parameter handling for VP9 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	9ff83d13e5	st/va: add handles for VP9 buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	30438fbf46	st/va: add VP9 picture to context Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	0f373a65e5	radeonsi: cap VP9 support to progressive buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	6adaf6de6d	radeonsi: cap VP9 support to Raven Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	905368669d	radeon/vcn: add VP9 context buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	e2ce7c0a62	radeon/vcn: get VP9 msg buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	6000bdb75b	radeon/vcn: fill probability table to prob buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	93c0f3cc13	radeon/vcn: add VP9 message buffer interface Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	caaecf3d3b	radeon/vcn: add VP9 prob table buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	b628ea039f	vl: add VP9 probability tables Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	eb22785bd8	radeon/vcn: add VP9 dpb buffer size The current FW has restricted the size to the worst case, and the new dynamic dpb buffer support is on the way from the firmware side; we will change accordingly. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	f73befdd9b	radeon/vcn: add VP9 stream type for decoder Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	ca1646db89	vl: add VP9 picture description Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	29bc354684	vl: add VP9 profile0 and format Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Samuel Pitoiset	9eac49246c	radv: fix radv_layout_dcc_compressed() when image doesn't have DCC num_dcc_levels means that DCC is supported, but this doesn't mean that it's enabled by the driver. Instead, we should rely on radv_image_has_dcc(). This fixes some multisample regressions since `0babc8e5d6` ("radv: fix picking the method for resolve subpass") on Vega. This is because the resolve method changed from HW to FS, but those fails are totally unexpected, so there might be some differences between Polaris and Vega here. Fixes: `44fcf58744` ("radv: Disable DCC for GENERAL layout and compute transfer dest.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:46 +02:00
Samuel Pitoiset	ab0e625a67	radv: add radv_decompress_resolve_{subpass}_src() helpers This helper shares common code before resolving using either a fragment or a compute shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:44 +02:00
Samuel Pitoiset	ed93d90a67	radv: add radv_init_dcc_control_reg() helper And add some comments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:41 +02:00
Timothy Arceri	c7e3d31b0b	glsl: fix compat shaders in GLSL 1.40 The compatibility and core tokens were not added until GLSL 1.50, for GLSL 1.40 just assume all shaders built with a compat profile are compat shaders. Fixes rendering issues in Dawn of War II on radeonsi which has enabled OpenGL 3.1 compat support. Fixes: `a0c8b49284` "mesa: enable OpenGL 3.1 with ARB_compatibility" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807	2018-04-12 11:51:08 +10:00
Ian Romanick	f3b14ca2e1	mesa: Silence remaining unused parameter warnings in teximage.c src/mesa/main/teximage.c: In function ‘_mesa_test_proxy_teximage’: src/mesa/main/teximage.c:1301:51: warning: unused parameter ‘level’ [-Wunused-parameter] GLuint numLevels, GLint level, ^~~~~ src/mesa/main/teximage.c: In function ‘texsubimage_error_check’: src/mesa/main/teximage.c:2186:30: warning: unused parameter ‘dsa’ [-Wunused-parameter] bool dsa, const char callerName) ^~~ src/mesa/main/teximage.c: In function ‘copytexture_error_check’: src/mesa/main/teximage.c:2297:32: warning: unused parameter ‘width’ [-Wunused-parameter] GLint width, GLint height, GLint border ) ^~~~~ src/mesa/main/teximage.c:2297:45: warning: unused parameter ‘height’ [-Wunused-parameter] GLint width, GLint height, GLint border ) ^~~~~~ src/mesa/main/teximage.c: In function ‘check_rtt_cb’: src/mesa/main/teximage.c:2679:21: warning: unused parameter ‘key’ [-Wunused-parameter] check_rtt_cb(GLuint key, void data, void *userData) ^~~ src/mesa/main/teximage.c: In function ‘override_internal_format’: src/mesa/main/teximage.c:2756:55: warning: unused parameter ‘width’ [-Wunused-parameter] override_internal_format(GLenum internalFormat, GLint width, GLint height) ^~~~~ src/mesa/main/teximage.c:2756:68: warning: unused parameter ‘height’ [-Wunused-parameter] override_internal_format(GLenum internalFormat, GLint width, GLint height) ^~~~~~ src/mesa/main/teximage.c: In function ‘texture_sub_image’: src/mesa/main/teximage.c:3293:24: warning: unused parameter ‘dsa’ [-Wunused-parameter] bool dsa) ^~~ src/mesa/main/teximage.c: In function ‘can_avoid_reallocation’: src/mesa/main/teximage.c:3788:53: warning: unused parameter ‘x’ [-Wunused-parameter] mesa_format texFormat, GLint x, GLint y, GLsizei width, ^ src/mesa/main/teximage.c:3788:62: warning: unused parameter ‘y’ [-Wunused-parameter] mesa_format texFormat, GLint x, GLint y, GLsizei width, ^ src/mesa/main/teximage.c: In function ‘valid_texstorage_ms_parameters’: 
src/mesa/main/teximage.c:5987:40: warning: unused parameter ‘samples’ [-Wunused-parameter] GLsizei samples, unsigned dims) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-11 16:20:56 -07:00
Ian Romanick	fa44941072	mesa: Silence unused parameter warning in compressedteximage_only_format Passing ctx to compressedteximage_only_format was the only use of the ctx parameter in _mesa_format_no_online_compression, so that parameter had to go too. ../../SOURCE/master/src/mesa/main/teximage.c: In function ‘compressedteximage_only_format’: ../../SOURCE/master/src/mesa/main/teximage.c:1355:57: warning: unused parameter ‘ctx’ [-Wunused-parameter] compressedteximage_only_format(const struct gl_context *ctx, GLenum format) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-11 16:20:42 -07:00
Nanley Chery	377da9eb78	blorp: Silence unused function warnings vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from vulkan/genX_blorp_exec.c:35:0: ./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function] blorp_emit_memcpy(struct blorp_batch batch, ^~~~~~~~~~~~~~~~~ genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from genX_blorp_exec.c:33:0: ../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function] blorp_emit_memcpy(struct blorp_batch batch, ^~~~~~~~~~~~~~~~~ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-11 13:04:49 -07:00
Caio Marcelo de Oliveira Filho	89542c9ce6	nir/vars_to_ssa: Simplify node matching code The matching code doesn't make real use of the return value. The main function return value is ignored, and while the worker function propagate its return value, the actual callback never returns false. v2: Style fixes. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho	fac9dd1b93	nir/vars_to_ssa: Remove an unnecessary deref_arry_type check Only fully-qualified direct derefs, collected in direct_deref_nodes, are checked for aliasing, so it is already known up front that they have only array derefs of type direct. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho	1c9bccdeb8	nir/vars_to_ssa: Rework register_variable_uses() The return value was needed to make use of the old nir_foreach_block helper, but not needed anymore with the macro version. Then go one step further and move the foreach directly into the register variable uses function. v2: Move foreach to register_variable_uses(). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Jason Ekstrand	bc2b170d68	nir: Use nir_builder in lower_io_to_temporaries Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-11 11:03:22 -07:00
Bas Nieuwenhuizen	bd95397d65	radv: Enable RB+ on Raven. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 18:46:55 +02:00
Tapani Pälli	9f29b1a4c8	vulkan: fix build issue on android (both anv/radv) Fixes linking errors against: anv_GetPhysicalDeviceImageFormatProperties2KHR radv_GetPhysicalDeviceImageFormatProperties2KHR Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-11 13:55:49 +03:00
Nicolai Hähnle	41e6ffee49	radeonsi: correctly parse disassembly with labels LLVM now emits labels as part of the disassembly string, which is very useful but breaks the old parsing approach. Use the semicolon to detect the boundary of instructions instead of going by line breaks. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-11 12:44:30 +02:00
Nicolai Hähnle	0630e52c9e	radeonsi: pass -O halt_waves to umr for hang debugging This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-11 12:44:24 +02:00
Jason Ekstrand	69f447553c	vulkan: Drop vk_android_native_buffer.xml All the information in vk_android_native_buffer.xml is now in vk.xml. The only exception is the extension type attribute which we can work around in the generators while we wait for the XML to be fixed. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-10 19:29:49 -07:00
Jason Ekstrand	ae3a856c34	nir/lower_atomics: Rework the main walker loop a bit This replaces some "if (...) { }" with "if (...) continue;" to reduce nesting depth and makes nir_metadata_preserve conditional on progress for the given impl. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-10 19:28:49 -07:00
Bas Nieuwenhuizen	ed94638156	radv: Enable RB+ where possible. According to Marek, not enabling it on Stoney has a significant negative performance impact. (And I guess this might impact performance on Raven as well) The register settings are pretty much copied from radeonsi. I did not put this in the pipeline as that would make the pipeline more dependent on the format, which means we would have to have more pipelines for the meta shaders. v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet does already. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 01:19:10 +02:00
Topi Pohjolainen	5d895a1f37	nir: Check if u_vector_init() succeeds However, it only fails when running out of memory. Now, if we are about to check that, we should be consistent and check the allocation of the worklist as well. CID: 1433512 Fixes: `edb18564c7` nir: Initial implementation of a nir_instr_worklist Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Topi Pohjolainen	98d3874754	mesa: Assert base format before truncating to unsigned short CID: 1433709 Fixes: `ca721b3d8`: mesa: use GLenum16 in a few more places Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Topi Pohjolainen	26f48fe010	intel/dev: Assert the number of slices is not zero Fixes: `c1900f5b` intel: devinfo: add helper functions to fill... CID: 1433511 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Kenneth Graunke	8960903c90	i965: Remove brw_bo_alloc_tiled_2d from intel_detect_swizzling. I'd like to drop this pre-isl function. This drops one of the two uses. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-10 15:31:31 -07:00
Timothy Arceri	a05faf80c3	mesa: fix glsl version mismatch in compat profile Drivers that only support compat 3.0 were reporting GLSL 1.40 support. This fixes issues with the menu of Dawn of War II. Fixes: `a0c8b49284` "mesa: enable OpenGL 3.1 with ARB_compatibility" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807	2018-04-11 08:05:19 +10:00
Samuel Pitoiset	0babc8e5d6	radv: fix picking the method for resolve subpass The source and destination image parameters were swapped. No CTS changes on Polaris10, but I suspect this might fix something. Fixes: `2a04f5481d` ("radv/meta: select resolve paths") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Samuel Pitoiset	9f6a28eb27	radv: add shader BOs to the list at pipeline bind time Otherwise, the shader BOs are not added to the list on SI because prefetching isn't supported. Calling radv_cs_add_buffer() in the prefetch codepath was a bad idea. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952 Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Turo Lamminen <turo@alternativegames.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Marek Olšák	e29facff31	ac/surface: don't set the display flag for obviously unsupported cases (v2) This enables the tile swizzle for some cases of the displayable micro mode, and it also fixes an addrlib assertion failure on Vega. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-04-10 13:06:03 -04:00
Marek Olšák	19ce5048ee	radeonsi: add shader binary padding for UMR	2018-04-10 13:05:20 -04:00
Marek Olšák	b64b712558	ac/surface/gfx9: request desired micro tile mode explicitly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-10 12:44:41 -04:00
Emil Velikov	5dd02123a0	docs/release-calendar: update to include 18.1 and 18.2 Dylan has kindly stepped up to help with 18.1.0, while I've taken the liberty to nominate Andres for 18.2.0 ;-) As always, people are welcome to swap/adjust where needed. v2: Add Juan for 18.0.x (Juan) Cc: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 16:08:54 +01:00
Emil Velikov	8eceac9de7	glsl: remove unreachable assert() Earlier commit enforced that we'll bail out if the number of terminators is different than 2. With that in mind, the assert() will never trigger. Fixes: `56b867395d` ("glsl: fix infinite loop caused by bug in loop unrolling pass") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 16:04:50 +01:00
Juan A. Suarez Romero	0d0ef8ae33	spirv: autotools: add vtn_gather_types_c.py in distribution tarball Fixes: `042ee4bea2` ("spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 10:37:46 +02:00
Juan A. Suarez Romero	15ed757834	radeonsi: autotools: add si_build_pm4.h in dist tarball Fixes: `5777488406` ("radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 10:33:28 +02:00
Bas Nieuwenhuizen	4381be4648	ac/nir: Use an array instead of hashtable for SSA defs. Saves about 2% of compile time for F1 2017, as well as reduce code size of an optimized libvulkan_radeon.so by about 1 KiB. This still keeps the hashtable, as we also stored blocks in there. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-10 09:53:16 +02:00
Timothy Arceri	6066f08ee9	st/mesa: finalise tcs/tes/geom NIR before storing it to the cache We don't create variants of the NIR so here we finalise it before caching to avoid unnecessary processing when restoring it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 15:10:16 +10:00
Timothy Arceri	bc71e20993	st/mesa: exit st_translate_fragment_program() earlier for NIR path This avoids a bunch of scanning that is only used by the TGSI path. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 15:10:16 +10:00
Timothy Arceri	494a5c3501	radeonsi/nir: tidy up si_nir_load_sampler_desc() This makes it easier to follow the code, and also initialises dynamic_index which will be useful for adding bindless textures support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	d7cbe795ed	radeonsi/nir: set uses_bindless_images for images V2: add missing intrinsics (Spotted-by: Samuel Pitoiset) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	74b3fc2ce0	nir: dont lower bindless samplers We need to skip the var if it's not a uniform here as well as checking the bindless flag since UBOs can contain bindless samplers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	bd4cc54c8b	st/glsl_to_nir: set paramater value offset as driver location for packed uniforms This allows us to simplify the code and will also be useful for supporting bindless textures. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	222d862cd3	radeonsi/nir: don't add bindless samplers/images to declared bitmasks Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	f33d9036b9	st/mesa: stop calling _mesa_init_shader_object_functions() This sets the LinkShader function for the driver, but for the st we set it properly with the following call to st_init_program_functions(). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Jason Ekstrand	c3f9d5c235	anv/pipeline: Lower more constant initializers earlier Once we've gotten rid of everything but the main entrypoint, there's no reason why we shouldn't go ahead and lower them all. This is what radv does and it will make future work easier. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Jason Ekstrand	14e0a222d9	spirv: Use the LOCAL_GROUP_SIZE system value Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Jason Ekstrand	131d454c35	nir/lower_system_values: Support SYSTEM_VALUE_LOCAL_GROUP_SIZE Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Lionel Landwerlin	f3353e53db	intel: aubinator: print out addresses of invalid instructions Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-10 00:58:38 +01:00
Bas Nieuwenhuizen	41fbcc7901	radv: Always reset draw user SGPRs after secondary command buffer. As we sometimes reset them to -1, -1 does not mean that they are not written by the secondary command buffer. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen	74b0b869dd	radv: Don't set instance count using predication. The packet can sometimes be skipped, but we still think the change takes effect. This just makes the packet always take effect. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:35 +02:00
Rob Clark	d66dc34316	mesa/st/nir: fix instruction removal At one point this kinda worked (or at least didn't cause problems). But with deref-instructions it results in dangling deref instructions not being properly removed. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Rob Clark	becf2d1fac	mesa/st/nir: fix naked lowering pass call Not using the macro means no nir_validate in debug builds, resulting in problems showing up only after later passes. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Rob Clark	c4457113e9	nir: add comment about nir_src_copy() So it is more clear about when to use nir_instr_rewrite_src() Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Nanley Chery	1d94aa1987	i965: Make the miptree clear color setter take a gl_color_union We want to hide the internal details of how the miptree's clear color is calculated. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Nanley Chery	3dbb49a978	i965/miptree: Move the clear color and value setter implementations These will get more complex in later commits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Nanley Chery	1ce7ae391e	i965: Use the brw_context for the clear color and value setters Do what all the other functions in the miptree API do. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Bas Vermeulen	c63bef15fc	radeonsi: convert dispatch packet to little endian The parameters for the compute engine are wrong when using an E8860 on a big endian machine. To fix this, convert the contents of struct dispatch_packet to little endian. This ensures that get_global_id(0) and similar functions in the OpenCL code get the correct endian values, and makes my simple OpenCL program work correctly. Signed-off-by: Bas Vermeulen <bas@daedalean.ai> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-04-09 13:47:52 -04:00
Bas Vermeulen	be628e4749	radeonsi: correct si_vgt_param_key on big endian machines Using mesa OpenCL failed on a big endian PowerPC machine because si_vgt_param_key is using bitfields and a 32 bit int for an index into an array. Fix si_vgt_param_key to work correctly on both little endian and big endian machines. Signed-off-by: Bas Vermeulen <bas@daedalean.ai> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-09 13:42:30 -04:00
Marek Olšák	f33e4482b3	radeonsi: don't set RB+ registers on GFX9 chips without RB+ CLEAR_STATE initializes them properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 13:40:25 -04:00
Emil Velikov	ea2536cd26	etnaviv: meson: add etnaviv_query_pm.[ch] to the sources Otherwise building the driver will fail with unresolved symbols. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960 Fixes: `72d2043be0` ("etnaviv: add perfmon query implementation") Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Clayton Craft <clayton.a.craft@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-09 19:09:24 +02:00
Xiong, James	f23b45dce3	i965: return the fourcc saved in __DRIimage when possible When creating an image from a texture, the image's dri_format is set to the first plane's format, and used to look up the fourcc. e.g. for FOURCC_NV12 texture, the dri_format is set to __DRI_IMAGE_FORMAT_R8, we end up with a wrong entry in function intel_lookup_fourcc(): { __DRI_IMAGE_FOURCC_R8, __DRI_IMAGE_COMPONENTS_R, 1, { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, } }, instead of the correct one: { __DRI_IMAGE_FOURCC_NV12, __DRI_IMAGE_COMPONENTS_Y_UV, 2, { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, { 1, 1, 1, __DRI_IMAGE_FORMAT_GR88, 2 } } }, as a result, a wrong fourcc __DRI_IMAGE_FOURCC_R8 was returned. To fix this bug, the image inherits the texture's planar_format that has the original fourcc; Upon querying, if planar_format is set, return the saved fourcc; Otherwise fall back to the old way. v3: add a bug description and "cc mesa-stable" tag (Jason) remove redundant null pointer check (Tapani) squash 2 patches into one (James) v2: fall back to intel_lookup_fourcc() when planar_format is NULL (Dongwon & Matt Roper) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-09 18:16:59 +03:00
Bastien Orivel	42c2f5b579	nir: Fix a typo in src/compiler/Makefile.nir.am Since `31d91f019b`, the makefile tries to find the file SConstript.spirv instead of SConscript.spirv which breaks the make dist command. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-09 08:32:45 -06:00
Samuel Pitoiset	04e609f1f8	radv: fix prefetching of vertex shader and VBOs on SI Forgot one check... Too many mistakes for a simple change. Fixes: `f1d7c16e85` ("radv: fix prefetching compute shaders on CIK and older chips") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 16:14:12 +02:00
Samuel Pitoiset	56a4d03b0c	radv: implement VK_AMD_shader_core_properties Simple extension that only returns information for AMD hw. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	466aba9fa2	radv: add RADV_NUM_PHYSICAL_VGPRS constant Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	2f7bb93146	radv: add radv_get_num_physical_sgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	b30dec738a	vulkan: Update the XML and headers to 1.1.72 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Andres Gomez	a055f5108d	docs: properly escape characters Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-04-09 13:47:40 +03:00
Andres Gomez	7cf3932098	mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage Fixes: `03fd6704db` ("mesa: Add support for a new override string MESA_GLES_VERSION_OVERRIDE") Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-09 13:47:40 +03:00
Marek Olšák	806ab42c0f	mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override v2: - Provide a correct explanation on the envvars documentation (Ian). - Provide a more correct explanation on the function comments (Andres). v3: - Homogenize documentation and inline comments (Emil). - Correct a typo (Emil). Fixes: `2599b92eb9` ("mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE") Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-09 13:47:40 +03:00
Andres Gomez	c6067fcd07	dri_util: don't fail when not supporting ARB_compatibility with GL3.1 Currently, any driver that does not support the ARB_compatibility extension will fail on GL3.1 context creation if the application does not request the forward-compatibility flag. Restore the original check which changes mesa_api to API_OPENGL_CORE, only when: - GL3.1 is requested, without the forward-compatibility flag. - driver does not support ARB_compatibility - as deduced by max_gl_compat_version. Fixes: `a0c8b49284` ("mesa: enable OpenGL 3.1 with ARB_compatibility") v2: - Improve commit log (Emil). - Provide a correct explanation on the features documentation (Ian). Cc: Marek Olšák <marek.olsak@amd.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-09 13:46:34 +03:00
Andres Gomez	044acd3569	dri_util: when overriding, always reset the core version This way we won't fail when validating just because we may have a non-overridden core version that is lower than the requested one, even when the compat version is high enough. For example, running glcts from VK-GL-CTS with i965, this will succeed: $ MESA_GL_VERSION_OVERRIDE=4.6 ./glcts --deqp-case=KHR-GL46.info.vendor While, this will fail: $ MESA_GL_VERSION_OVERRIDE=4.6COMPAT ./glcts --deqp-case=KHR-GL46.info.vendor Fixes: `464c56d3d5` ("dri_util: Use _mesa_override_gl_version_contextless") Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-09 13:18:16 +03:00
Samuel Pitoiset	b0f8ad189c	radv: add radv_image_is_tc_compat_htile() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:26 +02:00
Samuel Pitoiset	95d5ad80e9	radv: add radv_use_dcc_for_image() helper And add some TODOs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:24 +02:00
Samuel Pitoiset	fab5fe4284	radv: rename radv_image_is_tc_compat_htile() ... to radv_use_tc_compat_htile_for_image(). This function name makes more sense to me because we want to know if and only if TC-compat HTILE should be used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:21 +02:00
Samuel Pitoiset	2692736cee	radv: simplify a check in radv_initialise_color_surface() If the image has FMASK metadata, the number of samples is > 1 because radv_image_can_enable_fmask() handles that already. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:16 +02:00
Samuel Pitoiset	ed41e776d0	radv: clean up radv_vi_dcc_enabled() And rename to radv_dcc_enabled() to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:14 +02:00
Samuel Pitoiset	e213f19907	radv: clean up radv_htile_enabled() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:12 +02:00
Samuel Pitoiset	0fc9113ac5	radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:10 +02:00
Samuel Pitoiset	32f5174ce8	radv: add radv_get_cmask_fast_clear_value() helper DCC for MSAA textures is currently unsupported, but this will be used later on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:08 +02:00
Samuel Pitoiset	f882c62218	radv: add radv_clear_{cmask,dcc} helpers They will help for DCC MSAA textures and if we support mipmaps in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:05 +02:00
Axel Davy	d899826733	st/nine: Do not use scratch for face register Scratch registers are reused every instruction. Since vFace is reused, a new temporary register should be used. Fixes: https://github.com/iXit/Mesa-3D/issues/311 Signed-off-by: Axel Davy <davyaxel0@gmail.com> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-04-08 22:49:43 +02:00
Christian Gmeiner	9e80273693	etnaviv: expose perfmon query groups Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:45 +02:00
Christian Gmeiner	c320b158f5	etnaviv: add query_group_info for perfmon counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:38 +02:00
Christian Gmeiner	5a3b744ed2	etnaviv: assign group_ids to perfmon queries Prep work for AMD_performance_monitor support. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:34 +02:00
Christian Gmeiner	4020fa3e08	etnaviv: support MC performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:40 +02:00
Christian Gmeiner	3c3f936ae1	etnaviv: support TX performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:12 +02:00
Christian Gmeiner	f380ce13f0	etnaviv: support RA performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:04 +02:00
Christian Gmeiner	3af0e228e5	etnaviv: support SE performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:50 +02:00
Christian Gmeiner	9ae86c1306	etnaviv: support PA performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:46 +02:00
Christian Gmeiner	69bebe06e3	etnaviv: support SH performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:42 +02:00
Christian Gmeiner	1f603402f6	etnaviv: support PE performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:37 +02:00
Christian Gmeiner	d0bed0b494	etnaviv: support HI performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:32 +02:00
Christian Gmeiner	72d2043be0	etnaviv: add perfmon query implementation Add needed infrastructure to use performance monitor requests for queries. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:25 +02:00
Christian Gmeiner	7e3dba301e	etnaviv: sw queries: return correct number of groups Fixes: `3d912bd742` ("etnaviv: add query_group_info for sw counters") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:13:04 +02:00
Lucas Stach	208891650b	etnaviv: advertise YUV formats as external only We only support importing YUV as OES external resources. This will change in the future, but for now this fixes the advertised capabilities in eglQueryDmaBufModifiersEXT. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:11:46 +02:00
Lucas Stach	dfe4a08ccd	gallium/util: implement util_format_is_yuv This adds a helper to check if a pipe format is in YUV color space. Drivers want to know about this, as YUV mostly needs special handling. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:10:57 +02:00
Rhys Perry	19254a977b	nvc0: finish implementation of PIPE_QUERY_SO_OVERFLOW_PREDICATE This also removes some useless code leftover from old changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rhys Perry	14cc8c55ea	nvc0: change ACQUIRE_EQUAL to ACQUIRE_GEQUAL in nvc0_hw_query_fifo_wait If a fence is created in between nvc0_hw_end_query and nvc0_hw_query_fifo_wait, the sequence number in nvc0->screen->fence.bo can be larger than hq->fence->sequence before the semaphore is created, resulting in the semaphore never being triggered. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rhys Perry	98d15e0550	nvc0: ensure the query's fence has been emitted in nvc0_hw_query_fifo_wait If the fence has not been emitted, hq->fence->sequence would be zero. This would result in the semaphore never being triggered, blocking all later commands in the pushbuf. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> [imirkin: use nouveau_fence_emit instead] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Ilia Mirkin	90bb2d7152	st/mesa: tex offsets can't be in a const or 2d-indexed All consts are now implicitly 2d (they set .Dimension), so trigger asserts. Also, the texture offset can't handle any sort of 2d indexing. While this could be tacked on, this seems unnecessary, just move it off into a separate temp. Fixes assertion failure in tests/spec/arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag Note that this was an issue even before the const-always-2d thing, since there was no detection of when even a proper second dimension was used, e.g. for UBO or geom/tess inputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-07 16:45:00 -04:00
Ilia Mirkin	2a2b22e9b1	nvc0: restore image binding on RGB10A2, remove from BGR10A2 Fixes a bunch of new CTS pbo tests that use those as an output format, which the state tracker converts into buffer image writes. No part of the driver is ready for BGR10A2. It could probably be enabled on Maxwell+, but seems unnecessary. This error was introduced when flipping the displayable bit on those formats, which accidentally also moved the image bit. Fixes: `e1a70aed10` (nv50,nvc0: mark ABGR format as displayable instead of ARGB format) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rob Clark	684f7cd7e3	freedreno/ir3: use lower_global_vars_to_local in cmdline compiler tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does lower_locals_to_regs. But nothing was lowering global to local, which breaks compiling tgsi shaders Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-07 11:33:41 -04:00
Kenneth Graunke	a3782a612f	i965: Use %x instead of %u in debug print. I mistakenly printed out the address as 0x<decimal number> instead of printing a proper hex number. This was...surprising.	2018-04-06 22:57:48 -07:00
Dylan Baker	b5f92b6fd4	meson: fix warnings about comparing unlike types In the old days (0.42.x), when mesa's meson system was written, the recommendation for handling conditional dependencies was to define them as empty lists. When meson would evaluate the dependencies of a target it would recursively flatten all of the arguments, and empty lists would be removed. There are some problems with this, among them that lists and dependencies have different methods (namely .found()), so the recommendation changed to use `dependency('', required : false)` for such cases. This has the advantage of providing a .found() method, so there is no need to do things like `dep_foo != [] and dep_foo.found()`, such a dependency should never exist. I've tested this with 0.42 (the minimum we claim to support) and 0.45. On 0.45 this removes warnings about comparing unlike types, such as: meson.build:1337: WARNING: Trying to compare values of different types (DependencyHolder, list) using !=. v2: - Use dependency('', required : false) instead of declare_dependency(), the latter will always report that it is found, which is not what we want. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-06 15:29:53 -07:00
Ian Romanick	81ed629b38	intel/compiler: Explicitly cast register type in switch brw_reg::type is "enum brw_reg_type type:4". For whatever reason, GCC is treating this as an int instead of an enum. As a result, it doesn't detect missing switch cases and it doesn't detect that flow can get out of the switch. This silences the warning: src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg, const brw_reg)’: src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-06 15:22:10 -07:00
Axel Davy	39240926cd	st/nine: Declare lighting consts for ff shaders The lighting constants were not declared previously, but were accessed with indirect addressing, which is illegal. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105442 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-04-06 23:34:31 +02:00
Caio Marcelo de Oliveira Filho	67c728f7a9	nir: rename variables in nir_lower_io_to_temporaries for clarity In the emit_copies() function, the use of "newv" and "temp" names made sense when only copies from temporaries to the new variables were being done. But now there are other calls to copy with other pairings, and "temp" doesn't always refer to a temporary created in this pass. Use the names "dest" and "src" instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-06 11:08:08 -07:00
Samuel Pitoiset	8f9f62c2db	radv: don't pass the pipeline to radv_flush_constants() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:27 +02:00
Samuel Pitoiset	2bd50cceff	radv: rename radv_cmd_buffer_update_vertex_descriptors() ... to radv_flush_vertex_descriptors(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-06 19:46:23 +02:00
Samuel Pitoiset	e829a0cc1e	radv: do not try to skip draw calls when VBOs upload failed This is unnecessary because we record an error which should be returned by vkEndCommandBuffer(), and the app shouldn't submit a command buffer when this happens. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:21 +02:00
Samuel Pitoiset	f1d7c16e85	radv: fix prefetching compute shaders on CIK and older chips Because the check was moved to radv_emit_prefetch_L2(). Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:18 +02:00
Samuel Pitoiset	7fe586f6fb	radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries This is unnecessary when the precision bit flag is not set, and this might hurt performance. The Vulkan spec explains that not setting VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some implementations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:34 +02:00
Samuel Pitoiset	d53dff3bfc	radv: enable the Polaris small primitive filter control Enable it directly in the preamble, but do not enable line on Polaris10/11/12 because there is a hw bug. There is possibly an issue when MSAA is off, but this doesn't regress any CTS and AMDVLK doesn't have a workaround as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:31 +02:00
Jason Ekstrand	c5b87c94d8	anv: Add WSI support for the I915_FORMAT_MOD_Y_TILED_CCS v2 (Jason Ekstrand): - Return the correct enum values from anv_layout_to_fast_clear_type v3 (Jason Ekstrand): - Always return ANV_FAST_CLEAR_NONE and leave doing the right thing for the patch which adds a modifier which supports fast-clears. Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Daniel Stone <daniels@collabora.com> Acked-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-05 21:17:02 -07:00
Anuj Phogat	ff8b82666a	Add more Coffee Lake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-05 14:50:11 -07:00
Jan Vesely	2406e8848e	radeonsi: Reorder checks in si_check_render_feedback si_get_total_colormask accesses NULL pointer on compute shaders Fixes crashes on clover Fixes: `0669dca9c0` ("radeonsi: skip DCC render feedback checking if color writes are disabled") CC: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-05 17:11:18 -04:00
Kevin Rogovin	cc41603d6d	intel/tools: new intel_sanitize_gpu tool Adds a new debug tool to pad each GEM BO allocated with (weak) pseudo-random noise values which are then checked after each batchbuffer dispatch to the kernel. This can be quite valuable to find difficult to track down heisenberg style bugs. [scott.d.phillips@intel.com: split to separate tool] v2: (by Scott D Phillips) - track gem handles per fd (Kevin) - remove handles on GEM_CLOSE (Kevin) - ignore prime handles - meson & shell script v3: (by Scott D Phillips) - don't track prime bos at all (Kevin) - protect the hash table with a mutex (Kevin) - hook fds by drm_version.name, not path (Chris Wilson) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-05 13:52:49 -07:00
Jason Ekstrand	e85b95269e	prog/nir: Simplify some load/store operations Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-05 13:20:39 -07:00
Marek Olšák	c7dd59b06d	radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask	2018-04-05 15:53:52 -04:00
Marek Olšák	be4250aa88	radeonsi: remove more R600 references Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c0dfc0c6df	radeonsi: try to fix android Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f55d1f806e	radeonsi: try to fix meson This is not fully tested. Meson can't link LLVM even though automake can. PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \ -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \ -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \ -Dtexture-float=true -Dvulkan-drivers= src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o): (.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10): undefined reference to `typeinfo for llvm::RTDyldMemoryManager' Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	38faac43e3	radeonsi: don't build libradeon.la separately for better parallelism Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f9323ddbb9	radeonsi: clean up GET_MAX_VIEWPORT_RANGE definition Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	6a93441295	radeonsi: remove r600_common_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f77361d2e	radeonsi: remove r600_pipe_common::screen Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	321bd6c280	radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	d58080b318	radeonsi: move r600_gpu_load.c to si_gpu_load.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f7f4ba5306	radeonsi: move r600_query.c/h files to si_query.c/h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5777488406	radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	eced536ed6	radeonsi: rename query definitions R600_ -> SI_ Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72e9e98076	radeonsi: move and rename R600_ERR out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	076afb4f0e	radeonsi: rename a few R600/r600_ -> SI_/si_ Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f1cddde78	radeonsi: move definitions out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	a67ee02388	radeonsi: move functions out of and remove r600_pipe_common.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	90d12f1d77	radeonsi: rename r600 -> si in some places Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	50c7aa6756	radeonsi: use si_context instead of pipe_context in parameters pt3 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e332ba61f4	radeonsi: use si_context instead of pipe_context in parameters pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c424f86180	radeonsi: use si_context instead of pipe_context in parameters pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	2a62e5eec9	radeonsi: pass sctx to si_rebind_buffer and clean up Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	605ba1b9ae	radeonsi: use r600_common_context less pt7 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0b2f2a6a18	radeonsi: use r600_common_context less pt6 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4c5efc40f4	radeonsi: update copyrights Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	95bc30275b	radeonsi: switch radeon_add_to_buffer_list parameter to si_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e5053060eb	radeonsi: use r600_common_context less pt5 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	884fd97f6b	radeonsi: use r600_common_context less pt4 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	a8291a23c5	radeonsi: use r600_common_context less pt3 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3069cb8b78	radeonsi: use r600_common_context less pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	71d9028b7a	radeonsi: use r600_common_context less pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0606190059	radeonsi: don't use r600_common_context in si_emit_cache_flush Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3de323f9bb	radeonsi: switch r600_atom::emit parameter to si_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	2b70dd8c8a	radeonsi: flatten / remove struct r600_ring Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f7de8686de	radeonsi: remove r600_ring::flush callback Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4598ad6a00	radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	426ef367f3	radeonsi: add_to_buffer_list functions can return void Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c0987d8adf	radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	37ef4765ff	radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	19f550f1d2	radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fc6a44e169	radeonsi: rename si_hw_context.c -> si_gfx_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	42500d1dab	radeonsi: move si_destroy_saved_cs to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	02a61e71a2	radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fa09388704	radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	85e75b2da5	radeonsi: remove r600_pipe_common::blit_decompress_depth Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e04389cc2a	radeonsi: remove r600_pipe_common::decompress_dcc Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	9d7f809c03	radeonsi: remove r600_pipe_common::invalidate_buffer Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	898500c440	radeonsi: remove r600_pipe_common::rebind_buffer Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fbf1bf9b8f	radeonsi: remove r600_common_context::set_occlusion_query_state and remove unused old_enable parameter. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5ed8b54ffe	radeonsi: remove r600_pipe_common::save_qbo_state Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72842d15ac	radeonsi: remove unused query code The get_size perf counter callback is also inlined and removed. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3f55fe99d6	radeonsi: use num_cs_dw_queries_suspend Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	54f28359b5	radeonsi: remove r600_pipe_common::need_gfx_cs_space Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0447e8e59e	radeonsi: remove r600_pipe_common::set_atom_dirty Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5c125ab1ba	radeonsi: remove r600_pipe_common::check_vm_faults Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	17e8f1608e	radeonsi: call CS flush functions directly whenever possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0669dca9c0	radeonsi: skip DCC render feedback checking if color writes are disabled	2018-04-05 15:34:58 -04:00
Dylan Baker	6ac87c1769	meson: fix megadriver symlinking Which should be relative instead of absolute. Fixes: `f7f1b30f81` ("meson: extend install_megadrivers script to handle symmlinking") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-05 10:48:38 -07:00
Dylan Baker	19dbed6477	meson: Set .so version for xa like autotools does Fixes: `0ba909f0f1` ("meson: build gallium xa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-05 10:46:14 -07:00
Rafael Antognolli	7728720f07	anv: Make blorp update the clear color. Instead of updating the clear color in anv before a resolve, just let blorp handle that for us during fast clears. v5: Update comment about HiZ clear color (Jordan). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	e8cadb673d	anv: Use clear address for HiZ fast clears too. Store the default clear address for HiZ fast clears on a global bo, and point to it when needed. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	021e1885d0	anv: Emit the fast clear color address, instead of value. On Gen10+, instead of copying the clear color from the state buffer to the surface state, just use the address of the state buffer in the surface state directly. This way we can avoid the copy from state buffer to surface state. v4: - Remove use_clear_address from anv code. (Jason) - Use the helper to extract clear color from attachment (Jason) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	3f96b459f4	anv: Add a helper to extract clear color from the attachment. Extract the code from color_attachment_compute_aux_usage, so we can later reuse it to update the clear color state buffer. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	7987d041fd	i965/surface_state: Emit the clear color address instead of value. On Gen10, when emitting the surface state, use the value stored in the clear color entry buffer by using a clear color address in the surface state. v4: Use the clear color offset from the clear_color_bo, when available. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	2efe8309d3	i965/blorp: Update the fast clear value buffer. On Gen10, whenever we do a fast clear, blorp will update the clear color state buffer for us, as long as we set the clear color address correctly. However, on a hiz clear, if the surface is already on the fast clear state we skip the actual fast clear operation and, before gen10, only updated the miptree. On gen10+ we need to update the clear value state buffer too, since blorp will not be doing a fast clear and updating it for us. v4: - do not use clear_value_size in the for loop - Get the address of the clear color from the aux buffer or the clear_color_bo, depending on which one is available. - let core blorp update the clear color, but also update it when we skip a fast clear depth. v5: Better subject (Jordan). v6: Remove outdated comment (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	5449f942f2	i965: Add aux_buf variable to simplify code. In a follow up patch, we make use of clear_color_bo, which is in mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the same thing on both aux buffers, just use aux_buf already. v5: Add aux_buf to brw_wm_surface_state too. v6: Drop aux_surf and use aux_buf->surf instead (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	8735c86ce0	i965/miptree: Add new clear color BO for winsys aux buffers Add an extra BO to store clear color when we receive the aux buffer from the window system. Since we have no control over the aux buffer size in this case, we need the new BO to store only the clear color. v5: - Better subject (Jordan). - Drop alignment from brw_bo_alloc(). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	ab633c2d61	i965/miptree: Add space to store the clear value in the aux surface. Similarly to vulkan where we store the clear value in the aux surface, we can do the same in GL. v2: Remove unneeded extra function. v3: Use clear_value_state_size instead of clear_value_size. v4: - rename to clear_color_state_size - store clear_color_bo and clear_color_offset in the aux buf struct v5: Unreference clear color bo (Jordan) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	14260e7c60	intel/blorp: Update clear color state buffer during fast clears. We always want to update the fast clear color during a fast clear on i965. On anv, we are doing that before a resolve, but by adding support to blorp, we can do a similar thing and update it during a fast clear instead. The goal is to remove some code from anv that does such update, and centralize everything in blorp, hopefully removing a lot of code duplication. It also allows us to have a similar behavior on gen < 9 and gen >= 10. v5: s/we/we are/ (Jordan) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	92eb5bbc68	intel/blorp: Only copy clear color when doing a resolve. We only need to copy the clear color from the state buffer to the inlined surface state when doing a resolve. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	188a473b9a	intel/blorp: Add support for fast clear address. On gen10+, if surface->clear_color_addr is present, use it directly instead of copying it to the surface state. v4: Remove redundant #if clause for GEN <= 10 (Jason) v5: Move flush after the reloc, and keep lower bits (Topi). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	b8f45cf967	intel/isl: Add support to emit clear value address. gen10 can emit the clear color by setting it on a buffer somewhere, and then adding only the address to the surface state. This commit adds support for that on isl_surf_fill_state, and if that is requested, skips setting the clear value itself. v2: Add assert to make sure we are at least on gen10. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	94675edcfd	intel: Use Clear Color struct size. The size of the clear color struct (expected by the hardware) is 8 dwords (isl_dev.ss.clear_value_state_size here). But we still need to track the size of the clear color, used when memcopying it to/from the state buffer. For that we keep isl_dev.ss.clear_value_size. v4: - Add struct to gen11 too (Jason, Jordan) - Add field for Converted Clear Color to gen11 (Jason) - Add clear_color_state_offset to differentiate from clear_value_offset. - Fix all the places where clear_value_size was used. v5 (Jason): - Split genxml changes to another commit. - Remove unnecessary gen checks. - Bring back missing offset increment to init_fast_clear_color(). v6 (Jason): - On init_fast_clear_color, change: addr.offset += 4 => sdi.Address.offset += i * 4 - Use GEN_GEN instead of GEN_VERSIONx10. [jordan.l.justen@intel.com: isl_device_init changes] Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	f77789a3f0	intel/genxml: Add Clear Color struct to gen10+. v5: Split genxml changes into its own commit (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	7e616ae201	intel/genxml: Use a single field for clear color address on gen10. genxml does not support having two address fields with different names but the same position in the state struct. Both "Clear Color Address" and "Clear Depth Address Low" mean the same thing, only for different surface types. To work around this genxml limitation, rename "Clear Color Address" to "Clear Value Address" and use it for both color and depth. Do the same for the high bits. TODO: add support for multiple addresses at the same position in the xml. v2: Combine high and low order bits into a single address field. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	8e1f2e1d2d	genxml: Preserve fields that share dword space with addresses. Some instructions contain fields that are either an address or a value of some type based on the content of other fields, such as clear color values vs address. That works fine if these fields are in the less significant dword, the lower 32 bits of the address, because they get OR'ed with the address. But if they are in the higher 32 bits, they get discarded. On Gen10 we have fields that share space with the higher 16 bits of the address too. This commit makes sure those fields don't get discarded. v5: Remove spurious whitespace (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
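The problem described in `8e1f2e1d2d` can be sketched in C. This is only an illustration under stated assumptions: the function names and the two-dword layout are hypothetical and are not taken from the genxml-generated packing code. The first function drops any field bits that live in the upper dword; the second ORs the fields into the full 64-bit value before splitting it.

```c
#include <stdint.h>

/* Hypothetical packer showing the old behaviour: only the low dword of
 * `fields` is OR'ed with the address, so bits in the high dword are lost. */
static void pack_discarding(uint32_t dw[2], uint64_t addr, uint64_t fields)
{
    dw[0] = (uint32_t)addr | (uint32_t)fields;
    dw[1] = (uint32_t)(addr >> 32);
}

/* Hypothetical packer showing the intended behaviour: the fields are OR'ed
 * into the whole 64-bit quantity, so bits sharing the high dword survive. */
static void pack_preserving(uint32_t dw[2], uint64_t addr, uint64_t fields)
{
    uint64_t qw = addr | fields;

    dw[0] = (uint32_t)qw;
    dw[1] = (uint32_t)(qw >> 32);
}
```

With a 48-bit address, the top 16 bits of the second dword are free for other fields, which is the case the commit message mentions for Gen10.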
Rafael Antognolli	f421a31637	anv/image: Do not override lower bits of dword. The lower bits seem to have extra fields in every platform but gen8 (even though we don't use them in gen9). So just go ahead and avoid using them for the address. v4: Use Jason's suggestion for comment explaining the change. v5: Fix aux_address comment in anv_private.h (Jason) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-05 07:42:45 -07:00
Samuel Pitoiset	942fdfe357	radv: implement a fast prefetch path for the vertex stage This allows draws to start as soon as possible. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:48 +02:00
Samuel Pitoiset	4ad7595f35	radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:45 +02:00
Samuel Pitoiset	a8a696a38f	radv: use a mask for VBOs and shaders prefetching Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:42 +02:00
Marek Olšák	8cd58df2f2	gallium/pp: fix MLAA shaders Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549	2018-04-04 20:01:43 -04:00
Marek Olšák	096942be2c	gallium/pp: use user constant buffers This fixes a radeonsi crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026	2018-04-04 20:01:43 -04:00
Marek Olšák	d9dc26c94e	st/mesa: set stencil border color the same as intensity This fixes some stencil border color tests on Vega and Raven chips. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-04 16:55:52 -04:00
Jon Turney	498d9d0f4d	Fix use of alloca() without #include <c99_alloca.h> Fix use of alloca() without #include <c99_alloca.h> in `1da345e5` vbo/vbo_context.c: In function '_vbo_draw_indirect': vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration] struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim)); ^~~~~~ vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion] Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-04-04 14:34:07 +01:00
Samuel Pitoiset	922cd38172	radv: implement out-of-order rasterization when it's safe on VI+ Disabled by default for now, it can be enabled with RADV_PERFTEST=outoforder. No CTS regressions on Polaris, and all Vulkan games I tested look good as well. Expect small performance improvements for applications where out-of-order rasterization can be enabled by the driver. Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	d6709c91a6	radv: change blend_enable field to use four bits per CB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	a8818d1af2	radv: scan which color blend attachments are enabled With cb_target_enabled_4bit in order to have four bits per CB. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ac456d0d1b	radv: put more fields in radv_blend_state Some will be used for further optimizations (ie. out-of-order rast). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	e4976ca33b	radv: do not always disable dual quad mode when chip has RbPlus For GFX9+ only, RadeonSI does this too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	b8c06a961c	radv: don't use the SPI barrier management bug workaround Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ab147cba77	radv: mask out high VM address bits in registers where needed Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Lionel Landwerlin	1beb80cb56	intel: compiler: silence compiler warning ../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg, const brw_reg)’: ../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type] Introduced by `8f83eea71e` ("i965: Add negative_equals methods"). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-04 11:57:39 +01:00
Iago Toral Quiroga	41ac0b1443	compiler/spirv: set is_shadow for depth comparator sampling opcodes From the SPIR-V spec, OpTypeImage: "Depth is whether or not this image is a depth image. (Note that whether or not depth comparisons are actually done is a property of the sampling opcode, not of this type declaration.)" The sampling opcodes that specify depth comparisons are OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set is_shadow only for these (we were using the depth property of the image until now). v2: - Do the same for OpImageDrefGather. - Set is_shadow to false if the sampling opcode is not one of these (Jason) - Reuse an existing switch statement instead of adding a new one (Jason) Fixes crashes in: dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-04-04 07:57:58 +02:00
Sergii Romantsov	98b860e311	i965: Extend the negative 32-bit deltas to 64-bits Gen8+ uses 48-bit address relocations, so we need to extend the sign to the 64-bit return value. Without it, the higher bits are zeroed and the negative values are lost. Haswell and older use 32-bit deltas so are unaffected by this issue. v2: used int32_t function parameter instead of explicit type conversion. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Andriy Khulap <andriy.khulap@globallogic.com> Tested-by: Stuart Young <cefiar@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>	2018-04-03 22:48:09 -07:00
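The sign-extension issue in `98b860e311` can be illustrated with a short C sketch. The helper names are hypothetical and are not the i965 functions; the sketch only assumes that a possibly negative 32-bit delta is added to a 64-bit address. Taking the delta as `int32_t` lets the conversion to 64 bits carry the sign, whereas an unsigned 32-bit delta is zero-extended.

```c
#include <stdint.h>

/* Hypothetical helper: the delta arrives as an unsigned 32-bit value, so a
 * negative delta is zero-extended and its upper 32 bits become zero. */
static uint64_t reloc_zero_extended(uint64_t base, uint32_t delta)
{
    return base + delta;
}

/* Hypothetical helper: an int32_t parameter is sign-extended to 64 bits
 * before the addition, so negative deltas move the address backwards. */
static uint64_t reloc_sign_extended(uint64_t base, int32_t delta)
{
    return base + (uint64_t)(int64_t)delta;
}
```

For a base of `0x100000000` and a delta of -16, the first helper lands almost 4 GiB above the base, while the second yields `0xfffffff0`.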
Jason Ekstrand	800df942ea	nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination Otherwise we may end up trying to coalesce in a case such as ssa_1 = fadd r1, r2 r3.x = fneg(r2); r3 = vec4(ssa_1, ssa_1.y, ...) and that would cause us to move the writes to r3 from the vec to the fadd which would re-order them with respect to the write from the fneg. In order to solve this, we just don't coalesce if the destination of the vec is not SSA. We could try to get clever and still coalesce if there are no writes to the destination of the vec between the vec and the ALU source. However, since registers only come from phi webs and indirects, the chances of having a vec with a register destination that is actually coalescable into its source is very slim. Shader-db results on Haswell: total instructions in shared programs: 13657906 -> 13659101 (<.01%) instructions in affected programs: 149291 -> 150486 (0.80%) helped: 0 HURT: 592 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440 Fixes: `2458ea95c5` "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible" Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-03 22:21:23 -07:00
Kevin Strasser	5bbde9b80f	anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel will hit a -EBADF error. Move the close(fd) call to the end of anv_CreateDmaBufImageINTEL(). Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-03 18:33:17 -07:00
Timothy Arceri	b42633db8e	glsl: always call do_lower_jumps() after loop unrolling This fixes a bug in radeonsi where LLVM cannot handle the case where a break exists but it's not the last instruction in the block. LLVM would fail with: Terminator found in the middle of a basic block! LLVM ERROR: Broken function found, compilation aborted! Fixes: `96fe8834f5` "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively" Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317	2018-04-04 08:40:16 +10:00
James Legg	a58fdc61e9	vulkan/wsi/wayland: fix leaks Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Reviewed-by: Daniel Stone <daniels@collabora.com> CC: Jason Ekstrand <jason@jlekstrand.net>	2018-04-03 22:09:57 +01:00
Juan A. Suarez Romero	06076ead28	docs: update calendar, add news and link release notes to 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-03 17:38:36 +00:00
Juan A. Suarez Romero	ca71b7bab8	docs: add sha256 checksums for 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ba371c7262`)	2018-04-03 17:34:16 +00:00
Juan A. Suarez Romero	d89ef8ce62	docs: add release notes for 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3bf5c10c5c`)	2018-04-03 17:34:16 +00:00
Jakob Bornecrantz	88e958257c	st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB. When running virgl on a GLES host the only sRGB formats that support rendering are RGBA and RGBX. That pipe format is in the sRGB default lists that the state tracker uses when mapping mesa formats. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-04-03 17:48:52 +01:00
Lionel Landwerlin	78c18d99dc	intel: gen-decoder: print all dword a field belongs to Prior to printing a decoded field, print out all dwords that field belongs to. In particular with address fields spanning multiple dwords, we want to have all the dwords presented before the field is decoded to make it easier to read. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	4d59127213	intel: genxml: decode variable length MI_LRI MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one command. In our drivers we only use one tuple at a time, but the kernel might load more than one at a time. Instead of making all the tuples part of a group, we leave out the first tuple (the one we use in the generated packing structures). This is particularly useful for looking at error stats generated by the kernel. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	2841af6238	intel: gen-decoder: don't decode fields beyond a dword length For example, a PIPE_CONTROL with DWordLength = 2 should look like this : 0xffffe374: 0x7a000002: PIPE_CONTROL 0xffffe374: 0x7a000002 : Dword 0 DWord Length: 2 0xffffe378: 0x00800000 : Dword 1 Depth Cache Flush Enable: false Stall At Pixel Scoreboard: false State Cache Invalidation Enable: false Constant Cache Invalidation Enable: false VF Cache Invalidation Enable: false DC Flush Enable: false Pipe Control Flush Enable: false Notify Enable: false Indirect State Pointers Disable: false Texture Cache Invalidation Enable: false Instruction Cache Invalidate Enable: false Render Target Cache Flush Enable: false Depth Stall Enable: false Post Sync Operation: 0 (No Write) Generic Media State Clear: false TLB Invalidate: false Global Snapshot Count Reset: false Command Streamer Stall Enable: false Store Data Index: 0 LRI Post Sync Operation: 1 (MMIO Write Immediate Data) Destination Address Type: 0 (PPGTT) Flush LLC: false 0xffffe37c: 0x00000000 : Dword 2 Address: 0x00000000 0xffffe384: 0x05000000: MI_BATCH_BUFFER_END Prior to this change, fields beyond the length of the command would be decoded (notice the MI_BATCH_BUFFER_END decoded as part of the previous PIPE_CONTROL) : 0xffffe374: 0x7a000002: PIPE_CONTROL 0xffffe374: 0x7a000002 : Dword 0 DWord Length: 2 0xffffe378: 0x00800000 : Dword 1 Depth Cache Flush Enable: false Stall At Pixel Scoreboard: false State Cache Invalidation Enable: false Constant Cache Invalidation Enable: false VF Cache Invalidation Enable: false DC Flush Enable: false Pipe Control Flush Enable: false Notify Enable: false Indirect State Pointers Disable: false Texture Cache Invalidation Enable: false Instruction Cache Invalidate Enable: false Render Target Cache Flush Enable: false Depth Stall Enable: false Post Sync Operation: 0 (No Write) Generic Media State Clear: false TLB Invalidate: false Global Snapshot Count Reset: false Command Streamer Stall Enable: false Store Data Index: 0 LRI Post Sync Operation: 1 (MMIO Write Immediate Data) Destination Address Type: 0 (PPGTT) Flush LLC: false 0xffffe37c: 0x00000000 : Dword 2 Address: 0x00000000 0xffffe380: 0x00000000 : Dword 3 0xffffe384: 0x05000000 : Dword 4 Immediate Data: 83886080 0xffffe384: 0x05000000: MI_BATCH_BUFFER_END Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	81375516b2	intel: error_decode: add an option to decode all buffers The kernel reports workaround batch buffers, but we're not presenting them currently. Although they might not be useful for debugging purely userspace driver issues, when problems arise because of interactions between kernel & userspace drivers, it's nice to be able to decode them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	b3aa18dfd6	intel: genxml: add preemption control instructions Helpful to debug kernel workaround batchbuffers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Dylan Baker	6f6e711c72	mesa: ensure that variable is initialized This variable controls whether we link using the glsl code path or the spirv path. It's set when we validate that all shaders are glsl or spirv, but if there are no shaders attached to the program it will remain unset, resulting in undefined behavior. We want to go down the glsl path in that case, so initialize to false. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820 Fixes: `16f6634e7f` ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-04-03 08:47:59 -07:00
Marek Olšák	d3e96b1063	radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-03 11:07:28 -04:00
Samuel Pitoiset	acf60abc54	radv: enable VK_EXT_shader_viewport_index_layer The driver already supports exporting the Layer and ViewportIndex built-ins from vertex or tessellation shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-03 14:05:46 +02:00
Rob Clark	51888bf07d	nir+drivers: add helpers to get # of src/dest components Add helpers to get the number of src/dest components for an intrinsic, and update spots that were open-coding this logic to use the helpers instead. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-03 06:08:56 -04:00
Rob Clark	91f9450b32	freedreno/ir3: fix fallout of unused false-depth elimination Since we were using the MARK flag both for preventing loops and for tracking whether instructions were used, we could end up in an infinite loop due to `bd2ca2bcdd`. Instead, invert the logic: mark all instructions UNUSED up front and clear the flag as we visit them. Fixes: `bd2ca2bcdd` freedreno/ir3: eliminate unused false-deps Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-03 06:08:56 -04:00
Timothy Arceri	7e9b7ec094	gallium/pipebuffer: fix parenthesis location Without this the return value will never get set to -1. This was first added in `49866c8f34` and copied in `2b396eeed9`. Fixes: `2b396eeed9` "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342	2018-04-03 16:05:59 +10:00
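The kind of mistake fixed in `7e9b7ec094` can be shown with a small C sketch. The functions below are hypothetical stand-ins, not the pipebuffer code; the sketch only assumes a check that may return 1, 0 or -1 and an assignment inside an `if` condition. When the closing parenthesis sits after the comparison, the variable receives the comparison result (0 or 1) instead of the function's return value.

```c
/* Hypothetical stand-in for a check that returns 1, 0 or -1. */
static int check(int v)
{
    return v;
}

/* Misplaced parenthesis: `check(v) > 0` is evaluated first, so ret can
 * only ever be 0 or 1 and a -1 from check() is never stored. */
static int misplaced(int v)
{
    int ret;

    if ((ret = check(v) > 0))
        return ret;
    return ret;
}

/* Corrected form: the assignment is parenthesised on its own, so ret
 * keeps the value returned by check(), including -1. */
static int corrected(int v)
{
    int ret;

    if ((ret = check(v)) > 0)
        return ret;
    return ret;
}
```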
Tapani Pälli	6b21391729	Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels" This reverts commit `41cf30b8bc`. Commit caused regressions with KHR-GLES3.packed_pixels.* tests. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Eric Anholt <eric@anholt.net>	2018-04-03 08:43:30 +03:00
Mike Lothian	0bdbe4583f	gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Mike Lothian	5e07881305	radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Mike Lothian	7e144ace95	ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Daniel Stone	4cbecb6168	st/dri: Initialise modifier to INVALID for DRI2 When allocating a buffer for DRI2, set the modifier to INVALID to inform the backend that we have no supplied modifiers and it should do its own thing. The missed initialisation forced linear, even if the implementation had made other decisions. This resulted in VC4 DRI2 clients failing with: Modifier 0x0 vs. tiling (0x700000000000001) mismatch Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Andreas Müller <schnitzeltony@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: `3f8513172f` ("gallium/winsys/drm: introduce modifier field to winsys_handle")	2018-04-02 19:07:57 +01:00
Marek Olšák	2be6143032	radeonsi: implement GL_KHR_blend_equation_advanced MSAA is supported using sample shading. Layered rendering and all texture targets are also supported. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:25 -04:00
Marek Olšák	e04631b0f2	radeonsi: rename unpack_param -> si_unpack_param Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:23 -04:00
Marek Olšák	dc04e4bba2	radeonsi: move FMASK shader logic to shared code We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:22 -04:00
Marek Olšák	eb77961292	radeonsi: add R600_DEBUG=nofmask to disable MSAA compression For testing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:20 -04:00
Marek Olšák	56342c97ee	gallium/u_tests: test FBFETCH and shader-based blending with MSAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:18 -04:00
Marek Olšák	5d91c2ccea	ac/gpu_info: print GB_ADDR_CONFIG	2018-04-02 13:10:37 -04:00
Marek Olšák	b1f33086ec	ac/gpu_info: reorder the fields and print them nicely	2018-04-02 13:10:37 -04:00
Marek Olšák	a0a96819e1	ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory	2018-04-02 13:10:37 -04:00
Marek Olšák	32b3932de1	ac/gpu_info: don't print irrelevant fields	2018-04-02 13:10:37 -04:00
Marek Olšák	f754217517	st/mesa: don't draw if the bound element array buffer is not allocated Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:10:36 -04:00
Iago Toral Quiroga	31881079af	anv/cmd_buffer: honor pending clear views for depth/stencil attachments v2: rebased on top of subpass rework. v3: rebased v4: - rebased - reset pending clear views in one go rather than one bit at a time (Caio) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-02 09:53:24 +02:00
Iago Toral Quiroga	f60c5fc17e	anv/cmd_buffer: consider multiview masks for tracking pending clear aspects When multiview is active a subpass clear may only clear a subset of the attachment layers. Other subpasses in the same render pass may clear too and we want to honor those clears as well, however, we need to ensure that we only clear a layer once, on the first subpass that uses a particular layer (view) of a given attachment. This means that when we check if a subpass attachment needs to be cleared we need to check if all the layers used by that subpass (as indicated by its view_mask) have already been cleared in previous subpasses or not, in which case, we must clear any pending layers used by the subpass, and only those pending. v2: - track pending clear views in the attachment state (Jason) - rebased on top of fast-clear rework. v3: - rebased on top of subpass rework. v4: rebased. v5 (Caio): - Rebased. - Initialize pending clear views to only have bits set for layers that exist. - Reset pending clear views in one go rather than one bit at a time. - Put "last subpass for this attachment" condition in a separate function to simplify the conditional that resets pending_clear_aspects. Fixes: dEQP-VK.multiview.readback_implicit_clear.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-02 09:53:15 +02:00
Timothy Arceri	c88e7fe29e	radeonsi/nir: fix explicit component packing for geom/tess doubles Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	dd3d3cc877	radeonsi/nir: gather buffers declared more accurately and use const fast path For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip setting the more accurate masks for builtin uniforms for now as it causes some piglit regressions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	56017d8100	radeonsi: create load_const_buffer_desc_fast_path() helper This will be shared by the TGSI and NIR backends. For simplicity we leave the workaround for SI with LLVM 5.0 and lower only in the TGSI backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	7aad5e15f6	radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	2ca5d9548f	st/glsl_to_nir: gather next_stage in shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Rob Clark	2f175bfe5d	freedreno/a5xx: don't align height for PIPE_BUFFER Buffers can be large, so we probably don't want to make them all 32x bigger. But they can't be rendered to (at least in GL) so we don't need this workaround to prevent page faults on mem<->gmem. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-01 11:26:01 -04:00
Rob Clark	1866f76f7b	freedreno/a5xx: fix page faults on last level We could alternatively fall back to using "old style" draws for mem<->gmem (ie. what <= a4xx do) when height is not aligned to 32, but that is somewhat more work (and not really something that could be applied to stable) Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-01 10:50:11 -04:00
Rob Clark	afde9294b5	freedreno/ir3: fix issue w/ glamor composite shaders Fixes an issue that became possible when we started lowering phi webs to regs (`a7ea2b4e`) (although was not really seen until we also switched to using peephole select pass (`ec8bc54a`) instead of lowering all if/else to select). If texture coord (or anything else that uses create_collect() to collect scalar values in a sequence of scalar registers) was consuming a value produced on either side of an if/else (ie. a phi lowered to nir reg, which in ir3 is an "array" of length 1) then register allocation would happen incorrectly and we'd end up sampling from garbage coordinates. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 16:25:13 -04:00
Rob Clark	2191a18e75	freedreno/ir3: more half-precision fixes Some instructions require src/dst to be in full or half precision register depending on src/dst type. So do a better job of propagating register type. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:16:16 -04:00
Rob Clark	e04e068f75	freedreno/ir3: add helper to create immed of specified size We'll also need to be able to create a half-precision immediate. So re-work create_immed(). Prep work for following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:13:11 -04:00
Rob Clark	1f45320e51	freedreno/ir3: pass ctx instead of block to create_collect() Prep work for following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:12:33 -04:00
Rob Clark	bd2ca2bcdd	freedreno/ir3: eliminate unused false-deps Previously false-dependencies would get flagged as used, even if the only "use" was a false dep to (for example) prevent a load from being scheduled after a store. In addition to being pointless instructions, in some cases they can cause problems. For example, ldg (and similar instructions) depend on an immed arg getting CP'd into the instruction, but this doesn't happen if an instruction is otherwise unused. Which can result in undefined results (overwriting unintended registers). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:11:46 -04:00
Rob Clark	4f78383809	freedreno/ir3: add local_group_size Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:10:56 -04:00
Rob Clark	96e7927fb2	freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too Avoids a misleading "INVALID FLAGS" warning in debug builds. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:10:16 -04:00
Rob Clark	6514b4e3fd	freedreno/ir3: print array live ranges This is also useful to see if optmsgs are enabled. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:09:42 -04:00
Wladimir J. van der Laan	e8e3aa68d6	freedreno: a2xx: Implement DP2 instruction Use DOT2ADDv instruction with 0.0f constant add. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	79d6b194f2	freedreno: a2xx: implement SEQ/SNE instructions Extend translate_sge_slt to emit these, in analogous fashion but using CNDEv. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	837fabaaa3	freedreno: a2xx: Compressed textures support Add support for: - PIPE_FORMAT_ETC1_RGB8 - PIPE_FORMAT_DXT1_RGB - PIPE_FORMAT_DXT1_RGBA - PIPE_FORMAT_DXT3_RGBA - PIPE_FORMAT_DXT5_RGBA Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	92d529e7e4	freedreno: a2xx: Support TEXTURE_RECT Denormalized texture coordinates are required for text rendering in GALLIUM_HUD. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	6be017fdc4	freedreno: a2xx: Prevent crash in emit_texture if view is not set Textures will sometimes be updated if texture view state was un-set, without this change that causes an assertion crash or segfault. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	fb41372761	freedreno: a2xx: Fix fd2_tex_swiz Compose swizzles using util_format_compose_swizzles instead of the custom code (which somehow had a bug). This makes the GL_ALPHA internal format work. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	faed84a615	freedreno: a2xx: Change use of BLEND_ to BLEND2_ Change use of BLEND_ to BLEND2_, BLEND_* a3xx_rb_blend_opcode BLEND2_* is a2xx_rb_blend_opcode This makes no effective difference as the used enumerant has the same value (0), but the other enumerants do not match 1-to-1 so this will avoid future problems. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	cb6dd7070f	freedreno: a2xx: Update rnndb header for formats enumeration The format enumeration comes from the yamoto register headers that are part of the amd-gpu kernel driver. (see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43) Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Mathias Fröhlich	1da345e569	vbo: Use alloca for _vbo_draw_indirect. Avoid using malloc in the draw path of mesa. Since the draw_count is a user api input, fall back to malloc if the amount of consumed stack space may get too high. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:15 +02:00
Mathias Fröhlich	3f1cd957d3	vbo: Remove unused includes to vbo_private.h Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	6e9f00e3fc	vbo: Move vbo_split into the tnl module. Move the files, adapt to the naming scheme in tnl, update callers and build system. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	245f9a3977	vbo: Readd the arrays argument to the legacy draw methods. The legacy draw paths from back before 2012 contained a gl_vertex_array array for the inputs to be used for draw. So all draw methods from legacy drivers and everything that goes through tnl were originally written for this calling convention. The same goes for tools like t_rebase or vbo_split, which even partly still have the original calling convention with such a pointer that is currently unused. Back in 2012 patch `50f7e75` mesa: move gl_client_array[] from vbo_draw_func into gl_context introduced Array._DrawArrays, which was something that was IMO aiming for a similar direction to Array._DrawVAO introduced recently. Now several tools like t_rebase and vbo_split, which are mostly used by tnl based drivers, would need to be converted to use the internal Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver backends that use any of these tools. Alternatively we can reintroduce the gl_vertex_array array in its call argument list and put these tools finally into the tnl directory. So this change reintroduces this gl_vertex_array array for the legacy draw paths that are still required for the tools t_rebase and vbo_split. A followup will move vbo_split also into tnl. Note that none of the affected drivers use the DriverFlags.NewArray driver bit. So it should be safe to remove this also for the legacy draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	461698af26	vbo: Remove the now unused vbo draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	784fdef4e7	tnl: Push down the gl_vertex_array inputs into tnl drivers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	7f8db5ca47	vbo: Remove vbo_indirect_draw_func. Remove the vbo_indirect_draw_func vbo callback and make the default implementation use the drivers main draw callback function directly. This will be needed with the next changes when drivers without own main drivers DrawIndirect implementation get moved to the main drivers Draw method. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	4db9d83a2d	i965: Push down the gl_vertex_array inputs into i965. Let the i965 backend have its own gl_vertex_array array and basically reimplement the way _vbo_draw works. Note that brw_draw_indirect_prims calls brw_draw_prims internally and gets its update to Array._DrawArray this way. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:12 +02:00
Mathias Fröhlich	fca1550550	gallium: Push down the gl_vertex_array inputs into gallium. Let the gallium backend have its own gl_vertex_array array and basically reimplement the way _vbo_draw works. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:12 +02:00
Jason Ekstrand	9978f55cd1	nir/validator: Validate that all used variables exist We were validating this for locals but nothing else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Jason Ekstrand	2b977989f3	intel/vec4: Set channel_sizes for MOV_INDIRECT sources Otherwise, any indirect push constant access results in an assertion failure when we start digging through the channel_sizes array. This fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert on Haswell. It should be a harmless no-op for GL since indirect push constants aren't used there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e69e5c7006` "i965/vec4: load dvec3/4 uniforms first in the..."	2018-03-30 17:20:27 -07:00
Jason Ekstrand	6018f5b079	nir/lower_indirect_derefs: Support interp_var_at intrinsics This fixes the fs-interpolateAtCentroid-block-array piglit test on i965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-03-30 17:20:27 -07:00
Jason Ekstrand	0517d65f96	nir/vars_to_ssa: Remove copies from the correct set Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-03-30 17:20:27 -07:00
Jason Ekstrand	a1452a94fc	nir: Return a cursor from nir_instr_remove Because nir_instr_remove is an inline wrapper around nir_instr_remove_v, the compiler should be able to tell that the return value is unused and not emit the extra code in most cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Jason Ekstrand	956f17395b	nir: Add src/dest num_components helpers We already have these for bit_size Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Brian Paul	bebf758c49	docs: document WGL_SWAP_INTERVAL env var Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-30 14:45:05 -06:00
Brian Paul	c8906b8459	st/wgl: check if WGL_SWAP_INTERVAL is defined in wglSwapIntervalEXT() This allows the WGL_SWAP_INTERVAL env var to override any application calls to wglSwapIntervalEXT(). Useful for debugging, or to set the interval to zero to effectively disable the swap interval. Note: we also rename the previous instance of SVGA_SWAP_INTERVAL to WGL_SWAP_INTERVAL since this is a WGL feature and not related to the svga driver. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-30 14:44:50 -06:00
Brian Paul	1bf201ddce	glapi: define GL_API to be KEYWORD1 in glapi_dispatch.c (v2) This fixes a Windows build warning where the prototypes for the ES function in the header file don't match the prototypes in this file because the GL_API and GLAPI macros are defined differently. v2: defined GL_API to KEYWORD1 instead of GLAPI, per Mathias. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-30 14:33:33 -06:00
Brian Paul	26bc983c83	spirv: s/uint/unsigned/ to fix MSVC build Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	f3164c2ed9	nir/spirv: s/uint32_t/SpvOp/ in various functions The MSVC compiler warns when the function parameter types don't exactly match with respect to enum vs. uint32_t. Use SpvOp everywhere. Alternately, uint32_t could be used everywhere. There doesn't seem to be an advantage to one over the other. Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	cb619a3c9a	nir/spirv: fix MSVC syntax error in vtn_handle_texture() Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	c58c9f712d	nir/spirv: move NORETURN annotation on _vtn_fail() prototype This needs to be before the function, not after, to compile with MSVC. This works with gcc too. Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	84be45fc20	nir/spirv: fix MSVC warning in vtn_align_u32() Fixes warning that "negation of an unsigned value results in an unsigned value". Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Neil Roberts	31d91f019b	spirv: Fix building with SCons The SCons build broke with commit `ba975140d3` because a SPIR-V function is called from Mesa main. This adds a convenience library for SPIR-V and adds it to everything that was including nir. It also adds both nir and spirv to drivers/x11/SConscript. Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2018-03-30 14:33:03 -06:00
Brian Paul	cdc34e2cea	mesa: fix MSVC bitshift overflow warnings In the BITFIELD_MASK() macro, if b==32 the expression evaluates to ~0u, but the compiler still sees the expression (1 << 32) in the unused part and issues a warning about integer bitshift overflow. Fix that by using (b) % 32 to ensure the max shift is 31 bits. This issue has been present for a while, but shows up much more often because of the recent VBO changes. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-03-30 11:04:32 -06:00
Brian Paul	fa18a427e9	st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size() Silences a compiler warning about unhandled enum switch cases. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-30 11:04:32 -06:00
Jakob Bornecrantz	e16b92ad7e	vbo: MaxVertexAttribStride is not always set This assert is hit on hardware which does not expose GL 4.4 or GLES 3.1. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-03-30 17:23:08 +01:00
Daniel Stone	696762eef5	x11: Only report supported DRI3/Present versions The version passed to QueryVersion requests is the version that the client supports. We were just passing in whatever version of XCB was present on the system, which may not be a version that Mesa actually explicitly supports, e.g. it might bring unwanted semantics. Set specific protocol versions which we support, and only pass those. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-30 16:53:51 +01:00
Samuel Pitoiset	2a329f4ada	radv: set SAMPLE_RATE to the number of samples of the current fb Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-30 17:32:15 +02:00
Brian Paul	fc1d1dbe81	nir: s/uint/unsigned/ to fix MSVC/MinGW build Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-30 08:37:59 -06:00
Eduardo Lima Mitev	e7fc18097e	i965: Don't call process_glsl_ir() for SPIR-V shaders v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader is SPIR-V. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	e7d97aa75d	i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders This is the main fork of the shader compilation code-path, where a NIR shader is obtained by calling spirv_to_nir() or glsl_to_nir(), depending on its nature. v2: Use 'spirv_data' member from gl_linked_shader to know which method to call. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	abb6d0797c	mesa/glspirv: Add a _mesa_spirv_to_nir() function This is basically a wrapper around spirv_to_nir() that includes arguments setup and post-conversion validation. v2: * Rebase update (SpirVCapabilities not a pointer anymore, spirv_to_nir_options added, and others). * Code-style improvements and remove debug hunk. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	16f6634e7f	mesa/program: Link SPIR-V shaders using the SPIR-V code-path Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	9c36e9f862	mesa/glspirv: Add _mesa_spirv_link_shaders() function This is the equivalent to link_shaders() from src/compiler/glsl/linker.cpp, but for SPIR-V programs. It just creates the program and its gl_linked_shader objects, giving drivers the opportunity to implement any linking of SPIR-V shaders they choose, at a later stage. v2: Bail out if we see more than one shader for the same stage, and add a corresponding comment. (Timothy Arceri) v3: * Adds also a linker error log to the condition above, with a reference to the specification issue. (Timothy Arceri) * Squash with the patch adding the function boilerplate (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	22b6b3d0a7	mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader This is a reference to the spirv_data object stored in gl_shader, which stores shader SPIR-V data that is needed during linking too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Nicolai Hähnle	ba975140d3	mesa: Implement glSpecializeShaderARB v2: * Use gl_spirv_validation instead of spirv_to_nir. This method just validates the shader. The conversion to NIR will happen later, during linking. (Alejandro Piñeiro) * Use gl_shader_spirv_data struct to store the SPIR-V data. (Eduardo Lima) * Use the 'spirv_data' member to tell if the gl_shader is a SPIR-V shader, instead of a dedicated flag. (Timothy Arceri) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	9063bf7ad8	nir/spirv: add gl_spirv_validation method ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new method, glSpecializeShader. Here we add a new function to do the validation for this function: From OpenGL 4.6 spec, section 7.2.1 "Shader Specialization", error table: "INVALID_VALUE is generated if <pEntryPoint> does not name a valid entry point for <shader>. INVALID_VALUE is generated if any element of <pConstantIndex> refers to a specialization constant that does not exist in the shader module contained in <shader>." v2: rebase update (spirv_to_nir options added, changes on the warning logging, and others) v3: include passing options on common initialization, doesn't call setjmp on common_initialization v4: (after Jason comments): * Rename common_initialization to vtn_builder_create * Move validation method and their helpers to own source file. * Create own handle_constant_decoration_cb instead of reuse existing one v5: put vtn_build_create refactoring to their own patch (Jason) v6: update after vtn_builder_create method renamed, add explanatory comment, tweak existing comment and commit message (Timothy)	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	bebe3d626e	spirv: add vtn_create_builder Refactored from spirv_to_nir, in order to be reused later. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> v2: renamed method (from vtn_builder_create), add explanatory comment (Timothy)	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	3761e675e2	i965: initialize SPIR-V capabilities Needed for ARB_gl_spirv. Those are not the same as those of the Intel Vulkan driver. From the ARB_spirv_extensions spec: "3. If a new GL extension is added that includes SPIR-V support via a new SPIR-V extension does it's SPIR-V extension also get enumerated by the SPIR_V_EXTENSIONS_ARB query?. RESOLVED. Yes. It's good to include it for consistency. Any SPIR-V functionality supported beyond the SPIR-V version that is required for the GL API version should be enumerated." So in addition to the core SPIR-V support, there is the possibility of specific GL extensions enabling specific SPIR-V extensions (and thus capabilities). That would mean that OpenGL and Vulkan may not support the same capabilities, even for the same driver. For this reason it is better to keep them separated. As an example: at the time of writing this patch the Intel Vulkan driver supports multiview, but no OpenGL multiview extension is supported. Note: we initialize SPIR-V capabilities at brwCreateContext instead of the usual brw_initialize_context_constants because we want to do that only if the extension is enabled. v2: * Rebase update (SpirVCapabilities not a pointer anymore) * Fill spirv capabilities for OpenGL >= 3.3 (Ian Romanick) v3: * Drop multiview support, as i965 doesn't support any multiview GL extension (Jason) * Fill spirv capabilities only if the extension is enabled (Jason) v4: Capabilities are supported only on gen7+. Added comment and assert (Jason)	2018-03-30 09:14:56 +02:00
Nicolai Hähnle	ca5cc78206	mesa: add gl_constants::SpirVCapabilities For drivers to declare which SPIR-V features they support. v2: Don't use a pointer (Ian Romanick) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Ian Romanick	19e0dd1ad3	i965: Don't request GLSL IR lowering of gl_VertexID Let the lowering in NIR handle it instead. This hurts one shader that occurs twice in shader-db (SynMark GSCloth) on IVB and HSW. No other shaders or platforms were affected. total cycles in shared programs: 253438422 -> 253438426 (0.00%) cycles in affected programs: 412 -> 416 (0.97%) helped: 0 HURT: 2 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2018-03-29 14:16:07 -07:00
Ian Romanick	2765633116	i965: Silence unused parameter warning src/mesa/drivers/dri/i965/brw_draw_upload.c: In function ‘double_types’: src/mesa/drivers/dri/i965/brw_draw_upload.c:225:34: warning: unused parameter ‘brw’ [-Wunused-parameter] double_types(struct brw_context *brw, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:16:04 -07:00
Ian Romanick	042ee4bea2	spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build Future changes will add generated files used only from src/compiler/glsl. These can't be built from Makefile.nir.am, and we can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and it would be silly anyway). v2: Do it for meson too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (the meson bits) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (the automake bits)	2018-03-29 14:16:01 -07:00
Ian Romanick	2c9621ee5c	compiler: All leaf Makefile.am should use += This slightly simplifies later changes that add more Makefile.*.am files. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:41 -07:00
Ian Romanick	4925347ec5	util: Include bitscan.h directly Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:30 -07:00
Ian Romanick	ef7a4c9015	util: Optimize util_is_power_of_two_nonzero Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:29 -07:00
Ian Romanick	cd18aa1e50	util: Use util_is_power_of_two_nonzero in u_vector Previously size=0, element_size=0 would have been allowed. That combination can only lead to despair. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:28 -07:00
Ian Romanick	22fbb5c594	util: Add and use util_is_power_of_two_nonzero Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:28 -07:00
Ian Romanick	d76c204d05	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:23 -07:00
Dylan Baker	a3a16d4aa7	meson: use dep_libdrm version for pkg-config This corrects pkg-config to use the libdrm version (as computed by the previous patch) instead of using a hardcoded value that may or may not (probably not) be right. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:20:52 -07:00
Dylan Baker	c445b1d56f	meson: Use the same version for all libdrm checks Currently each driver specifies its own version, and core libdrm specifies a version. In the most common case this is fine, since there will be exactly one libdrm installed on a system, but if there is more than one it's possible that mesa will be linked against different versions of libdrm. There is also the possibility that the current approach makes the pkg-config files we generate incorrect, since there could be #defines that use newer features if they're available. This patch corrects all of that. All of the versions are still set by driver (along with a default core version). Then all of the drivers that are enabled have their versions compared and the highest version is selected, then all libdrm checks are made with that version. v2: - Reorder the list to have the name first and whether the dependency is needed second (Eric) Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:20:52 -07:00
Dylan Baker	acadf06f56	meson: group libdrm dependencies The reason libdrm is after libdrm_* will be made clear in later patches. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:18:47 -07:00
Brian Paul	e520ca562a	gl.h: remove stale comment, trailing whitespace Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-29 08:46:55 -06:00
Brian Paul	4ff6a7b0de	glapi: add glBlendBarrier(), glPrimitiveBoundingBox() prototypes in glapi_dispatch.c, as we have for many other GLES functions. Fixes a cross-compile issue (missing prototype) when GLES support is disabled. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-03-29 08:45:10 -06:00
Brian Paul	5cd5878a1f	st/mesa: silence unhandled switch case warning And improve the unreachable() error message. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-29 08:45:10 -06:00
Henri Verbeet	0b73c86b80	mesa: Inherit texture view multi-sample information from the original texture images. Found running "The Witness" in Wine. Without this patch, texture views created on multi-sample textures would have a GL_TEXTURE_SAMPLES of 0. All things considered such views actually work surprisingly well, but when combined with (plain) multi-sample textures in a framebuffer object, the resulting FBO is incomplete because the sample counts don't match. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-29 14:38:25 +04:30
Samuel Pitoiset	e45fe0ed66	radv: fix scanning output_usage_mask with structs To fix a regression in: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct And the following regressions (Polaris only): dEQP-VK.glsl.indexing.varying_array.* Fixes: `f3275ca01c` ("ac/nir: only enable used channels when exporting parameters") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-29 10:22:10 +02:00
Karol Herbst	6179a87c1e	nvc0/ir: fix emitting NOTs with predicates Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-03-29 03:06:36 +02:00
Aaron Watry	1dae92f150	broadcom/vc4: Fix out-of-tree build with automake. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-28 17:48:41 -07:00
Eric Anholt	81f82ecc56	broadcom/vc5: Start using nir_opt_move_load_ubo(). In the absence of a general NIR or VIR-level scheduler, this at least avoids spilling in GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts	2018-03-28 17:48:41 -07:00
Eric Anholt	1fe4c748f7	broadcom/vc5: Fix setup of integer surface clear values. I'm disappointed that the compiler didn't warn me about use of uninitialized uc in these paths. Just use the incoming clear color instead of the packing temporary if we're doing our own packing. Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*	2018-03-28 17:48:41 -07:00
Eric Anholt	123ee37627	broadcom/vc5: Stop trying to swizzle around RGBA4 clear color. We always want A in the A slot in the tile buffer, and any other swapping should happen elsewhere. Fixes RGBA4-using cases in fbo-clear-formats and GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.	2018-03-28 17:48:41 -07:00
Eric Anholt	2f4c4e10c2	broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard. The 7268 HW apparently lets some rendering through in this case. Fixes GTF-GLES2.gtf.GL2FixedTests.scissor.scissor	2018-03-28 17:48:41 -07:00
Eric Anholt	0349c79bdc	st: Don't try to finalize the texture in st_render_texture(). We can't necessarily finalize the texture at this point if we're rendering to a texture image whose format is different from the baselevel's format. This was introduced as a fix for fbo-incomplete-texture-03 in `de414f4915`, but the later fix for vmware on that testcase in `95d5c48f68` made it unnecessary. Fixes assertion failures in util_resource_copy_region() in KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize an R8 texture image to the RG8 texture object's pt. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-28 17:48:41 -07:00
Marek Olšák	e159d46fc7	drirc: whitelist glthread for Medieval II: TW, Carnivores: DHR, Far Cry 2	2018-03-28 20:00:48 -04:00
Daniel Schürmann	b91cd5dba4	radv: enable VK_AMD_shader_trinary_minmax extension Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:39 +02:00
Daniel Schürmann	d00fb7ce54	ac: add support for trinary_minmax instructions v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:35 +02:00
Dave Airlie	fe5d5d19b0	spirv: add support for SPV_AMD_shader_trinary_minmax Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:29 +02:00
Dave Airlie	3e830a1af2	nir: add support for min/max/median of 3 srcs These are needed for SPV_AMD_shader_trinary_minmax, the AMD HW supports these. Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:28:58 +02:00
Marek Olšák	025105453a	radeonsi: simplify DCC format categories Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Marek Olšák	3fea237c85	radeonsi: don't use the SPI barrier management bug workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Marek Olšák	3045c5f274	radeonsi: use maximum OFFCHIP_BUFFERING on Vega12 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Bas Nieuwenhuizen	4503ff760c	ac/nir: Add workaround for GFX9 buffer views. On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 00:03:03 +02:00
Marek Olšák	4f96747530	ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 17:23:41 -04:00
Samuel Pitoiset	1c4fdcf444	radv: enable VK_EXT_sampler_filter_minmax Only enable for CIK+ because it's buggy on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	413d77e7f9	radv: add support for VK_EXT_sampler_filter_minmax The driver only supports the required formats for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	99b52aa1da	radv: rename VEGA10 device name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:17 +02:00
Samuel Pitoiset	4d2c46dda3	radv: add support for Vega12 Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:14 +02:00
Matt Turner	3e6326deb9	build: Fix up nir_intrinsics.Plo nir_intrinsics.c existed as a static file until commit `76dfed8ae2` began generating it as part of the build process. autotools is incapable of coping, and so a build-tree from before this commit would then fail with it: [4]: *** No rule to make target '../../../mesa/src/compiler/nir/nir_intrinsics.c', needed by 'nir/nir_intrinsics.lo'. Stop. Add a few lines to configure.ac to update the broken build files. Fixes: `76dfed8ae2` ("nir: mako all the intrinsics")	2018-03-28 11:09:23 -07:00
Dylan Baker	2cfc68d984	autotools: Include intel/dev/meson.build in tarball Fixes: `272bef0601` ("intel: Split gen_device_info out into libintel_dev") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-28 10:19:05 -07:00
Dylan Baker	bc2fdb9759	autotools: include meson_get_version Otherwise meson won't read the VERSION file and won't set a version. That means that pkg-config files will have version unset as well. Fixes: `3e9533d9b8` ("meson: Add script to use VERSION file for getting version") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-28 10:13:23 -07:00
Eric Engestrom	d77844a529	docs: fix 18.0 release note version Fixes: `839fb3a696` "docs: Update 18.0.0 release notes" Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-28 16:52:56 +01:00
Marek Olšák	20eb44ad65	radeonsi: add support for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Marek Olšák	5425d32fcf	amd/addrlib: update to the latest version for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Eric Engestrom	431a1d12cc	gbm: remove never-implemented function I assume this was implemented in a previous version of that commit, but was removed in the version that actually landed. Fixes: `8430af5ebe` "Add support for swrast to the DRM EGL platform" Cc: Giovanni Campagna <gcampagna@src.gnome.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-28 16:25:52 +01:00
Stefan Schake	77ade10c86	android: Use new nir intrinsics python scripts Fixes: `76dfed8ae2` ("nir: mako all the intrinsics") Signed-off-by: Stefan Schake <stschake@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-28 14:48:47 +03:00
Eric Anholt	a691fa4a1b	broadcom/vc5: Fix padding of NPOT miplevels >= 2. The power-of-two padded size that gets minified is based on level 1's dimensions, not level 0's, which starts to differ at a width of 9. Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1	2018-03-27 21:16:23 -07:00
Timothy Arceri	92fa89a08d	ac/radeonsi: pass bindless bool to load_sampler_desc() We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:16 +11:00
Timothy Arceri	5411b98d52	st/glsl_to_nir: set driver location for bindless images and samplers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:15 +11:00
Timothy Arceri	f94b6b79be	radeonsi/nir: set uses_bindless_samplers for samplers Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:15 +11:00
Timothy Arceri	5c810a2c05	nir: add bindless to nir data Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 12:56:15 +11:00
Kenneth Graunke	fb18d0dbe4	i965: Drop unnecessary bo->align field. bo->align is always 0; there's no need to waste 8 bytes storing it. Thanks to C99 initializers zeroing fields, we can completely drop the only read of the field altogether. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	037d738a23	i965: Drop unused alignment parameter from brw_bo_alloc(). brw_bo_alloc no longer uses this parameter, so there's no point. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	07ec3a2e0f	i965: Drop alignment parameter from bo_alloc_internal(). Buffers are always page aligned on 965+ hardware; I believe this extra parameter is a vestige from the Gen2-3 era. All callers pass 0, and in fact we assert that the alignment is 0 unless BO_ALLOC_BUSY is set (for some reason). We can just drop the parameter and set the value to 0 explicitly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	b9a54b18f6	i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo(). intel_miptree_create_for_bo does not actually allocate a BO, so specifying allocation flags accomplishes nothing and is confusing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	2c01215c1b	i965: Drop PIPE_CONTROL_NO_WRITE from various calls. This is just zero - passing nothing already gives us a post-sync operation of "nothing". Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-27 18:41:44 -07:00
Jason Ekstrand	5f21a7afe0	nir/intrinsics: Don't report negative dest_components I have no idea why but having dest_components == -1 was causing a memory leak somewhere. Without this, you can't get through a full shader-db run without running out of memory. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-27 18:18:26 -07:00
Jason Ekstrand	7e38f49a8f	intel/fs: Don't emit a dest copy for image ops with has_dest == false This was causing us to walk dest_components times over a thing with no destination. This happened to work because all of the image intrinsics without a destination also happened to have dest_components == 0. We shouldn't be reading dest_components if has_dest == false. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-27 18:18:21 -07:00
Ilia Mirkin	776e6af879	nvc0/ir: fix INTERP_* with indirect inputs There were two problems, both of which are fixed now: - The indirect address was not being shifted by 4 - The indirect address was being placed as an argument in the offset case This fixes some of the new interpolateAt* piglits which now test for these situations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-03-27 20:41:11 -04:00
Timothy Arceri	629ee690ad	nir: fix crash in loop unroll corner case When an if nested inside another if is optimised away we can end up with a loop terminator and following block that looks like this: if ssa_596 { block block_5: /* preds: block_4 */ vec1 32 ssa_601 = load_const (0xffffffff /* -nan */) break /* succs: block_8 */ } else { block block_6: /* preds: block_4 */ /* succs: block_7 */ } block block_7: /* preds: block_6 */ vec1 32 ssa_602 = phi block_6: ssa_552 vec1 32 ssa_603 = phi block_6: ssa_553 vec1 32 ssa_604 = iadd ssa_551, ssa_66 The problem is the phis. Loop unrolling expects the last block in the loop to be empty once we splice the instructions in the last block into the continue branch. The problem is we can't move phis, so here we lower the phis to regs when preparing the loop for unrolling. As it could be possible to have multiple additional blocks/ifs following the terminator we just convert all phis at the top level of the loop body for simplicity. We also add some comments to loop_prepare_for_unroll() while we are here. Fixes: `51daccb289` "nir: add a loop unrolling pass" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670	2018-03-28 09:59:38 +11:00
Timothy Arceri	48f6014903	st/glsl_to_nir: correctly handle arrays packed across multiple vars Fixes piglit test: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:38 +11:00
Timothy Arceri	b260efbd5e	radeonsi/nir: fix input processing for packed varyings The location was only being incremented the first time we processed a location. This meant we would incorrectly skip some elements of an array if the first element was packed and processed previously but other elements were not. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:38 +11:00
Timothy Arceri	51f175028d	ac/nir_to_llvm: fix component packing for double outputs We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Timothy Arceri	fc51fdbcde	st/glsl_to_nir: fix driver location for dual-slot packed doubles Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Timothy Arceri	47eee04556	radeonsi/nir: fix scanning of multi-slot output varyings This fixes tcs/tes varying arrays where we don't lower indirects and therefore don't split arrays. Here we also fix usagemask for dual slot doubles. Fixes a number of arb_tessellation_shader piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Eric Anholt	9f1b4f6204	broadcom/vc5: Fix RG16I/UI texture sampling. How many times did I look at this table without noticing the missing 'G' in the texture column? Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.	2018-03-27 15:49:58 -07:00
Rob Clark	16581904b0	nir: fix generated nir_intrinsics.c for MSVC Apparently it is not happy about things like: .foo = {} So skip over initializers for empty lists. Fixes: `76dfed8ae2` Reported-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-27 15:01:11 -04:00
Emil Velikov	eda2f58d15	docs: update calendar 18.0.0 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-27 19:11:45 +01:00
Emil Velikov	02f89b62fe	docs: add news item and link release notes for 18.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-27 19:08:48 +01:00
Emil Velikov	62eb721ed8	docs: add sha256 checksums for 18.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `fb64913d19`)	2018-03-27 19:06:27 +01:00
Emil Velikov	839fb3a696	docs: Update 18.0.0 release notes Note: the file was originally 17.4.0, yet git struggles to detect the move :-\ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `dceb1ce807`)	2018-03-27 19:06:19 +01:00
Rob Clark	76dfed8ae2	nir: mako all the intrinsics I threatened to do this a long time ago.. I probably should have done it a long time ago when there were many fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and python has the nice property of optional fxn params, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional params). And not having to specify various array lengths explicitly is nice too. I think the end result makes it easier to add new intrinsics. v2: couple small fixes found with a test program to compare the old and new tables v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and mostly remove side-effects, add autotools build support v4: scons build Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 08:36:37 -04:00
Rob Clark	cc3a88e81d	nir: fix per_vertex_output intrinsic This is supposed to have both BASE and COMPONENT but num_indices was inadvertently set to 1. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 08:20:40 -04:00
Rob Clark	1e0a06000b	glsl_types: fix build break with intel/msvc compiler The VECN() macro was taking advantage of a GCC specific feature that is not available on lesser compilers, mostly for the purposes of avoiding a macro that encoded a return statement. But as suggested by Ian, we could just have the macro produce the entire method body and avoid the need for this. So let's do that instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740 Fixes: `f407edf340` Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Ian Romanick <idr@freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-27 08:17:11 -04:00
Lin Johnson	41cf30b8bc	mesa: add GL_HALF_FLOAT as supported type to readpixels EXT_color_buffer_float spec states: "An INVALID_OPERATION error is generated ... if the color buffer is a floating-point format and type is not FLOAT, HALF FLOAT, or UNSIGNED_INT_10F_11F_11F_REV." This means that GL_HALF_FLOAT type should be supported when color buffer has floating-point format. Fixes Android CTS test android.view.cts.PixelCopyTest. v2: remove comments of EXT_color_buffer_half_float as EXT_color_buffer_float can use type GL_HALF_FLOAT Signed-off-by: Lin Johnson <johnson.lin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-27 09:04:52 +03:00
Eric Anholt	0024b77e87	broadcom/vc5: Fix swizzling of RGB10_A2UI render targets. This is the actual hardware layout, and we were only swizzling R/B back around in texturing. Fixes part of KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in simulation.	2018-03-26 17:46:23 -07:00
Eric Anholt	c2b13627d9	broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes. Just like TLB without a config uniform, we don't have a register index.	2018-03-26 17:46:23 -07:00
Eric Anholt	494da6c2dd	broadcom/vc5: Implement workaround for GFXH-1431. This should fix some blending errors, but doesn't impact any testcases in the CTS.	2018-03-26 17:46:19 -07:00
Eric Anholt	1bf466270d	broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well. Once we've disabled EZ for some draws, we need to not use EZ on future draws. Implementing that made implementing the GT/GE direction trivial. Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.	2018-03-26 17:46:19 -07:00
Eric Anholt	262208eb3c	broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled. On 3.x, we just don't flag the primitive as needing TF, but those primitive bits are now allocated to the new primitive types. Now we need to actually update the enable flag at draw time.	2018-03-26 17:46:19 -07:00
Eric Anholt	ef2cf9cc3c	broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job. The next job from this client will turn it back on unless TF gets disabled, but we don't want the state to leak from this client to another (which causes GPU hangs).	2018-03-26 17:46:19 -07:00
Eric Anholt	1fa820cef8	broadcom/vc5: Move the BCL epilogue code to a per-version compile. I need to do some new packets for transform feedback on 4.1.	2018-03-26 17:46:19 -07:00
Eric Anholt	3387864130	broadcom/vc5: Fix transform feedback in the presence of point size. I had this note to myself, and it turns out that a lot of CTS tests use XFB with points to get data out without using a fragment shader. Keep track of two sets of precomputed TF specs (point size in VPM prologue or not), and switch between them when we enable/disable point size.	2018-03-26 17:46:19 -07:00
Eric Anholt	09ac5ade8f	broadcom/vc5: Split transform feedback specs update from buffers. The specs update will be changing based on additional state flags in the next commit, and this unindents the buffer update code.	2018-03-26 17:46:18 -07:00
Eric Anholt	9e62aec9cd	broadcom/vc5: Limit each transform feedback data spec to 16 dwords. The length-1 field only has 4 bits, so we need to generate separate specs when there's too much TF output per buffer. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type and transform_feedback_max_interleaved.	2018-03-26 17:33:37 -07:00
Eric Anholt	0356db022d	gallium/u_vbuf: Protect against overflow with large instance divisors. GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1 as a divisor, so we would overflow to count=0 and upload no data, triggering the assert below. We want to upload 1 element in this case, fixing the test on VC5. v2: Use some more obvious logic, and explain why we don't use the normal round_up(). Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 17:33:37 -07:00
Eric Anholt	d491ad1d36	st: Allow accelerated CopyTexImage from RGBA to RGB. There's nothing to worry about here -- the A channel just gets dropped by the blit. This avoids a segfault in the fallback path when copying from a RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an RGBA16_SINT texture (the fallback path tries to get/fetch to float buffers, but the float pack/unpack functions are NULL for SINT/UINT). Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5. v2: Extract the logic to a helper function and explain what's going on better. v3: const-qualify args Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 17:33:37 -07:00
Marek Olšák	7d2079908d	winsys/amdgpu: always allow GTT placements on APUs Reviewed-by: Christian König <christian.koenig@amd.com>	2018-03-26 19:23:30 -04:00
Marek Olšák	769603564e	radeonsi: don't reallocate on DMABUF export if local BOs are disabled	2018-03-26 19:22:12 -04:00
Timothy Arceri	56b867395d	glsl: fix infinite loop caused by bug in loop unrolling pass Just checking for 2 jumps is not enough to be sure we can do a complex loop unroll. We need to make sure we have also found 2 loop terminators. Without this we were attempting to unroll a loop where the second jump was nested inside multiple ifs which loop analysis is unable to detect as a terminator. We ended up splicing out the first terminator but failed to actually unroll the loop; this resulted in the creation of a possible infinite loop. Fixes: `646621c66d` "glsl: make loop unrolling more like the nir unrolling path" Tested-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670	2018-03-27 09:15:02 +11:00
Vinson Lee	dc94a0506f	gallium: Do not add -Wframe-address option for gcc <= 4.4. This patch fixes these build errors with GCC 4.4. Compiling src/gallium/auxiliary/util/u_debug_stack.c ... src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’: src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions Fixes: `370e356eba` ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-26 11:23:51 -07:00
Alyssa Rosenzweig	029f1a2d61	gallium: Correct minor typo in header comments Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-26 10:15:04 -07:00
Rafael Antognolli	27581d18bc	intel/aubinator_error_decode: Decode more registers. Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	70d7c70e8d	intel/genxml: Add SAMPLER_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	227edf05f3	intel/genxml: Add ROW_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	4c0ae36143	intel/genxml: Add SC_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Ian Romanick	91225cb33f	i965/vec4: Fix null destination register in 3-source instructions A recent commit (see below) triggered some cases where conditional modifier propagation and dead code elimination would cause a MAD instruction like the following to be generated: mad.l.f0 null, ... Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases like this in the scalar backend. This commit basically ports that code to the vec4 backend. NOTE: I have sent a couple tests to the piglit list that reproduce this bug without the commit mentioned below. This commit fixes those tests. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Fixes: `ee63933a7` ("nir: Distribute binary operations with constants into bcsel") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704	2018-03-26 08:50:44 -07:00
Ian Romanick	2c643fd978	nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional Now that i965 recognizes that a-b generates the same conditions as 'a < b', there is no reason to condition this transformation on 'is not used by conditional.' Since this was the only user of the is_not_used_by_conditional function, delete it. All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14400775 -> 14400595 (<.01%) instructions in affected programs: 36712 -> 36532 (-0.49%) helped: 182 HURT: 26 helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1 helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90% 95% mean confidence interval for instructions value: -0.97 -0.76 95% mean confidence interval for instructions %-change: -0.59% -0.43% Instructions are helped. total cycles in shared programs: 532929592 -> 532926345 (<.01%) cycles in affected programs: 478660 -> 475413 (-0.68%) helped: 187 HURT: 22 helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18 helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03% HURT stats (abs) min: 1 max: 214 x̄: 30.86 x̃: 11 HURT stats (rel) min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86% 95% mean confidence interval for cycles value: -19.50 -11.57 95% mean confidence interval for cycles %-change: -1.42% -0.58% Cycles are helped. GM45 and Iron Lake had similar results. 
(Iron Lake shown) total cycles in shared programs: 177851578 -> 177851810 (<.01%) cycles in affected programs: 24408 -> 24640 (0.95%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44% HURT stats (abs) min: 24 max: 108 x̄: 60.00 x̃: 54 HURT stats (rel) min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02% 95% mean confidence interval for cycles value: -7.75 85.08 95% mean confidence interval for cycles %-change: -0.39% 1.49% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	cd635d149b	i965/vec4: Propagate conditional modifiers from compares to adds No changes on Broadwell or later as those platforms do not use the vec4 backend. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11682119 -> 11681056 (<.01%) instructions in affected programs: 150403 -> 149340 (-0.71%) helped: 950 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1 helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71% 95% mean confidence interval for instructions value: -1.19 -1.04 95% mean confidence interval for instructions %-change: -0.84% -0.79% Instructions are helped. total cycles in shared programs: 257495842 -> 257495238 (<.01%) cycles in affected programs: 270302 -> 269698 (-0.22%) helped: 271 HURT: 13 helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2 helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28% HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26% 95% mean confidence interval for cycles value: -2.41 -1.84 95% mean confidence interval for cycles %-change: -0.31% -0.26% Cycles are helped. Sandy Bridge total instructions in shared programs: 10430493 -> 10429727 (<.01%) instructions in affected programs: 120860 -> 120094 (-0.63%) helped: 766 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. total cycles in shared programs: 146138718 -> 146138446 (<.01%) cycles in affected programs: 244114 -> 243842 (-0.11%) helped: 132 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2 helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19% 95% mean confidence interval for cycles value: -2.12 -2.00 95% mean confidence interval for cycles %-change: -0.18% -0.15% Cycles are helped. 
GM45 and Iron Lake had identical results. (Iron Lake shown) total instructions in shared programs: 7780251 -> 7780248 (<.01%) instructions in affected programs: 175 -> 172 (-1.71%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49% total cycles in shared programs: 177851584 -> 177851578 (<.01%) cycles in affected programs: 9796 -> 9790 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	780f307ba8	i965/vec4: Allow cmod propagation when src0 is a uniform or shader input No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help more shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	020b0055e7	i965/fs: Propagate conditional modifiers from compares to adds The math inside the add and the cmp in this instruction sequence is the same. We can utilize this to eliminate the compare. add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This is reduced to: add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This optimization pass could do even better. The nature of converting vectorized code from the GLSL front end to scalar code in NIR results in sequences like: add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q }; In this sequence, only the first cmp.z is removed. With different scheduling, all 3 could get removed. Skylake total instructions in shared programs: 14407009 -> 14400173 (-0.05%) instructions in affected programs: 1307274 -> 1300438 (-0.52%) helped: 4880 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.45 -1.35 95% mean confidence interval for instructions %-change: -0.72% -0.69% Instructions are helped. 
total cycles in shared programs: 532943169 -> 532923528 (<.01%) cycles in affected programs: 14065798 -> 14046157 (-0.14%) helped: 2703 HURT: 339 helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2 helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21% HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12 HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41% 95% mean confidence interval for cycles value: -8.66 -4.26 95% mean confidence interval for cycles %-change: -0.24% -0.14% Cycles are helped. LOST: 0 GAINED: 1 Broadwell total instructions in shared programs: 14719636 -> 14712949 (-0.05%) instructions in affected programs: 1288188 -> 1281501 (-0.52%) helped: 4845 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1 helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.43 -1.33 95% mean confidence interval for instructions %-change: -0.72% -0.68% Instructions are helped. total cycles in shared programs: 559599253 -> 559581699 (<.01%) cycles in affected programs: 13315565 -> 13298011 (-0.13%) helped: 2600 HURT: 269 helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2 helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20% HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20 HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75% 95% mean confidence interval for cycles value: -8.47 -3.77 95% mean confidence interval for cycles %-change: -0.27% -0.18% Cycles are helped. LOST: 0 GAINED: 8 Haswell total instructions in shared programs: 12978609 -> 12973483 (-0.04%) instructions in affected programs: 932921 -> 927795 (-0.55%) helped: 3480 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58% 95% mean confidence interval for instructions value: -1.53 -1.42 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. 
total cycles in shared programs: 410270788 -> 410250531 (<.01%) cycles in affected programs: 10986161 -> 10965904 (-0.18%) helped: 2087 HURT: 254 helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4 helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21% HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16 HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47% 95% mean confidence interval for cycles value: -12.82 -4.49 95% mean confidence interval for cycles %-change: -0.31% -0.18% Cycles are helped. LOST: 0 GAINED: 5 Ivy Bridge total instructions in shared programs: 11686082 -> 11681548 (-0.04%) instructions in affected programs: 937696 -> 933162 (-0.48%) helped: 3150 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49% 95% mean confidence interval for instructions value: -1.49 -1.38 95% mean confidence interval for instructions %-change: -0.71% -0.67% Instructions are helped. total cycles in shared programs: 257514962 -> 257492471 (<.01%) cycles in affected programs: 11524149 -> 11501658 (-0.20%) helped: 1970 HURT: 239 helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3 helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17% HURT stats (abs) min: 1 max: 1358 x̄: 50.00 x̃: 15 HURT stats (rel) min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65% 95% mean confidence interval for cycles value: -17.01 -3.35 95% mean confidence interval for cycles %-change: -0.33% -0.08% Cycles are helped. LOST: 9 GAINED: 1 Sandy Bridge total instructions in shared programs: 10432841 -> 10429893 (-0.03%) instructions in affected programs: 685071 -> 682123 (-0.43%) helped: 2453 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1 helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46% 95% mean confidence interval for instructions value: -1.23 -1.17 95% mean confidence interval for instructions %-change: -0.67% -0.62% Instructions are helped. 
total cycles in shared programs: 146133660 -> 146134195 (<.01%) cycles in affected programs: 3991634 -> 3992169 (0.01%) helped: 1237 HURT: 153 helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2 helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14% HURT stats (abs) min: 1 max: 1740 x̄: 59.56 x̃: 12 HURT stats (rel) min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42% 95% mean confidence interval for cycles value: -5.13 5.90 95% mean confidence interval for cycles %-change: -0.17% 0.16% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 GM45 and Iron Lake had similar results (GM45 shown): total instructions in shared programs: 4800332 -> 4798380 (-0.04%) instructions in affected programs: 565995 -> 564043 (-0.34%) helped: 1451 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1 helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31% 95% mean confidence interval for instructions value: -1.40 -1.29 95% mean confidence interval for instructions %-change: -0.50% -0.45% Instructions are helped. total cycles in shared programs: 122032318 -> 122027798 (<.01%) cycles in affected programs: 8334868 -> 8330348 (-0.05%) helped: 1029 HURT: 1 helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2 helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04% HURT stats (abs) min: 38 max: 38 x̄: 38.00 x̃: 38 HURT stats (rel) min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25% 95% mean confidence interval for cycles value: -4.70 -4.08 95% mean confidence interval for cycles %-change: -0.09% -0.08% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	5bbb3d60d3	i965/fs: Allow cmod propagation when src0 is a uniform or shader input No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help about 900 more shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	8f83eea71e	i965: Add negative_equals methods This method is similar to the existing ::equals methods. Instead of testing that two src_regs are equal to each other, it tests that one is the negation of the other. v2: Simplify various checks based on suggestions from Matt. Use src_reg::type instead of fixed_hw_reg.type in a check. Also suggested by Matt. v3: Rebase on 3 years. Fix some problems with negative_equals with VF constants. Add fs_reg::negative_equals. v4: Replace the existing default case with BRW_REGISTER_TYPE_UB, BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF. Suggested by Matt. Expand the FINISHME comment to better explain why it isn't already finished. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Gert Wollny	a21da49e5c	mesa/st/tests: Use tgsi opcode enum also in the test classes Fixes: `ec478cf9c3` ("st/mesa,tgsi: use enum tgsi_opcode") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 09:04:53 -06:00
Eric Engestrom	1e36fe5dc4	meson: fix header check message before: Checking if "endian.h works" compiles: YES after: Checking if "endian.h" compiles: YES Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-03-26 09:59:32 +01:00
Rob Clark	2f181c8c18	glsl_types: vec8/vec16 support Not used in GL but 8 and 16 component vectors exist in OpenCL. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-25 10:42:54 -04:00
Rob Clark	f407edf340	glsl_types: refactor/prep for vec8/vec16 Refactor things so there isn't so much typing involved to add new things. Also drops a pointless conditional (out of bounds rows or columns already returns error_type in all paths.. might as well drop it rather than make the check more convoluted in the next patch by adding the vec8/vec16 case). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-25 10:42:54 -04:00
Jordan Justen	d60eaf7b1f	anv: Set genX_table for gen11 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-23 17:23:59 -07:00
Jordan Justen	af8535d02f	anv: Add gen11 to anv_genX_call Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-23 17:23:59 -07:00
Mathias Fröhlich	4a8ef1f5d4	vbo: Make sure the internal VAO's stay within limits. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:59:02 +01:00
Mathias Fröhlich	1a131aaf4b	mesa: Flag early if we modify a SharedAndImmutable VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:58:59 +01:00
Mathias Fröhlich	19526a57f5	mesa: When copying a VAO also copy the vertex attribute mode. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:58:54 +01:00
Emil Velikov	5a75019ad0	configure: use AC_CHECK_HEADERS to check for endian.h Currently we use the singular CHECK_HEADER combined with an explicit append to the DEFINES variable. That is a legacy misnomer, since it requires us to add $DEFINES to every piece that we build. Using the plural version of the helper sets the HAVE_ macro for us, plus ensures it's passed to the compiler - if config.h is available in there (not in the case of mesa) otherwise on the command line. In hindsight, we should replace all the AC_CHECK_{FUNC,HEADER} instances with the plural version (or even the _ONCE suffixed version) and drop the DEFINES hacks. Fixes: `cbee1bfb34` ("meson/configure: detect endian.h instead of trying to guess when it's available") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105717 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-03-23 18:12:52 +00:00
Kenneth Graunke	90f556f0b1	android: Use local i915_drm.h rather than the system one. Fixes: `2d26c99933` (intel: devinfo: meson: include drm uapi) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-03-23 10:05:02 -07:00
Brian Paul	e31d5bd2f9	st/mesa: s/unsigned/enum pipe_shader_type/ for st_bind_ubos() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	6a93deedf5	st/mesa: whitespace/formatting fixes in st_atom_constbuf.c Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	aad23f91ee	st/mesa: s/unsigned/enum pipe_shader_type/ Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	93581c2ca0	svga: simplify uses_flat_interp expression in emit_input_declarations() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	c99f46c2ac	svga: replace unsigned with proper enum names Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	7181a9fa0e	tgsi,softpipe: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	ec478cf9c3	st/mesa,tgsi: use enum tgsi_opcode Need to update the tgsi code and st_glsl_to_tgsi code at the same time to prevent compile break since C++ is much pickier about implicit enum/unsigned casting. Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to avoid MSVC signed enum overflow issue. No change in class size. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	ccecb2bbd3	tgsi/nir: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	22a3190c85	tgsi: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	9413d1c0fe	gallivm: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	7df96826f8	svga: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	4e0f967f6d	tgsi: convert opcode macros to enums Enums are nicer in gdb. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Lionel Landwerlin	412fae46c0	compiler: glsl: silence valgrind warning on write cache I don't think it actually fixes anything, but that's nice not to have valgrind warnings. It manifests itself when running the piglit test : glsl-fs-raytrace-bug27060 ==2058== Uninitialised byte(s) found during client check request ==2058== at 0xC5BB040: blob_write_bytes (blob.c:152) ==2058== by 0xC595359: write_variable (nir_serialize.c:144) ==2058== by 0xC59560C: write_var_list (nir_serialize.c:192) ==2058== by 0xC5982E4: nir_serialize (nir_serialize.c:1124) ==2058== by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835) ==2058== by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context, state_key) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157) ==2058== by 0xC1B50AF: update_program (state.c:134) ==2058== by 0xC1B56DF: _mesa_update_state_locked (state.c:352) ==2058== by 0xC1B579A: _mesa_update_state (state.c:386) ==2058== Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd ==2058== at 0x4C2CB8F: malloc (vg_replace_malloc.c:299) ==2058== by 0xC0FD306: ralloc_size (ralloc.c:121) ==2058== by 0xC0FD5B1: ralloc_array_size (ralloc.c:208) ==2058== by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable) (glsl_to_nir.cpp:448) ==2058== by 0xC45CE8B: ir_variable::accept(ir_visitor) (ir.h:428) ==2058== by 0xC46D0B5: visit_exec_list(exec_list, ir_visitor) (ir.cpp:1898) ==2058== by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162) ==2058== by 0xC0B5223: brw_create_nir (brw_program.c:79) ==2058== by 0xC0AAB67: brw_link_shader (brw_link.cpp:257) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context, state_key) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program 
(ff_fragment_shader.cpp:1157) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-23 13:05:12 +00:00
Eric Engestrom	cbee1bfb34	meson/configure: detect endian.h instead of trying to guess when it's available Cc: Maxin B. John <maxin.john@gmail.com> Cc: Khem Raj <raj.khem@gmail.com> Cc: Rob Herring <robh@kernel.org> Suggested-by: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Cc: <mesa-stable@lists.freedesktop.org>	2018-03-23 11:44:21 +00:00
Juan A. Suarez Romero	ee2b943fa8	wayland-drm: do not distribute generated sources Instead we will re-generate them again on building. v2: get rid of BUILT_SOURCES (Daniel, Emil) v3: keep BUILT_SOURCES for egl/Makefile.am (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-23 11:27:12 +01:00
Samuel Pitoiset	ccc64f3133	radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8 The hardware only supports 32-bit depth surfaces, but we can enable TC-compat HTILE for 16-bit depth surfaces if no Z planes are compressed. The main benefit is to reduce the number of depth decompression passes. Also, we don't need to implement DB->CB copies which is fine. This improves Serious Sam 2017 by +4%. Talos and F12017 are also affected but I don't see a performance difference. This also improves the shadowmapping Vulkan demo by 10-15% (FPS is now similar to AMDVLK). No CTS regressions on Polaris10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:57 +01:00
Samuel Pitoiset	5ae9772245	radv: add radv_calc_decompress_on_z_planes() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:55 +01:00
Samuel Pitoiset	9b8e75bee3	radv: add radv_image_is_tc_compat_htile() helper Instead of that huge conditional that's going to be crazy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:54 +01:00
Jason Ekstrand	884d27bcf6	nir: Rename image intrinsics to image_var Generated with git grep -l nir_intrinsic_image | xargs sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-23 13:48:11 +11:00
Dave Airlie	fa683385de	virgl: add ARB_cull_distance support. This just allows the properties through to the host if we have cull dist support. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-23 10:21:10 +10:00
Eric Anholt	d7a015cbc6	broadcom/vc5: Account for InstanceID/VertexID in VPM segment size. Fixes failure in GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size	2018-03-22 15:12:21 -07:00
Eric Anholt	b8387dbc49	broadcom/vc5: Allow FBOs with mixed color formats. This is required by GLES3, fixing GTF-GLES3.gtf.GL3Tests.framebuffer_srgb.framebuffer_srgb_draw	2018-03-22 15:12:21 -07:00
Eric Anholt	4f62679be5	broadcom/vc5: Add missing support for 2101010_REV vertex attributes. Fixes GTF-GLES3.gtf.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_invalid2, where we hadn't thrown a GL error as needed in the extension-disabled case. We want to be exposing the extension anyway.	2018-03-22 15:12:21 -07:00
Eric Anholt	ba29b89dc7	broadcom/vc5: Set up a vertex position if the shader doesn't. Our backend needs some sort of vertex position value to emit the scaled viewport values and such. Fixes potential segfaults in KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx	2018-03-22 15:12:21 -07:00
Lionel Landwerlin	903e9952fb	i965: add performance query support on CNL v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	e7f6d1e5f8	i965: perf: add support for new equation operators Some equations of the CNL metrics started to use operators we haven't defined yet, just add those. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	57a11550bc	i965: perf: query topology With the introduction of asymmetric slices in CNL, we cannot rely on the previous SUBSLICE_MASK getparam to tell userspace what subslices are available. We introduce a new uAPI in the kernel driver to report exactly which parts of the GPU are fused and require this to be available on Gen10+. Prior generations can continue to rely on GETPARAM on older kernels. This patch is quite a lot of code because we have to support lots of different kernel versions, ranging from not providing any information (for Haswell on 4.13 through 4.17), to being able to query through GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17 for Gen10+. This change stores topology information in a unified way on brw_context.topology from the various kernel APIs. And then generates the appropriate values for the equations from that unified topology. v2: Move slice/subslice masks fields to gen_device_info (Rafael) v3: Add a gen_device_info_subslice_available() helper (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c1900f5b0f	intel: devinfo: add helper functions to fill fusing masks values There are a couple of ways we can get the fusing information from the kernel: - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK parameters - Through the new DRM_IOCTL_I915_QUERY by requesting the DRM_I915_QUERY_TOPOLOGY_INFO The second method is more accurate and also gives us the EUs fusing masks. It's also a requirement for CNL as this platform has asymmetric subslices and the first method's SUBSLICE_MASK value is assumed uniform across slices. v2: Change gen_device_info_update_from_masks() to generate topology and call into gen_device_info_update_from_topology (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	2d26c99933	intel: devinfo: meson: include drm uapi Already available with the autotools build. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	5d3e74a5a5	drm-uapi: bump headers Required updates from drm-next for changes in i965. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c471716574	intel: devinfo: store slice/subslice/eu masks We want to store values coming from the kernel but as a first step, we can generate mask values out of the numbers already stored in the gen_device_info masks. v2: Add a helper to set EU masks (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	7e2c6147da	intel: devinfo: store number of EUs per subslice This will be reused to store values reported by the kernel. The main use case will be for use as the input values of the metric sets equations for the INTEL_performance_queries extension. By storing this information in the gen_device_info we make this non GL specific so this can be reused by Vulkan if we ever have an equivalent extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Dylan Baker	8e5988eb35	Revert "meson: merge C and C++ compiler arguments check" This reverts commit `cb2ddcefa5`. This causes clang to error out building C++ code. The plan is to fix the build to work with clang, but in the meantime we'll just revert this. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2018-03-22 11:35:08 -07:00
Lionel Landwerlin	1603ce1921	i965/perf: fix config registration when uploading to kernel When registering configurations to the kernel for the first time, we run into an issue where the id number is not properly set (we're using the wrong variable). As a result when trying to use that id later on, we get an error. This issue manifests itself the first time you use frameretrace after reboot, subsequent runs are fine. Fixes: `27ee83eaf7` ("i965: perf: add support for userspace configurations") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 18:21:57 +00:00
Lepton Wu	a8b846bccd	gallium/winsys/kms: Add support for multi-planes Add a new struct kms_sw_plane which delegate a plane and use it in place of sw_displaytarget. Multiple planes share same underlying kms_sw_displaytarget. v2: - add more check for plane size (Tomasz) v3: - split from larger patch (Emil) v4: - no change from v3 v5: - remove mapped field (Tomasz) v6: - remove change-id in commit message (Tomasz) v7: - add revision history in commit message (Emil) Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:44 +00:00
Lepton Wu	d891f28df9	gallium/winsys/kms: Fix possible leak in map/unmap. If user calls map twice for kms_sw_displaytarget, the first mapped buffer could get leaked. Instead of calling mmap every time, just reuse previous mapping. Since user could map same displaytarget with different flags, we have to keep two different pointers, one for rw mapping and one for ro mapping. Also introduce reference count for mapped buffer so we can unmap them at right time. v2: - avoid duplicated mapping and leaked mapping (Tomasz) v3: - split from larger patch (Emil) v4: - remove munmap from dt_destory (Emil) v5: - introduce reference count for mapping (Tomasz) - add back munmap in dt_destory v6: - remove change-id in commit message (Tomasz) v7: - remove munmap from dt_destory again (Emil) - add revision history in commit message (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero	4db269f30c	broadcom/vc4: add path to nir_builder.h As the other VC4 files do. Otherwise, it won't find nir_builder.h v2: add path in source code rather than changing autotools (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	d39e828c82	autotools: add tegra header files Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	40ecee89b7	swr/rast: autotools: add events_private.proto in dist tarball. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	0bf1274883	radv: autotools: add radv_extensions.h in the generated VULKAN list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	13459c637a	anv/radv: autotools: include vulkan_*.h headers Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	f8b749b7c0	nir: autotools, meson: add GLSL.ext.AMD.h in the files list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Matt Turner	724586a266	intel/compiler: Readd ICL to test_eu_validate.cpp Now that the PCI IDs are upstream, this can be readded.	2018-03-22 09:56:09 -07:00
Matt Turner	65b060d9cb	intel/compiler: Skip 64-bit type tests when types not available Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	ad7ed86bf7	intel: Add Ice Lake PCI IDs Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1065acfb69	intel: Disable fast color clear on icl Disabling fast color clear makes fbo-clearmipmap test render correct texture in base miplevel. Fast color clear is anyways disabled for non-base miplevels. Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Jason Ekstrand	d2eecf0b0b	intel/compiler/icl: Clear "null render target" bit in extended message descriptor Otherwise all our render target writes go nowhere. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1484876ef7	intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch() Rafael ran piglit with the test code enabled and saw no additional GPU hangs. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	f05e0d9c2a	intel/common/icl: Disable hiz surface sampling On gen11+ AUX_HIZ is not a supported value for surfaces being sampled by the 3D sampler. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	370af9dcc0	intel/common/icl: Add L3 config ICL uses the same L3 configs as CNL, just leaving the SLM configs out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Matt Turner	f56693af4b	intel/tools/aubinator: Drop platform list from print_help() We all know the platform names, and I don't want to update this list continually. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Derek Foreman	aa18a63512	egl/wayland: Make swrast display_sync the correct queue commit `03dd9a88b0` introduced per surface queues, but the display_sync for swrast_commit_backbuffer remained on the old queue. This is likely to break when dispatching the correct queue at the top of function (which can't dispatch the sync callback we're waiting for). The easiest known reproduction case is running weston-subsurfaces under weston --use-pixman Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-22 15:27:35 +00:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Eric Engestrom	cb2ddcefa5	meson: merge C and C++ compiler arguments check Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 11:59:12 +00:00
Eric Engestrom	880c1718b6	omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA} We're trying to be -Wundef clean so that we can turn it on (and eventually make it an error). Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead of #ifdef; I could've changed these, but the point of -Wundef is to catch typos, so we might as well make the change the right way. Fixes: `83d4a5d5ae` "st/omx/tizonia: Add H.264 decoder" Fixes: `b2f2236dc5` "st/omx/tizonia: Add H.264 encoder" Fixes: `c62cf1f165` "st/omx/tizonia/h264d: Add EGLImage support" Cc: Gurkirpal Singh <gurkirpal204@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 11:39:28 +00:00
Eric Engestrom	795b465c50	meson: simplify omx logic and let's make sure `with_gallium_omx` is never 'auto' and can only be one of [bellagio, tizonia, disabled]. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 10:08:10 +00:00
Mathias Fröhlich	862c872c48	vbo: Remove now duplicate _DrawVAO notification. The DriverFlags.NewArray bit is already set to NewDriverState in _mesa_set_draw_vao since we have actually just above changed the VAOs content. So this can be removed. The _vbo_update_inputs is called by the vbo...recalculate_inputs being set through the same mechanism as described above. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	006b5e798a	vbo: Remove now duplicate _vbo_update_inputs from dlist draw. At the current state, _vbo_update_inputs is called from the draw callback if vbo...recalculate_inputs is set. But that is now set if the _DrawVAO or its content or the vertex program mode is changed. So remove _vbo_update_inputs from the direct dlist draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	2887c98140	vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays. Now that setting vbo...recalculate_inputs also sets the DriverFlags.NewArray bits into the NewDriverState, setting that from vbo_bind_arrays is redundant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	9f5b6ef2ef	vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state. This flag is now set when the actual Array._DrawVAO changes. So setting this flag is redundant here. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	bf328359a7	mesa: A change of gl_vertex_processing_mode needs an array update. Since arrays also handle the mapping of current values into the disabled array slots, we need to tell the array update code that this mapping has changed. Also mark only dirty if it has changed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	5b91786225	mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs. Both mean something very similar and are set at the same time now. For that vbo module to be set from core mesa, implement a public vbo module method to set that flag. In the longer term the flag should vanish in favor of a driver flag of the appropriate driver. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	d3c604e12e	mesa: Update VAO internal state when setting the _DrawVAO. Update the VAO internal state on Array._DrawVAO instead of Array.VAO. Also the VAO internal state update gets triggered now by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag. Also no driver looks at any VAO's NewArrays value from within the Driver.UpdateState callback. So it should be safe to move this update into the _mesa_set_draw_vao method. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	c4c56ff303	vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback. Factor out that common call into the almost single place. Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code paths as the function is now called through vbo_bind_arrays. Prepare updating the list of struct gl_vertex_array entries via calling _vbo_update_inputs for being pushed into those drivers that finally work on that long list of gl_vertex_array pointers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	6307d1be0a	mesa: Move vbo draw functions into dd_function_table. Move vbo draw functions into struct dd_function_table. For now just wrap the underlying vbo functions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Aaron Watry	23100acc8f	clover/llvm: Fix build against LLVM/Clang 4.0 The opencl 1.0 langstandard was renamed in 5.0+ v2: Move preprocessor check into compat.hpp Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-21 21:03:23 -05:00
Timothy Arceri	c135316555	ac/nir_to_llvm: add frexp support Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-22 12:42:34 +11:00
Timothy Arceri	cca2141745	nir: add frexp_exp and frexp_sig opcodes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho	12c22b897a	anv/pipeline: don't pass constant view index in multiview If view mask has only one bit set, view index is effectively a constant, so doesn't need to be passed to the next stages, just always set it. Part of this was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho	5e7c1d05d4	anv/pipeline: use less instructions for multiview The view_index is encoded in the remainder of dividing instance id by the number of views in the view mask (n). In the general case (handled by the else clause), there is a need to map from 0..n-1 into the number of the view being masked. For that a map is encoded. In the case only the first n bits in the mask are set, the mapping is trivial, 0..n-1 already represent what view is being referred to. That case was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Eric Anholt	baeb6a4b4a	broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI. Unfortunately TGSI doesn't record the type of the FS output like GLSL does, but VC5's TLB writes depend on the output's base type. Just record the type in the key at variant compile time when we've got a TGSI input and then fix it up. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a GPU hang that breaks most tests that come after it.	2018-03-21 14:02:34 -07:00
Neil Roberts	61603f0e42	spirv: Add a 64-bit implementation of Frexp The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 20:18:44 +01:00
Rafael Antognolli	5297a17571	aubinator_error_decode: Compare only the class_name of the ring. ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to first compare the class name only, then get the instance id. Without this, INSTDONE is not being decoded. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-21 11:35:15 -07:00
Thomas Helland	8d5cd91ca0	nir: Migrate nir_dce to instr worklist Shader-db runtime change average of five runs: Before 125,77 seconds (+/- 0,09%) After 124,48 seconds (+/- 0,07%) Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:40 +01:00
Thomas Helland	edb18564c7	nir: Initial implementation of a nir_instr_worklist Make a simple worklist by basically just wrapping u_vector. This is intended to be used in nir_opt_dce to reduce the number of calls to ralloc, as we are currently spamming ralloc quite badly. It should also give better cache locality and much lower memory usage. Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:27 +01:00
Scott D Phillips	cab8df1e3e	intel/tools: aubinator: Catch gen11 "enhanced execlist" submission Different registers are used for execlist submission in gen11, so also watch those. This code only watches element zero of the submit queue, which is all aubdump currently writes. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-21 11:07:15 -07:00
Marek Olšák	a8d55374dc	radeonsi: fix a snprintf warning on gcc 7.3.0	2018-03-21 13:43:09 -04:00
Marek Olšák	cf0a95afac	radeonsi/gfx9: print the swizzle mode for testdma Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Marek Olšák	f7ffa504a0	ac/surface: compute tile swizzle for GFX9 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Eric Anholt	9f0c9c6d18	broadcom/vc5: Don't skip job submit just because everything is scissored. The coordinate shaders may now have side effects in the form of transform feedback. Part of fixing GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc	2018-03-21 10:04:21 -07:00
Eric Anholt	024e814dee	broadcom/vc5: Handle sparsely populated SO target array. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables	2018-03-21 10:04:21 -07:00
Eric Anholt	f735ac6b1c	broadcom/vc5: Fix 3D miplevel limit to match other texture targets. Fixes segfault in GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on level 13.	2018-03-21 10:04:21 -07:00
Eric Anholt	ba87d85b04	broadcom/vc5: Clamp the instance divisor to 16 bits. Fixes debug assert on GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-21 10:04:21 -07:00
Lionel Landwerlin	3dd92184d5	i965: fix android build This is the equivalent of commit `5770e1d89e` for android. v2: fix xml files path and file given to --header Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `2d2b15fbca` ("i965: fix autotools/android build") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 18:56:47 +02:00
Juan A. Suarez Romero	e5cd376c2f	docs: fix typo in 17.3.6 release notes Title is about 17.3.5, when it must be about 17.3.6. CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-21 16:37:49 +00:00
Caio Marcelo de Oliveira Filho	8571c577aa	nir/dead_cf: also remove useless ifs Generalize the code for removing dead loops to also remove dead if nodes. The conditions are the same in both cases, if the node (and its children) don't have side-effects AND the nodes after it don't use the values produced by the node. The only difference is when evaluating side effects: loops consider only return jumps as a side-effect -- they can stop execution of nodes after it; 'if' nodes outside loops should consider all kinds of jumps (return, break, continue) since all of them can cause execution of nodes after it to be skipped. After this patch, empty ifs (those whose then and else blocks are both empty) will be removed by nir_opt_dead_cf. It caused no change to shader-db, in part because the removal of empty ifs is currently covered by nir_opt_peephole_select. v2: Improve the identification of cases where break/continue can cause side-effects. (Jason) v3: Move code comment changes to a different patch. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 09:36:09 -07:00
Caio Marcelo de Oliveira Filho	470056d37b	nir/dead_cf: rephrase definition of a dead loop node Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 09:35:57 -07:00
Juan A. Suarez Romero	e1f8c23e18	docs: update calendar, add news and link release notes to 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-03-21 16:02:37 +00:00
Juan A. Suarez Romero	543e7c8382	docs: add sha256 checksums for 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `13dd6016d7`)	2018-03-21 15:58:55 +00:00
Juan A. Suarez Romero	09448940ed	docs: add release notes for 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `8a51f3857c`)	2018-03-21 15:58:52 +00:00
Leo Liu	c4de2f0880	radeon/vce: move feedback command inside of destroy function On the CI family, firmware requires the destroy command to be the last command in the IB; moving the feedback command after destroy causes issues on CI cards, so we have to keep the previous logic that moves destroy back to the last command. But as with the original issue fixed previously, on newer families like Vega10 the feedback command has to be included inside the task info command along with the destroy command. Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command") Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-03-21 11:24:35 -04:00
Eric Engestrom	1346a36162	egl: pull update from Khronos and drop local define Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2]. [1] `2b6bb4ee45` [2] https://github.com/KhronosGroup/EGL-Registry/pull/36 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Eric Engestrom	f744c6c1e2	egl: align the formatting of Haiku section of eglplatform.h with Khronos' Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Eric Engestrom	ac698ae4a0	egl: add Ozone section to eglplatform.h This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone section to eglplatform.h" from Khronos [1] added by Brian Anderson [2] a few months ago. [1] `a93f559e9c` [2] https://github.com/KhronosGroup/EGL-Registry/pull/26 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Aaron Watry	c95d953b18	clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version Use get_language_version to calculate default cl standard based on device capabilities and -cl-std specified in build options. v5: move dev_clc_version declaration from an earlier patch v4: Squash the __OPENCL_VERSION__ and CLC language version patches v3: (Jan) Allow device_version up to 2.2 while device_clc_version only goes to 2.0 Use get_cl_version to calculate version instead v2: Split out from the previous patch (Pierre) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> CC: Jan Vesely <jan.vesely@rutgers.edu>	2018-03-21 06:59:46 -05:00
Aaron Watry	29b4090d18	clover/llvm: Add get_[cl\|language]_version, validation and some helpers Used to calculate the default CLC language version based on the --cl-std in build args and the device capabilities. According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by: 1) If you have -cl-std=CL1.1+ use the version specified 2) If not, use the highest 1.x version that the device supports Curiously, there is no valid value for -cl-std=CL1.0 Validates requested cl-std against device_clc_version Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> v7: (Pierre) Split cl/clc versions into separate lists and make more references const. v6: (Pierre) Add more const and fix some whitespace v5: (Aaron) Use a collection of cl versions instead of switch cases Consolidates the string, numeric version, and clc langstandard::kind v4: (Pierre) Split get_language_version addition and use into separate patches Squash patches that add the helpers and validate the language standard v3: Change device_version to device_clc_version v2: (Pierre) Move create_compiler_instance changes to correct patch to prevent temporary build breakage. Convert version_str into unsigned and use it to find language version Add build_error for unknown language version string Whitespace fixes	2018-03-21 06:59:37 -05:00
Juan A. Suarez Romero	14fffefc60	docs: add 17.3.{8,9} in the release calendar Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime. v2: add 17.3.9 in the calendar (Andres Gomez) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 11:57:44 +01:00
Eric Anholt	4d8b476fa9	intel/blorp: Fix compiler warning about num_layers. The compiler doesn't notice that the condition for num_layers to be undefined already defined it above (as our assert checked in a debug build). v2: Move the pair of assignments to one outside of the block. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 14:06:46 -07:00
Samuel Pitoiset	f0211155f1	radv: add support for VK_EXT_depth_range_unrestricted This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:55:41 +01:00
Samuel Pitoiset	4e9b0b39b5	radv: only enable one channel when exporting prim id It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:54:48 +01:00
Lionel Landwerlin	5770e1d89e	i965: fix out of tree autotools build Fixes: `2d2b15fbca` ("i965: fix autotools/android build") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-20 19:48:56 +00:00
Stéphane Marchesin	1117edc60d	virgl: Implement seamless cube maps This was previously ignored. Along with the virglrenderer patch, this fixes ~100 dEQP tests: dEQP-GLES3.functional.texture.filtering.cube.* Signed-off-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-21 05:44:52 +10:00
Emil Velikov	c43715d30b	i965: annotate brw_oa.py's --header and --code as required As of earlier commit, the --header was made a hard requirement when using --code. Hence - annotate both as required and drop a few no longer needed checks. Fixes: `035cc7a12d` ("i965: perf: reduce i965 binary size") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 17:21:49 +00:00
Lionel Landwerlin	d3e5d3955c	i965: pipecontrol: add LRI write immediate flag Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 16:58:30 +00:00
Lionel Landwerlin	7f977d51b3	intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 16:58:30 +00:00
Lionel Landwerlin	2d2b15fbca	i965: fix autotools/android build Autotools/android builds generate the header & code files in 2 steps, but the code generation requires the name of the header file to include it. This change generates both files in one command. Fixes: `035cc7a12d` ("i965: perf: reduce i965 binary size") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-20 16:58:29 +00:00
Daniel Stone	9f3509665d	dri3: Fix typo in version check The have-new-DRI3 codepaths would never actually properly trigger, since there was a typo in configure.ac which broke the version check. This went unnoticed but for an error in config.log if you looked closely enough. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Lukas F. Hartmann <lukas@mntmn.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)") Cc: Dave Airlie <airlied@redhat.com>	2018-03-20 16:38:08 +00:00
Daniel Stone	bc5e59119e	meson: Don't build svga by default on ARM/AArch64 VMware has no (published) support for Arm-architecture guests. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reported-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-20 16:18:37 +00:00
Daniel Stone	d7603cb518	meson: Add default DRI drivers for ARM/AArch64 On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as 'aarch64'), only build swrast for DRI drivers. The only classic drivers which could be used are r200 and NV20 cards, which seems unlikely enough that it shouldn't be the default. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Javier Jardón <jjardon@gnome.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-20 16:18:37 +00:00
Emil Velikov	28780c5028	st/mesa: add compiler/nir/ prefix for nir includes Stay consistent with the rest of the codebase, effectively fixing the autotools build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621 Fixes: `ffa4bbe466` ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker") Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-20 16:11:19 +00:00
Scott D Phillips	d849d36c6c	anv: off-by-one in GetDescriptorSetLayoutSupport Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: `ddc4069122` ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 07:58:10 -07:00
Lionel Landwerlin	035cc7a12d	i965: perf: reduce i965 binary size Performance metric numbers are calculated the following way : - out of the 256 bytes long OA reports, we accumulate the deltas into an array of uint64_t - the equations' generated code reads the accumulated uint64_t deltas and normalizes them for a particular platform Our hardware is such that a number of counters in the OA reports always return the same values (i.e. they're not programmable), and they return the same values even across generations, and as a result a number of equations are identical in different metric sets across different generations. Up to now we've kept the generated code of the equations separated in different files (per generation/GT), and didn't apply any factorization of the common equations. We could have made some improvement by reusing equations within a given metrics file, but we can go even further and reuse across generations (i.e. all files). This change changes the code generation to emit a single file in which we reuse equations emitted code based on the hash of equations' strings. 
Here are the savings in a meson build : Before(.old)/after : $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old 43M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so 47M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old text data bss dec hex filename 13054002 409424 671856 14135282 d7aff2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so 14550386 409552 671856 15631794 ee85b2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old As a side comment here is the size of the drivers if we remove all of the metrics from the build : $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so 40M build/src/mesa/drivers/dri/libmesa_dri_drivers.so v2: Fix an issue with hashing of counter equations (Lionel) Build system rework (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (build system part) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 13:56:07 +00:00
Lionel Landwerlin	e9a9e85948	i965: perf: fix a counter return type on hsw The equation code computes a float (percentage) yet the return type was an uint64_t. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 11:36:13 +00:00
Tapani Pälli	604cac9f73	mesa: fix leaking ParameterValueOffset ==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66 ==15115== at 0x4C2EC15: realloc (vg_replace_malloc.c:785) ==15115== by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212) ==15115== by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252) ==15115== by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384) ==15115== by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409) Fixes: `edded12376` "mesa: rework ParameterList to allow packing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-20 13:25:07 +02:00
Daniel Stone	478fc2d2a1	dri3: Don't fail on version mismatch The previous commit to make DRI3 modifier support optional, breaks with an updated server and old client. Make sure we never set multibuffers_available unless we also support it locally. Make sure we don't call stubs of new-DRI3 functions (or empty branches) which will never succeed. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)")	2018-03-20 08:52:59 +00:00
Timothy Arceri	9a243eccae	radv: don't lower indirects until after opts have run Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 15:01:44 +11:00
Timothy Arceri	dfe2f19855	st/nir: fix atomic lowering for gallium drivers i965 and gallium handle the atomic buffer index differently. It was just by luck that the single piglit test for this was passing. For gallium we use the atomic binding so that we match the handling in st_bind_atomics(). On radeonsi this fixes the CTS test: KHR-GL43.shader_storage_buffer_object.advanced-write-fragment It also fixes tressfx hair rendering in Tomb Raider. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:29:53 +11:00
Timothy Arceri	632d5e97ef	st/radeonsi: enable uniform packing in NIR backend Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:19:35 +11:00
Timothy Arceri	231333a20d	st: add uniform packing support to lower_uniforms_to_ubo() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	9c51a7ea29	gallium: add packed uniform CAP Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	ffa4bbe466	st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker This will only ever be used by gallium drivers so it probably doesn't belong in the nir toolkit. Also we want to pass it some non NIR things in the following patch. To avoid regressions we wrap the lowering calls that have been moved to st_glsl_to_nir with a quick hack so that they are only called for radeonsi, we will replace the hack with a check for uniform packing in a following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	a80cf442d9	st: add st_glsl_type_dword_size() helper This will be used to support uniform packing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	5488166730	st/glsl_to_nir: add support for packed builtin uniforms Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	57ebab64c0	mesa: add _mesa_add_sized_state_reference() helper This will be used for adding packed builtin uniforms. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	2377754329	mesa: add support propagate uniform support for packed uniforms Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	40711a7a60	mesa: allow for uniform packing when adding uniforms to param list Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	a2198d4fdb	mesa: add packing support for setting uniform handles Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	6cfa15b803	mesa: add packing support for setting uniforms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	4a7c5c079b	mesa: create copy uniform to storage helpers These will be used in the following patch to allow copying directly to the param list when packing is enabled. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	edded12376	mesa: rework ParameterList to allow packing Currently everything is padded to 4 components. Making the list more flexible will allow us to do uniform packing. V2 (suggestions from Nicolai): - always pass existing calls to _mesa_add_parameter() true for padd_and_align - fix bindless param value offsets - remove left over wip logic from pad and align code - zero out param value padding - whitespace fix Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	b13b9eb432	mesa: add PackedDriverUniformStorage const Will be used to determine whether to take packing code paths or not. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Eric Anholt	00910e3057	broadcom/vc5: Don't annotate dumps with stale live intervals. As you're debugging register allocation, you may have changed the intervals and not recomputed yet. Just skip the dump in that case.	2018-03-19 16:44:20 -07:00
Eric Anholt	facc3c6f58	broadcom/vc5: Add support for register spilling. Our register spilling support is nice to have since vc4 couldn't at all, but we're still very restricted due to needing to not spill during a TMU operation, or during the last segment of the program (which would be nice to spill a value of, when there's a long-lived value being passed through with little modification from the start to the end). We could do better by emitting unspills for the last-segment values just before the last thrsw, since the last segment is probably not the maximum interference area. Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3 others.	2018-03-19 16:44:06 -07:00
Eric Anholt	271fc58ba1	broadcom/vc5: Remove redundant last_inst lookup. The point was to get the MOV, which the MOV_dest already returned.	2018-03-19 16:42:59 -07:00
Eric Anholt	34dc64f627	broadcom/vc5: On QPU pack error, dump the instruction and return cleanly. This is nice for debugging when you've made a bad instruction.	2018-03-19 16:42:59 -07:00
Eric Anholt	d721348dcd	broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's. This will let me do lowering late in compilation using the same instruction builder as we use in nir_to_vir.	2018-03-19 16:42:59 -07:00
Eric Anholt	c81d681742	broadcom/vc5: Move the umul macro to a header. Anywhere we want to multiply, we probably want this.	2018-03-19 16:42:59 -07:00
Eric Anholt	9e28c18cd1	broadcom/vc5: Correct the arg count of TIDX/EIDX.	2018-03-19 16:42:59 -07:00
Eric Anholt	55bf298333	broadcom/vc5: Re-do live variables after removing thrsws. Otherwise our start/ends ips won't line up with the actual instructions.	2018-03-19 16:42:59 -07:00
Eric Anholt	c3a504f470	broadcom/vc5: Add a QPU helper for instructions using the TLB. This will be used for detecting last thread segment in register spilling.	2018-03-19 16:42:59 -07:00
Eric Anholt	09c4dd1971	broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm(). These helpers will be used in register spilling to determine where to add a last thrsw if needed, and might help refactor QPU scheduling.	2018-03-19 16:42:59 -07:00
Eric Anholt	407f21ef1b	broadcom/vc5: The ldvpm signal is also a case of using the VPM. The QPU scheduling code calling this function already separately checked this signal.	2018-03-19 16:42:59 -07:00
Eric Anholt	4760040c09	broadcom/vc5: Extract v3d_qpu_writes_tmu() helper. This will be reused in register spilling.	2018-03-19 16:42:59 -07:00
Dave Airlie	32791a0502	radv: don't export NULL layer. We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: `d4c74aed7a` (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 21:36:48 +00:00
Marek Olšák	f674b50d0e	mesa: adjust incorrect comment in texture_buffer_range	2018-03-19 16:56:17 -04:00
Ian Romanick	6aeaa7d363	nir: Don't compare b2f or b2i with zero All of the shaders that had loops changed were in Tomb Raider. The one shader that lost SIMD16 is one of those. Skylake total instructions in shared programs: 14391653 -> 14390468 (<.01%) instructions in affected programs: 111891 -> 110706 (-1.06%) helped: 501 HURT: 0 helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1 helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01% 95% mean confidence interval for instructions value: -3.23 -1.50 95% mean confidence interval for instructions %-change: -1.77% -1.45% Instructions are helped. total cycles in shared programs: 532793024 -> 532776598 (<.01%) cycles in affected programs: 987682 -> 971256 (-1.66%) helped: 348 HURT: 41 helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18 helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68% HURT stats (abs) min: 1 max: 422 x̄: 65.39 x̃: 24 HURT stats (rel) min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02% 95% mean confidence interval for cycles value: -64.08 -20.38 95% mean confidence interval for cycles %-change: -2.78% -1.23% Cycles are helped. total loops in shared programs: 4854 -> 4829 (-0.52%) loops in affected programs: 27 -> 2 (-92.59%) helped: 18 HURT: 0 LOST: 1 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 13:52:35 -07:00
Dave Airlie	e8d9b7ab02	radv: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: `99b57daf4a` anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:40 +00:00
Dave Airlie	032014ac01	radv/query: handle multiview timestamp queries. For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:14 +00:00
Dave Airlie	32b4f3c38d	radv/query: handle multiview queries properly. (v3) For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:09 +00:00
Dave Airlie	4034dc5c72	radv/query: split out begin/end query emission This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:05 +00:00
Dave Airlie	d4c74aed7a	radv/multiview: mark layer_input if we have input attachments. This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:26:39 +00:00
Caio Marcelo de Oliveira Filho	f6338c3b85	anv/pipeline: set active_stages early Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow certain passes to be run conditionally based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho	318073ce66	anv/pipeline: fail if TCS/TES compile fail v2: Add Fixes tag. (Lionel) Fixes: `e50d4807a3` ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Jordan Justen	2ed288363f	main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED This change allows the disk shader cache to work with programs loaded with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set, then they try to use the shader cache. Since the program loaded by ProgramBinary is similar to loading the shader from the disk cache, this is probably more appropriate. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	d2b74ca2b5	i965: Allow disk shader cache usage with LINKING_SUCCESS status Currently, we only look in the disk shader cache if we see that the shader program is in the cache during the link step. If the shader cache entry isn't found during the program link, there are still some (fairly unlikely) scenarios where later it might be useful to search the cache for gen binary programs. 1. If the cache evicts the serialized glsl cache, there might still be valid gen program entries in the disk cache. 2. If two applications are running in parallel, then it is possible that one may write out the cached gen program item which the other application can then make use of. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	b5baaee0d6	glsl/serialize: Save shader program metadata sha1 When the shader cache is used, this can be generated. In fact, the shader cache uses this sha1 to lookup the serialized GL shader program. If a GL shader program is restored with ProgramBinary, the shaders are not available, and therefore the correct sha1 cannot be generated. If this is restored, then we can use the shader cache to restore the binary programs to the program that was loaded with ProgramBinary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	9b473f9e3c	glsl: Remove api_enabled tracking for transform feedback We used this to prevent usage of the disk shader cache when transform feedback was enabled via the GL API. This is no longer used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	fc4a7aaa82	i965: Allow disk shader cache usage with transform feedback Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	6d830940f7	glsl/shader_cache: Allow shader cache usage with transform feedback Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Suggested-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jose Fonseca	e10dc12f6f	scons: need to split CC or things might fail We've seen this fail internally. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-03-19 16:41:57 +01:00
Jordan Justen	d07a49fb18	i965: Add INTEL_DEBUG stages support for disk shader cache Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-19 00:07:29 -07:00
Dave Airlie	8f052a3e25	radv: handle exporting view index to fragment shader. (v1.1) The fragment shader was trying to read this, but nothing was exporting it from the vertex shader. This handles it like the prim id export. Fixes: dEQP-VK.multiview.secondary_cmd_buffer.* dEQP-VK.multiview.index.fragment_shader.* v1.1: updated to use 0x1 (Samuel) Fixes: `e3265c10c8` (radv: Implement multiview draws.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-19 01:20:00 +00:00
Axel Davy	dbc24835d7	st/nine: Fix non inversible matrix check There was a missing absolute value when checking if the determinant was big enough. Fixes: https://github.com/iXit/Mesa-3D/issues/292 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:46 +01:00
Axel Davy	f61e9a958b	st/nine: Fixes warning about implicit conversion Makes the conversion explicit. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:42 +01:00
Axel Davy	71eae7940e	st/nine: Fix bad tracking of vs textures for NINESBT_ALL Stateblocks with NINESBT_ALL should track all textures. For better performance they have a faster path which copies all the required. This path was only tracking ps textures. Fixes: https://github.com/iXit/Mesa-3D/issues/303 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:36 +01:00
Axel Davy	76fa1f730b	st/nine: Fix bad tracking of bound vs textures An incorrect formula was used to compute bound_samplers_mask_vs. Since s is always above 8 for vs and the variable is encoded on 8 bits, it was always 0. This resulted in committing the samplers every call when there was at least one texture read in the vs shader. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-18 22:53:32 +01:00
Grazvydas Ignotas	e1b2e5667c	radv: make vk_format_description structures static No need to bother the linker about them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Grazvydas Ignotas	331141e87e	radv: fix stale comment in generated vk_format_table.c It seems to be a leftover from u_format_table.py. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Eric Anholt	7db1c09d12	anv: Silence warning about heap_size. We only get VK_SUCCESS if it was initialized, but apparently my compiler doesn't track that far. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:10:05 -07:00
Eric Anholt	d25640c3a3	i965: Silence compiler warning about promoted_constants. We only have a cfg != NULL if we went through one of the paths that set it, but my compiler doesn't figure that out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6411defdcd` ("intel/cs: Re-run final NIR optimizations for each SIMD size")	2018-03-16 15:09:55 -07:00
Eric Anholt	9f89452ea3	anv: Silence compiler warnings about uninitialized bind_offset. This is a legitimate warning: if anv's blorp_alloc_binding_table() throws an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently continue to use this undefined value. The rest of this code doesn't seem very allocation-error-proof, though, either. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:09:47 -07:00
Matt Turner	f3833f1ca7	intel/compiler: Use gen_get_device_info() in test_eu_validate Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: `f89e735719` ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Matt Turner	54db78b196	intel: Add cfl to gen_device_name_to_pci_device_id() Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Rob Clark	bc5001325b	meson+dri3: allow building against older xcb (v3) Similar to previous patch, make xcb 1.13 optional. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-16 16:18:42 -04:00
Dave Airlie	7aeef2d4ef	dri3: allow building against older xcb (v3) I'm not sure everyone wants to be updating their dri3 in a forced march setting, this allows a nicer approach, esp when you want to build on distros that aren't brand new. I'm sure there are plenty of ways this patch could be cleaner, and I've also not built it against an updated dri3. For meson I've just left it alone, since if you are using meson you probably don't mind xcb updates, and if you are using meson you can fix this better than me. v3: just don't put a version in for dri3/present without modifiers, should allow building with 1.11 as well (feel free to supply meson followups) Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:19:45 -04:00
Marek Olšák	f099c3aef1	r600: consolidate PIPE_BIND_SHARED/SCANOUT handling (Ported from radeonsi commit `f70f6baaa3`) Allows cached BOs to be reused in more cases. Bugzilla: https://bugs.freedesktop.org/105171 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>	2018-03-16 17:31:28 +01:00
Rafael Antognolli	f89e735719	intel/compiler: Check for unsupported register sizes. Make sure we don't emit 64 bit types if the hardware doesn't support them. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-16 09:27:16 -07:00
Jason Ekstrand	315ee5faec	loader: Include include/drm-uapi in the autotools build We're already including it in the meson build. This fixes build issues on systems which have a drm_fourcc.h that doesn't have modifiers. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-16 08:50:07 -07:00
Wu, Zhongmin	5fc21c6044	egl/android: Implement the eglSwapinterval for Android. Implement the eglSwapinterval for Android platform to enable the async mode for some GFX benchmarks such as Daimler C217, CityBench. Results of the dEQP-EGL.*swap_interval tests 'dEQP-EGL.functional.query_config.get_config_attrib.max_swap_interval'.. 'dEQP-EGL.functional.query_config.get_config_attrib.min_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_only.max_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_only.min_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_and_sort.max_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_and_sort.min_swap_interval'.. 'dEQP-EGL.functional.negative_api.swap_interval'.. Test run totals: Passed: 7/7 (100.0%) Failed: 0/7 (0.0%) Not supported: 0/7 (0.0%) Warnings: 0/7 (0.0%) Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> [Emil Velikov: polish inline comment, add dEQP stats, s/dpy/disp/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-16 13:58:56 +00:00
Emil Velikov	3a9fb4f7ad	st/mesa: simplify st_init_limits() via tgsi_processor_to_shader_stage Reuse the tgsi helper and remove a bunch of duplicated code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:49:16 +00:00
Emil Velikov	f7f95310f0	tgsi: move tgsi_processor_to_shader_stage() to a header This way we can utilise it with later patches. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:48:46 +00:00
Emil Velikov	9fa1d822bf	egl/dri2: move wayland header inclusion where applicable Instead of indirectly pulling the wayland headers everywhere, use forward declarations and #include only as needed. Should effectively fix build errors like the following: make[5]: Entering directory '/.../src/gallium/state_trackers/omx/tizonia' CC h264dprc.lo In file included from h264dprc.c:45:0: .../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error: wayland/wayland-egl/wayland-egl-backend.h: No such file or directory #include "wayland/wayland-egl/wayland-egl-backend.h" Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Andy Furniss <adf.lists@gmail.com>	2018-03-16 13:47:59 +00:00
Emil Velikov	d091c9c4cf	vulkan/wsi/x11: correct DRI3 version in comment During development the version was bumped, yet the comment did not get an update. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-16 13:47:52 +00:00
Emil Velikov	19ec817756	vulkan/wsi/x11: use ARRAY_SIZE where applicable Use the handy macro instead of hard coded numbers. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-16 13:45:47 +00:00
Juan A. Suarez Romero	705a6446b4	mesa: RGB9_E5 invalid for CopyTexSubImage* in GLES According to OpenGL ES 3.2, section 8.6, CopyTexSubImage* should return an INVALID_OPERATION if the internalformat of the texture is RGB9_E5. This fixes dEQP-GLES31.functional.debug.negative_coverage.*.copytexsubimage2d_texture_internalformat. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-16 12:49:16 +00:00
Christian Gmeiner	5e51f72374	etnaviv: remove superfluous \n from DBG(..) callers The DBG(..) macro appends a \n already so there is no need to do it twice. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-03-16 11:41:27 +01:00
Samuel Pitoiset	e96a1d27dc	radv: run nir_opt_move_load_ubo Polaris10: SGPRS: 108560 -> 107856 (-0.65 %) VGPRS: 74576 -> 74520 (-0.08 %) Spilled SGPRs: 7375 -> 7113 (-3.55 %) Code Size: 4273464 -> 4274364 (0.02 %) bytes Max Waves: 9434 -> 9446 (0.13 %) Vega10: Totals from affected shaders: SGPRS: 108264 -> 107576 (-0.64 %) VGPRS: 69068 -> 69000 (-0.10 %) Spilled SGPRs: 7221 -> 6959 (-3.63 %) Code Size: 3800796 -> 3801496 (0.02 %) bytes Max Waves: 10687 -> 10709 (0.21 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:58:19 +01:00
Samuel Pitoiset	af355aaa07	nir: add nir_opt_move_load_ubo() optimization pass This pass moves load UBO operations just before their first use, loosely based on nir_opt_move_comparisons. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:50:31 +01:00
Dave Airlie	9d0d806332	radv: drop geometry stride user sgpr. This removes the other geometry specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:21 +00:00
Dave Airlie	6f051549c3	radv: get rid of geometry user sgpr for num entries. This drops one of the geometry specific user sgprs, we can work this out at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:17 +00:00
Dave Airlie	9188bd78d7	radv: migrate lds size calculations to shader gen. This moves the lds_size calcs into the shader so we have all the size stuff in one file. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:12 +00:00
Dave Airlie	384aced65e	radv: drop scanning the tess shader in the nir code. This drops the now unneeded scanning and results in favour of the ones in the info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:08 +00:00
Dave Airlie	f50d520acf	radv: use num_patches output from tcs shader. Instead of recalculating the value, use the shader calculated value. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:05 +00:00
Dave Airlie	bf9a0ea853	radv/tess: remove last chunk of tess sgprs This removes the last TES-specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:01 +00:00
Dave Airlie	6db44d6a8c	radv: pass num_patches to tes from tcs TES needs num_patches to do some of the calculations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:58 +00:00
Dave Airlie	010d055aae	radv: drop tess offchip layout for tcs. This removes the last TCS specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:54 +00:00
Dave Airlie	ee31cff856	radv: drop tcs_out_offsets Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:47 +00:00
Dave Airlie	b0460bbf1c	radv: drop tcs_out_layout Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:43 +00:00
Dave Airlie	6adf99165c	radv/tess: drop tcs_in_layout setting completely. Inline all calcs at shader creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:37 +00:00
Dave Airlie	f343d11ae7	radv: drop ls_out_layout const. We can precalculate input_vertex_size at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:32 +00:00
Dave Airlie	d89b16b7b9	radv/shader_info: start gathering tess output info (v2) This gathers the ls outputs written by the vertex shader, and the tcs outputs, these are needed to calculate certain tcs parameters. These have to be separate for combined gfx9 shaders. This is a bit pessimistic compared to the nir pass, as we don't work out the individual slots for tcs outputs, but I actually think it should be fine to just mark the whole thing used here. v2: move to radv, handle clip dist (Samuel), handle compacts and patches properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:23 +00:00
Dave Airlie	2012dae19a	radv: migrate unique index info shader info (v2) This just moves this function to an inline so the shader_info pass can use it. v2: use inline (Samuel) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:19 +00:00
Samuel Pitoiset	f02f1ad13f	Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()" This reverts commit `f314a532fd`. This appears to introduce some blinking textures in UT2004. Not sure exactly what's the root cause because we don't have much information about the issue. Anyway, this was just a micro optimization that actually breaks, at least, one app almost one year later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105436 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-15 21:32:52 +01:00
Lionel Landwerlin	51783f3e7d	anv: silence unused variable warning Fixes: `59b0ea0c74` ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:26 +00:00
Lionel Landwerlin	b5b56f91f5	i965: silence unused function warning [123/227] Compiling C object 'src/mesa/drivers/dri/i965/libi965_gen110@sta/genX_blorp_exec.c.o'. ../src/mesa/drivers/dri/i965/genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:23 +00:00
Lionel Landwerlin	0f544a3c51	anv: silence unused function warning on gen11 [84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'. ../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:55:42 +00:00
Dylan Baker	2a7027f79a	meson: fix pipe-loaders after omx changes with_gallium_omx used to be a boolean, but now it's a string. That means it needs to be compared to 'disabled' instead of false. CC: Rob Clark <robdclark@gmail.com> Fixes: `34e852d5b5` ("meson: Re-add auto option for omx") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-15 10:02:32 -07:00
Dylan Baker	9bd7a6f6f0	meson: require amdgpu >= 2.4.91 the meson equivalent of `f8773edb0a` Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-15 10:00:02 -07:00
Marek Olšák	f8773edb0a	configure.ac: require libdrm_amdgpu 2.4.91 Since 2.4.90 is problematic, just ask for the next version. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-15 12:44:40 -04:00
Marek Olšák	5d0acff39e	configure.ac: blacklist libdrm 2.4.90 Cc: 18.0 17.3 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-15 12:44:37 -04:00
Samuel Pitoiset	16ecf037f9	radv: dump LLVM IR when a hang is detected Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:07 +01:00
Samuel Pitoiset	81818662a5	radv: record LLVM IR when debugging shaders If AMD_shader_info or RADV_TRACE_FILE is used we might need to keep track of the LLVM IR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:03 +01:00
Samuel Pitoiset	d07edf5fdf	radv: add dump_shader to the NIR compiler options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:00 +01:00
Samuel Pitoiset	50fcca328c	radv: pass the NIR compiler options to ac_compile_llvm_module() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:58 +01:00
Samuel Pitoiset	14c27c2511	radv: print some information when RADV_TRACE_FILE is set Just to be sure all options are enabled when trying to generate a hang report. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:54 +01:00
Samuel Pitoiset	5be2757c35	radv: only display options that are enabled Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:52 +01:00
Eric Engestrom	6332893594	mailmap: Use Eric Engestrom's personal email address Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-15 12:03:41 +00:00
Alejandro Piñeiro	50767214a7	spirv/radv: add AMD_gcn_shader capability, remove current extensions So now, during spirv_to_nir, it uses the capability instead of the extension. Note that what we are really doing here is treating SPV_AMD_gcn_shader like other supported extensions. SPV_AMD_gcn_shader is not the first SPV extension supported. For example, the capability draw_parameters infers if the extension SPV_KHR_shader_draw_parameters is supported or not. This could be seen as counter-intuitive, and it might seem easier to define which extensions are supported and base our checks on that, but we need to take into account that some capabilities are optional from core, and others came from new extensions. Also this commit would make the implementation of ARB_spirv_extensions easier. v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann) Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez	adf58e59d3	spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode() We no longer need the source and destination data types, just their bit sizes. v2: - Use glsl_get_bit_size () instead (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez	ce2fd87056	spirv: fix the translation of SPIR-V conversion opcodes to NIR Some SPIR-V opcodes (like UConvert and SConvert) have expectations of the output that don't depend on the operands' data type. Generalize the solution for all of them. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-15 08:51:01 +01:00
Mathias Fröhlich	98f35ad63c	vbo: Correctly handle source arrays in vbo_split_copy. The original approach did optimize away a bit too many fields. Reestablish the pointer into the original array and correctly feed that one. Reviewed-by: Brian Paul <brianp@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105471 Fixes: `64d2a20480` mesa: Make gl_vertex_array contain pointers to first order VAO members. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-15 06:11:57 +01:00
Apple SWE	361f79c97f	sched.h needs to be imported on Darwin/OSX targets. sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-14 22:08:34 -07:00
Apple SWE	67f27b1e18	Add processor topology calculation implementation for Darwin/OSX targets. The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-14 22:08:34 -07:00
Dave Airlie	4b15b5e803	virgl: resize resource bo allocation if we need to. This fixes an illegal command buffer on the host seen with piglit arb_internalformat_query2-max-dimensions Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-15 12:26:39 +10:00
Mario Kleiner	c1e47a3c1f	nv50,nvc0: Support BGRX1010102 and RGBX1010102 for sampling. Add them as usable for textures, so they can be used by Wayland drm in 10 bpc mode and for X11 compositing under GLX and EGL. We need these formats to be supported at least for sampling, otherwise GLX_texture_from_pixmap and the equivalent EGL image extension won't work with X11 drawables of depth 30 and just display an all black window. Do not expose these formats as renderable, and thereby not as a fbconfig/EGLConfig/Visual, as NVidia hw does not support 10 bpc unorm formats without alpha channel. Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing, and under Wayland+Weston drm backend with a Tesla and Pascal gpu. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-03-14 21:41:27 -04:00
Thomas Helland	03e37ec6d7	util: Use set_foreach instead of rolling our own This follows the same pattern as in the hash_table. Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>	2018-03-14 20:03:57 +01:00
Thomas Helland	5f129c05e6	glsl: Use hash table cloning in copy propagation Walking the whole hash table, inserting entries by hashing them first is just a really bad idea. We can simply memcpy the whole thing. V2: Remove leftover creation of acp in two places Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 19:52:02 +01:00
Thomas Helland	6baaf4291b	util: Implement a hash table cloning function V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 19:52:01 +01:00
Guillaume Charifi	388ed47081	st/mesa: Factorize duplicate code in st_BlitFramebuffer() Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-14 14:46:51 -04:00
Dylan Baker	7dd261ac50	autotools: add -I/src/egl to tizonia This fixes the following build breakage: make[5]: Entering directory '/mnt/sdc1/Gits/mesa/src/gallium/state_trackers/omx/tizonia' CC h264dprc.lo In file included from h264dprc.c:45:0: ../../../../../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error: wayland/wayland-egl/wayland-egl-backend.h: No such file or directory #include "wayland/wayland-egl/wayland-egl-backend.h" ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. meson got the same fix in `7598dedfde`. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 11:23:19 -07:00
Dylan Baker	848f2b6e31	Revert "Add processor topology calculation implementation for Darwin/OSX targets." This reverts commit `de0d10db93`. This breaks the build on at least Linux, probably other non-apple platforms. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-14 09:30:17 -07:00
Dylan Baker	0f30c80932	Revert "sched.h needs to be imported on Darwin/OSX targets." This reverts commit `9dc5063262`. This breaks the build on at least Linux, probably other non-apple platforms. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-14 09:28:58 -07:00
Karol Herbst	b617bfcccf	compiler: int8/uint8 support OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-03-14 10:08:42 -04:00
Alex Smith	fcf267ba08	radv: Fix CmdCopyImage between uncompressed and compressed images From the spec: "When copying between compressed and uncompressed formats the extent members represent the texel dimensions of the source image and not the destination." However, as per `7b890a36`, we must still use the destination image type when clamping the extent so that we copy the correct number of layers for 2D to 3D copies. Fixes: `7b890a36` "radv: Fix vkCmdCopyImage for 2d slices into 3d Images" Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-14 09:59:21 +00:00
Samuel Pitoiset	38f34117dd	radv: fix vkGetDeviceQueue2() when create flags don't match This fixes CTS: dEQP-VK.api.device_init.create_device_queue2_unmatched_flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 09:53:42 +01:00
Neil Roberts	25a966a23d	spirv: Handle doubles when multiplying a mat by a scalar The code to handle mat multiplication by a scalar tries to pick either imul or fmul depending on whether the matrix is float or integer. However it was doing this by checking whether the base type is float. This was making it choose the int path for doubles (and presumably float16s). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:43:33 +01:00
Iago Toral Quiroga	1a0aba7216	anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands `af5f2322d0` addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior: device pname return ---------------------------------------------------------- (..) device core device-level command fp (...) See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323 v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes. v3: - Remove the not in the condition and switch the then/else cases (Jason) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Iago Toral Quiroga	a631575ff4	anv/entrypoints: dispatches to VkQueue are device-level v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Dave Airlie	3b0f2081b5	radv: drop assert on bindingDescriptorCount > 0 The spec is pretty clear that this can be 0, and that it operates as a reserved binding. Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 16:54:52 +10:00
Apple SWE	9dc5063262	sched.h needs to be imported on Darwin/OSX targets. sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2018-03-13 22:50:56 -07:00
Apple SWE	de0d10db93	Add processor topology calculation implementation for Darwin/OSX targets. The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2018-03-13 22:50:27 -07:00
Roland Scheidegger	274f8bf05e	r600: fix abs for op3 sources If a src was referencing the same temp as the dst, the per-component copy code didn't work. e.g. cndge r0.xy, r0.xx, \|r2\|, r3 got expanded into mov r12.x, \|r2\| cndge r0.x, r0.x, r12, r3 mov r12.y, \|r2\| cndge r0.y, r0.x, r12, r3 hence for the second cndge r0.x was mistakenly the previous cndge result. Fix this by doing all the movs first, so there's no bogus alu.last in between. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905 Tested-by: <iive@yahoo.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 04:54:45 +01:00
Dave Airlie	27a5e5366e	radv: mark all tess output for an indirect access. If a shader does a tcs store with an indirect access, we were only marking the first spot as used. For indirect access we always now mark all slots used by the variable. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	4f0c89d66c	ac/nir: pass the nir variable through tcs loading. I was going to have to add another parameter to this monster, so we should just pass the nir_variable in, I can't find any reason this would be a bad idea. This is needed for the next fix. Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	f9de2d409b	radv: get correct offset into LDS for indexed vars. This seems more correct to me, since if we have an array of floats they'll be vec4 aligned, and if we do af[2], we want the const index to increase by 2 slots in the non compact case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Rob Clark	4e4428482e	nir: lower_load_const_to_scalar fix for 8/16b types Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-13 20:17:04 -04:00
Dylan Baker	2aad12b2af	Update the documentation for meson Meson is pretty well tested and works in most configurations now, so we can remove the warning about it being unsuited for actual use. It's also worth documenting that meson 0.42.0 or greater is required. v2: - Minor rewording of supported platforms as suggested by Emil - Add two missing tags as reported by xmllint --html Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2018-03-13 14:54:47 -07:00
Jason Ekstrand	85000b812d	ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 13:25:27 -07:00
Jason Ekstrand	3d1d7e8561	nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot This is based heavily on `97f10934ed`, "ac/nir: Add vote_ieq/vote_feq lowering pass." from Bas Nieuwenhuizen. This version is a bit more general since it's in common code. It also properly handles NaN due to not flipping the comparison for floats. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 13:25:15 -07:00
Dylan Baker	8247a30838	meson: don't use compiler.has_header Meson's compiler.has_header is completely useless, it only checks that a header exists, not whether it's usable. This creates problems if a header contains a conditional #error declaration, like so: > #if __x86_64__ > # error "Doesn't work with x86_64!" > #endif Compiler.has_header will return true in this case, even when compiling for x86_64. This is useless. Instead, we'll do a compile check so that any #error declarations will be treated as errors, and compilation will work. Fixes compilation on x32 architecture. Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746 meson bug: https://github.com/mesonbuild/meson/issues/2246 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-13 11:41:10 -07:00
Jason Ekstrand	8379bff6c4	i965: Emit texture cache invalidates around blorp_copy This is a terrible hack but it fixes CTS regressions. It's still incredibly unclear exactly what is going wrong in the hardware to cause this to be an issue so this isn't a good fix by any means. However, it does fix tests so there is that. Fixes: `fb0e9b5197` "i965: Track the depth and render caches separately" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-13 11:24:40 -07:00
Eric Anholt	a326eedc75	broadcom/vc4: Fix simulator since the perfmon change. It would be nice to support perfmon with simulator, and might be a useful tool for regression testing performance (since the simulator would be deterministic).	2018-03-13 10:32:58 -07:00
Eric Anholt	191bc7ce61	spirv: Silence compiler warning about undefined srcs[0] v2: Use assume() at the srcs[] definition instead. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-13 10:32:55 -07:00
Samuel Pitoiset	7c83430672	ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:28 +01:00
Samuel Pitoiset	b128fd773f	ac/nir: remove some unnecessary includes and declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:27 +01:00
Samuel Pitoiset	cd4e823341	ac/nir: drop radv prefix from radv_lower_gather4_integer() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:25 +01:00
Samuel Pitoiset	fbe694562b	ac/nir: move ac_nir_compiler_options and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:23 +01:00
Samuel Pitoiset	237229430f	ac: move ac_shader_info to radv folder This is RADV specific code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:21 +01:00
Samuel Pitoiset	2cfba40eea	ac/nir: move ac_shader_variant_info and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:16 +01:00
Samuel Pitoiset	b2653007b9	ac/nir: move all RADV related code to radv_nir_to_llvm.c Now the "ac/nir" prefix will really be the shared code between RadeonSI and RADV, that might avoid confusions in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	8e15824b9d	ac/nir: make emit_barrier() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	4e3117b718	ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3a30b89353	ac/nir: make handle_shader_output_decl() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3fe47b1290	ac/nir: change prototype of handle_shader_output_decl() This allows to remove the ac_nir_context dependency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	61a91ca3f5	ac/nir: move unpack_param() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	28bb6873ec	ac/nir: move trim_vector to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	895632baef	ac/nir: move cast_ptr() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	bf6368297b	ac/nir: move ac_build_alloca() to ac_llvm_build.c As well as si_build_alloca_undef() and drop the si prefix. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Timothy Arceri	370e356eba	gallium: silence __builtin_frame_address nonzero argument is unsafe warning Calling __builtin_frame_address with a nonzero argument is unsafe but is sometimes done for debugging purposes. Since this code is part of some debug util code I'm assuming that is the case here and using GCC pragma to silence the warning. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-13 09:38:10 +11:00
Dylan Baker	b7c6870f87	meson: Add moduledir to d3d.pc This is required to build wine with the nine patchset Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Reported-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-12 13:52:38 -07:00
Mathias Fröhlich	a2f08dd574	gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-12 18:24:31 +01:00
Ian Romanick	def0030e64	mesa: Don't write to user buffer in glGetTexParameterIuiv on error With some sets of optimization flags, GCC will generate warnings like this: src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized] params[3] = ip[3]; ~~^~~ src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here GLint ip[4]; ^~ ip is not initialized in cases where a GL error is generated. In these cases, we should not write to the user's buffer, so this is actually a bug. I wrote a new piglit test gl-3.0-texparameteri to show this bug. I suspect that Coverity also detected this, but the scan site is currently down. Fixes: `c2c507786` "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-12 10:13:30 -07:00
Roman Gilg	f94597f554	gallium: work around libtool relink issue for libdrm This is similar to commit `90633079`. libtool links first to system directories instead of custom locations of libdrm on relinking. Since a more recent libdrm version than the one provided by the system is often needed when compiling mesa, make sure this works by putting libdrm in front. See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259 Signed-off-by: Roman Gilg <subdiff@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-12 14:49:07 +00:00
Emil Velikov	678ba53240	vulkan: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	8151f5cad9	wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	1178e0cf49	egl: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	08189731a4	docs: document removal of GLX_SGIX_swap_{barrier,group} stubs Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	5ef608fab7	glx: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	2c765b0d9a	gallium/x11: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	afab516f5f	x11: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	742b8e3301	glx: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	447731348e	gallium/x11: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:51 +00:00
Emil Velikov	1d2d519d78	x11: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:51 +00:00
Emil Velikov	f197f02e50	configure: remove unused AM_CONDITIONAL Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-12 14:48:51 +00:00
Bas Nieuwenhuizen	997306c031	radv: Increase the number of dynamic uniform buffers. The vulkan API is not ideal as it does not allow us to have a shared limit. Feral needs 15+6 for one of their games, and I'm not a fan of overcommitting the limits, so increase the number of dynamic uniform buffers to 16. CC: <mesa-stable@lists.freedesktop.org> CC: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-12 09:46:22 +01:00
Dave Airlie	e76cf1ff12	u_vbuf/translate: pass max_index into the set_buffer. This fixes a memory trashing crash (not the test) seen with dEQP-GLES3.stress.draw.unaligned_data.random.203 on virgl. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-12 11:57:13 +10:00
Dave Airlie	5d4fbc2b54	r600: implement callstack workaround for evergreen. This is ported from the sb backend, there are some issues with evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE instructions. Whenever we are going to use a push before, we check the stack usage and if we have to use the workaround, then we switch to a separate push. I noticed this problem dealing with some of the soft fp64 shaders, in nosb mode, they are quite stack happy. This fixes all the glitches and inconsistencies I've seen with them Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Elie Tournier <elie.tournier@collabora.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-12 11:11:44 +10:00
Marek Olšák	163a29099a	gallium/util: add helper util_wait_for_idle This is an old patch that I had.	2018-03-11 13:14:27 -04:00
Roland Scheidegger	0f0a6fa21d	u_blit: (trivial) u_blit.h needs to include p_defines.h (For the pipe_tex_filter enum) Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-10 20:09:04 +01:00
Christian Gmeiner	c9b153fea7	travis: bump libxcb version to 1.13 Fixes following dependency problem: Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13' Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2")	2018-03-10 16:55:36 +01:00
Mathias Fröhlich	64d2a20480	mesa: Make gl_vertex_array contain pointers to first order VAO members. Instead of keeping a copy of the vertex array content in struct gl_vertex_array only keep pointers to the first order information originally in the VAO. For that represent the current values by struct gl_array_attributes and struct gl_vertex_buffer_binding. v2: Change comments. Remove gl... prefix from variables except in the i965 directory where it was like that before. Reindent because of that. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-10 07:33:51 +01:00
Roland Scheidegger	d62f0df354	draw: fix alpha value for very short aa lines The logic would not work correctly for line lengths smaller than 1.0, even a degenerated line with length 0 would still produce a fragment with anywhere between alpha 0.0 and 0.5. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-10 02:11:50 +01:00
Jordan Justen	24b415270f	intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-09 16:15:58 -08:00
Jordan Justen	06e3bd02c0	i965: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>	2018-03-09 16:15:34 -08:00
Marek Olšák	db495b8962	st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER Tested by our OpenCL team. Fixes: `9c499e6759` "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs" Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-09 16:33:31 -05:00
Marek Olšák	2bdb54bce7	radeonsi: add a workaround for GFX9 hang with init_config alignment Fixes: `75c5d25f0f` "radeonsi: align command buffer starting address to fix some Raven hangs" Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-03-09 16:28:29 -05:00
Marek Olšák	e99212e970	ac/gpu_info: print ib_start_alignment, add assertion	2018-03-09 16:28:29 -05:00
Greg V	e30a165be2	meson: Use system_has_kms_drm in default driver selection Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-09 10:02:44 -08:00
Eric Anholt	c57d5ea3bb	broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled. Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to 36.	2018-03-09 09:59:54 -08:00
Eric Anholt	cf170616da	gallium: Add a util_blitter path for using a custom VS and FS. Like the r600 paths to use other custom states, we pass in a couple of parameters to customize the innards of the blitter. It's up to the caller to wrap other state necessary for its shaders (for example, constant buffers for the uniforms the shader uses). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-09 09:59:54 -08:00
Eric Anholt	46a32e3d2e	broadcom/vc4: Allow binding non-zero constant buffers. We're going to use UBO loads for implementing YUV linear-to-T-format blits.	2018-03-09 09:59:54 -08:00
Eric Anholt	2725ab2b12	broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID. The imported drm_fourcc.h handles it now.	2018-03-09 09:59:54 -08:00
Eric Anholt	a3a4c23dec	broadcom: Suppress compiler warnings about enum pipe_tex_filter.	2018-03-09 09:59:54 -08:00
Louis-Francis Ratté-Boulianne	3160cb86aa	egl/x11: Re-allocate buffers if format is suboptimal If PresentCompleteNotify event says the pixmap was presented with mode PresentCompleteModeSuboptimalCopy, it means the pixmap could possibly have been flipped instead if allocated with a different format/modifier. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne	069fdd5f9f	egl/x11: Support DRI3 v1.1 Add support for DRI3 v1.1, which allows pixmaps to be backed by multi-planar buffers, or those with format modifiers. This is both for allocating render buffers, as well as EGLImage imports from a native pixmap (EGL_NATIVE_PIXMAP_KHR). Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne	61309c2a72	vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11 When it is detected that a window could have been flipped but has been copied because of suboptimal format/modifier. The Vulkan client should then re-create the swapchain. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:13 +00:00
Daniel Stone	c80c08e226	vulkan/wsi/x11: Add support for DRI3 v1.2 Adds support for multiple planes and buffer modifiers. v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers" v12: Multi-planar/modifier support is now DRI3 v1.2; also update release versions	2018-03-09 17:47:13 +00:00
Dylan Baker	7258be91c5	autotools: include all meson.build files Otherwise SWR cannot be built with meson from an autotools generated tarball, such as the 18.0.0-rc4 tarball. Fixes: `16bf813830` ("meson/swr: re-shuffle generated files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-09 08:15:04 -08:00
Michel Dänzer	2a4596a2f0	st/mesa: gl_program::info.system_values_read is a 64-bit-field We were dropping the upper 32 bits, which caused assertion failures in some compute shader piglit tests with radeonsi since the commit below. Fixes: `752e969703` ("compiler: Add two new system values for subgroups") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-09 16:52:11 +01:00
George Kyriazis	379e00dc27	swr/rast: Refactor memory gather operations Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:42 -06:00
George Kyriazis	3f7ce10b3e	swr/rast: Add KNOB_DISABLE_SPLIT_DRAW This is useful for archrast data collection. This greatly speeds up the post processing script since there is significantly less events generated. Finally, this is a simpler option to communicate to users than having them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:30 -06:00
George Kyriazis	e0a4a25829	swr/rast: Add VPOPCNT Supports popcnt on vector masks (e.g. <8 x i1>) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:23 -06:00
George Kyriazis	b56afe1a4f	swr/rast: Add tracking for stream out topology Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:14 -06:00
George Kyriazis	2f6ae8cfcd	swr/rast: Add split draw and other state information to DrawInfoEvent. Removed specific split draw events. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:07 -06:00
George Kyriazis	714093203e	swr/rast: Refactor api and worker event handlers. In the API event handler we want to share information between the core layer and the API. Specifically, around associating various ids with different kinds of events. For example, associate render pass id with draw ids, or command buffer ids with draw ids. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:59 -06:00
George Kyriazis	cfdd35beaf	swr/rast: Add support for generalized late and early z/stencil stats Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:52 -06:00
George Kyriazis	9e25f298eb	swr/rast: Rasterized Subspans stats support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:47 -06:00
George Kyriazis	d78b28fc33	swr/rast: Added comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:34:55 -06:00
Eric Engestrom	e903a7b0bb	vulkan/wsi: clean up cleanup path Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-09 13:25:44 +00:00
Bas Nieuwenhuizen	a793e7899f	radv: Fix the autotools build take 2. Forgot to remove a word.... Fixes: `04ffabf17a` "radv: Fix autotools build."	2018-03-09 14:10:24 +01:00
Lucas Stach	1f55d06783	etnaviv: allow mixing different bit depths for color and depth surfaces Vivante hardware supports this just fine. There is no reason why this shouldn't be advertised as a valid combination. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-03-09 12:06:07 +01:00
Thierry Reding	6d4d46bca9	autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS This allows the driver to be built on a make distcheck and makes sure that it properly builds when a distribution tarball is made. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:22 +01:00
Thierry Reding	1755f608f5	tegra: Initial support Tegra K1 and later use a GPU that can be driven by the Nouveau driver. But the GPU is a pure render node and has no display engine, hence the scanout needs to happen on the Tegra display hardware. The GPU and the display engine each have a separate DRM device node exposed by the kernel. To make the setup appear as a single device, this driver instantiates a Nouveau screen with each instance of a Tegra screen and forwards GPU requests to the Nouveau screen. For purposes of scanout it will import buffers created on the GPU into the display driver. Handles that userspace requests are those of the display driver so that they can be used to create framebuffers. This has been tested with some GBM test programs, as well as kmscube and weston. All of those run without modifications, but I'm sure there is a lot that can be improved. Some fixes contributed by Hector Martin <marcan@marcan.st>. Changes in v2: - duplicate file descriptor in winsys to avoid potential issues - require nouveau when building the tegra driver - check for nouveau driver name on render node - remove unneeded dependency on libdrm_tegra - remove zombie references to libudev - add missing headers to C_SOURCES variable - drop unneeded tegra/ prefix for includes - open device files with O_CLOEXEC - update copyrights Changes in v3: - properly unwrap resources in ->resource_copy_region() - support vertex buffers passed by user pointer - allocate custom stream and const uploader - silence error message on pre-Tegra124 - support X without explicit PRIME Changes in v4: - ship Meson build files in distribution tarball - drop duplicate driver_tegra dependency Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:22 +01:00
Thierry Reding	2052dbdae3	nouveau: Add framebuffer modifier support This adds support for framebuffer modifiers to Nouveau. This will be used by the Tegra driver to share metadata about the format of buffers (such as the tiling mode or compression). Changes in v2: - remove unused parameters to nouveau_buffer_create() - move format modifier query code to nvc0 backend - restrict format modifiers to 2D textures - implement ->query_dmabuf_modifiers() Changes in v4: - add UAPI include path on meson builds Changes in v5: - remove unnecessary includes Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:08 +01:00
Thierry Reding	b964cab80a	nouveau/nvc0: Extract common tile mode macro Add a new macro that can be used to extract the tiling mode from a tile_mode value. This will be used to determine the number of GOBs used in block linear mode. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:47:54 +01:00
Thierry Reding	75bf489628	drm/tegra: Sanitize format modifiers The existing format modifier definitions were merged prematurely, and recent work has unveiled that the definitions are suboptimal in several ways: - The format specifiers, except for one, are not Tegra specific, but the names don't reflect that. - The number space is split into two, reserving 32 bits for some "parameter" which most of the modifiers are not going to have. - Symbolic names for the modifiers are not using the standard DRM_FORMAT_MOD_* prefix, which makes them awkward to use. - The vendor prefix NV is somewhat ambiguous. Fortunately, nobody's started using these modifiers, so we can still fix the above issues. Do so by using the standard prefix. Also, remove TEGRA from the name of those modifiers that exist on NVIDIA GPUs as well. In case of the block linear modifiers, make the "parameter" smaller (4 bits, though only 6 values are valid) and don't let that leak into any of the other modifiers. Finally, also use the more canonical NVIDIA instead of the ambiguous NV prefix. This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from Linux v4.16-rc1. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:44:35 +01:00
Thierry Reding	ffc85cfac0	drm/fourcc: Fix fourcc_mod_code() definition Avoid a compiler warning when the val parameter is an expression. This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from Linux v4.16-rc1. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:44:35 +01:00
Bas Nieuwenhuizen	04ffabf17a	radv: Fix autotools build. Forgot it again .... Fixes: `b6347807a9` "radv: Generate icd files." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-09 09:36:19 +01:00
Samuel Pitoiset	365850fd68	ac/nir: set number of channels for packed mrt exports Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables VSRC1 (B in low bits, A high). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen	68201ab2da	radv: Update version to 1.1.70. Turns out they did not reset the patch number on release. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen	b6347807a9	radv: Generate icd files. If the api version is too low, the loader clamps the application requested version to the advertized version, which messes with which extensions are enabled. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Ian Romanick	6878c9aabc	nir: Don't i2b a value that is already Boolean A bunch of shaders have sequences like: i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0)))) Other optimizations (and NIR's typeless nature) reduce this to i2b(x == y) which is silly. Skylake total instructions in shared programs: 14498698 -> 14497948 (<.01%) instructions in affected programs: 74480 -> 73730 (-1.01%) helped: 277 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68% 95% mean confidence interval for instructions value: -3.35 -2.06 95% mean confidence interval for instructions %-change: -1.74% -1.16% Instructions are helped. total cycles in shared programs: 532015500 -> 531999238 (<.01%) cycles in affected programs: 5943878 -> 5927616 (-0.27%) helped: 251 HURT: 74 helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14 helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53% HURT stats (abs) min: 1 max: 4550 x̄: 214.04 x̃: 15 HURT stats (rel) min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33% 95% mean confidence interval for cycles value: -158.51 58.43 95% mean confidence interval for cycles %-change: -1.07% -0.04% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4753 -> 4735 (-0.38%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. Haswell and Broadwell had similar results. 
(Broadwell shown) total instructions in shared programs: 14791877 -> 14791127 (<.01%) instructions in affected programs: 77326 -> 76576 (-0.97%) helped: 278 HURT: 1 helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49% 95% mean confidence interval for instructions value: -3.33 -2.05 95% mean confidence interval for instructions %-change: -1.70% -1.13% Instructions are helped. total cycles in shared programs: 558250067 -> 558252872 (<.01%) cycles in affected programs: 5806328 -> 5809133 (0.05%) helped: 235 HURT: 83 helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16 helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51% HURT stats (abs) min: 1 max: 10590 x̄: 265.19 x̃: 20 HURT stats (rel) min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54% 95% mean confidence interval for cycles value: -89.87 107.51 95% mean confidence interval for cycles %-change: -1.06% -0.32% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4735 -> 4717 (-0.38%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. 
total fills in shared programs: 83111 -> 83110 (<.01%) fills in affected programs: 28 -> 27 (-3.57%) helped: 1 HURT: 0 Ivy Bridge total instructions in shared programs: 11774173 -> 11773436 (<.01%) instructions in affected programs: 70819 -> 70082 (-1.04%) helped: 267 HURT: 0 helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2 helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63% 95% mean confidence interval for instructions value: -3.51 -2.01 95% mean confidence interval for instructions %-change: -1.94% -1.21% Instructions are helped. total cycles in shared programs: 257153833 -> 257148932 (<.01%) cycles in affected programs: 585341 -> 580440 (-0.84%) helped: 167 HURT: 100 helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16 helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88% HURT stats (abs) min: 1 max: 200 x̄: 25.95 x̃: 16 HURT stats (rel) min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65% 95% mean confidence interval for cycles value: -33.25 -3.46 95% mean confidence interval for cycles %-change: -1.47% -0.54% Cycles are helped. total loops in shared programs: 3416 -> 3398 (-0.53%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. LOST: 2 GAINED: 0 Sandy Bridge total instructions in shared programs: 10499306 -> 10499094 (<.01%) instructions in affected programs: 6051 -> 5839 (-3.50%) helped: 43 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2 helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45% 95% mean confidence interval for instructions value: -7.66 -2.20 95% mean confidence interval for instructions %-change: -5.47% -3.12% Instructions are helped. 
total cycles in shared programs: 145862568 -> 145861370 (<.01%) cycles in affected programs: 61733 -> 60535 (-1.94%) helped: 36 HURT: 2 helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35 helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81% HURT stats (abs) min: 18 max: 102 x̄: 60.00 x̃: 60 HURT stats (rel) min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48% 95% mean confidence interval for cycles value: -41.28 -21.77 95% mean confidence interval for cycles %-change: -6.16% -3.00% Cycles are helped. total loops in shared programs: 1803 -> 1785 (-1.00%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. LOST: 4 GAINED: 0 No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-08 15:26:26 -08:00
Ian Romanick	1583f49eaa	i965/vec4: Allow CSE on subset VF constant loads v2: Rewrite the code that generates the VF mask. Suggested by Ken. No changes on other platforms. Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13059891 -> 13059884 (<.01%) instructions in affected programs: 431 -> 424 (-1.62%) helped: 7 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -3.39% -0.71% Instructions are helped. total cycles in shared programs: 409260032 -> 409260018 (<.01%) cycles in affected programs: 4228 -> 4214 (-0.33%) helped: 7 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -1.15% 0.07% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	360899d457	i965/vec4: Relax writemask condition in CSE If the previously seen instruction generates more fields than the new instruction, still allow CSE to happen. This doesn't do much, but it also enables a couple more shaders in the next patch. It helped quite a bit in another change series that I have (at least for now) abandoned. v2: Add some extra commentary about the parameters to instructions_match. Suggested by Ken. No changes on Skylake, Broadwell, Iron Lake or GM45. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11780295 -> 11780294 (<.01%) instructions in affected programs: 302 -> 301 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 257308315 -> 257308313 (<.01%) cycles in affected programs: 2074 -> 2072 (-0.10%) helped: 1 HURT: 0 Sandy Bridge total instructions in shared programs: 10506687 -> 10506686 (<.01%) instructions in affected programs: 335 -> 334 (-0.30%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	52c7df1643	i965/fs: Merge CMP and SEL into CSEL on Gen8+ v2: Fix several problems handling inverted predicates. Add a much bigger comment around the BRW_CONDITIONAL_NZ case. v3: Allow uniforms and shader inputs as sources for the original SEL and CMP instructions. This enables a LOT more shaders to receive CSEL merging (5816 vs 8564 on SKL). v4: Report progress. Broadwell and Skylake had similar results. (Broadwell shown) helped: 8527 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70% 95% mean confidence interval for instructions value: -2.51 -2.36 95% mean confidence interval for instructions %-change: -1.15% -1.10% Instructions are helped. total cycles in shared programs: 559442317 -> 558288357 (-0.21%) cycles in affected programs: 372699860 -> 371545900 (-0.31%) helped: 6748 HURT: 1450 helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12 helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70% HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14 HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90% 95% mean confidence interval for cycles value: -179.01 -102.51 95% mean confidence interval for cycles %-change: -2.37% -2.08% Cycles are helped. LOST: 0 GAINED: 6 No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Kenneth Graunke	70de61594d	i965/fs: Add infrastructure for generating CSEL instructions. v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Ian Romanick	54e8d2268d	nir: Narrow some dot product operations On vector platforms, this helps elide some constant loads. v2: Reorder the transformations. No changes on Broadwell or Skylake. Haswell total instructions in shared programs: 13093793 -> 13060163 (-0.26%) instructions in affected programs: 1277532 -> 1243902 (-2.63%) helped: 13216 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: -2.57 -2.49 95% mean confidence interval for instructions %-change: -3.65% -3.54% Instructions are helped. total cycles in shared programs: 409580819 -> 409268463 (-0.08%) cycles in affected programs: 71730652 -> 71418296 (-0.44%) helped: 9898 HURT: 2352 helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16 helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50% HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6 HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97% 95% mean confidence interval for cycles value: -33.19 -17.80 95% mean confidence interval for cycles %-change: -4.50% -4.26% Cycles are helped. total fills in shared programs: 82059 -> 82052 (<.01%) fills in affected programs: 21 -> 14 (-33.33%) helped: 7 HURT: 0 Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown) total instructions in shared programs: 11811851 -> 11780605 (-0.26%) instructions in affected programs: 1155007 -> 1123761 (-2.71%) helped: 12304 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: -2.56 -2.48 95% mean confidence interval for instructions %-change: -3.71% -3.59% Instructions are helped. 
total cycles in shared programs: 257618409 -> 257316805 (-0.12%) cycles in affected programs: 71999580 -> 71697976 (-0.42%) helped: 9155 HURT: 2380 helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16 helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62% HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4 HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33% 95% mean confidence interval for cycles value: -34.32 -17.97 95% mean confidence interval for cycles %-change: -4.55% -4.29% Cycles are helped. GM45 and Iron Lake had nearly identical results (Iron Lake shown) total instructions in shared programs: 7886750 -> 7879944 (-0.09%) instructions in affected programs: 373781 -> 366975 (-1.82%) helped: 3715 HURT: 47 helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1 helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06% HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2 HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35% 95% mean confidence interval for instructions value: -1.85 -1.77 95% mean confidence interval for instructions %-change: -2.91% -2.73% Instructions are helped. total cycles in shared programs: 178114636 -> 178095452 (-0.01%) cycles in affected programs: 7227666 -> 7208482 (-0.27%) helped: 3349 HURT: 301 helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4 helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63% HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10 HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50% 95% mean confidence interval for cycles value: -5.52 -4.99 95% mean confidence interval for cycles %-change: -0.81% -0.73% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-03-08 15:26:26 -08:00
Lionel Landwerlin	d10a39ebe0	i965: perf: consolidate unmapping oa perf bo outside accumulation Do this in one place outside the only caller of the accumulation function. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:29 +00:00
Lionel Landwerlin	fb921a2870	i965: perf: count number of accumulated reports This will be reused later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:26 +00:00
Lionel Landwerlin	e4387faafb	i965: perf: reuse timescale base function from query We already have the same function in brw_queryobj.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:23 +00:00
Lionel Landwerlin	b71da26496	i965: perf: store sysfs device entry into context We want to reuse it later on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:21 +00:00
Lionel Landwerlin	5742b17da1	i965: perf: store the hw_id of the context in the query Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:18 +00:00
Lionel Landwerlin	80cd669a32	i965: perf: default case for unknown query types Just some extra safety before further changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:00 +00:00
Marek Olšák	9b7db12815	radeonsi: remove chip_class parameter from si_lower_nir We can get it from si_screen. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	78ef16e2f9	winsys/amdgpu: query GDS info Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	a4a113b5bc	winsys/amdgpu: pad compute IBs v2: pad with PKT2 NOPs on SI Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	35cd86d4e9	radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs This is only required with the latest libdrm. This fixes 32-bit support with high addresses. (and possibly 64-bit support too because the high bits need to be masked out) Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	75c5d25f0f	radeonsi: align command buffer starting address to fix some Raven hangs Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Christian Gmeiner	5b68a7297d	etnaviv: add get_driver_query_group_info(..) This enables AMD_performance_monitor extension. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-03-08 20:44:04 +01:00
Christian Gmeiner	3d912bd742	etnaviv: add query_group_info for sw counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-03-08 20:43:55 +01:00
Dylan Baker	1e9d779331	meson: Fix building gallium media libs without egl v2: - rebase on omx fix Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2018-03-08 10:14:02 -08:00
Dylan Baker	f74cf04d3e	meson: Allow building dri based EGL without GLX It should be possible to build EGL without GLX, but the meson build currently doesn't allow that because it too tightly couples glx and dri. This patch eases dri and glx apart, so that EGL without GLX can be built. CC: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-03-08 09:12:24 -08:00
Thierry Reding	d41ee9ba5d	glx/apple: Ship meson build file in tarball The meson build file for Apple GLX is not listed in the EXTRA_DIST make variable and therefore isn't shipped as part of the release tarball, so meson builds from the tarball will fail. Add the file to EXTRA_DIST to ensure it is included in the tarball. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-08 12:11:32 +01:00
Samuel Pitoiset	4e3c1ace65	ac/nir: do not emit unnecessary null exports in fragment shaders Null exports should only be needed when no other exports are emitted. This removes a bunch of 'exp null off, off, off, off done vm'. Affected games are Dota 2 and Wolfenstein 2, not sure if that really helps, but code size is decreasing there. Polaris10: Totals from affected shaders: SGPRS: 8216 -> 8216 (0.00 %) VGPRS: 7072 -> 7072 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 454968 -> 453896 (-0.24 %) bytes Max Waves: 772 -> 772 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-08 11:56:05 +01:00
Eric Engestrom	19dd7f007e	drirc: whitespace fix Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-08 09:53:34 +00:00
Thomas Hellstrom	93e58d5e17	drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware With this extension enabled and a server GLX implementation that actually honors it, window movement lags considerably on gnome-shell/vmware, so disable it by default. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Thomas Hellstrom	4ca9ad2bb2	gallium/st_dri: Honor the glx_disable_sgi_video_sync config option This option is disabled by default. Primarily intended for drivers on virtual hardware. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Thomas Hellstrom	f4070956d4	glx/dri: Add a driconf option to disable GLX_SGI_video_sync Drivers on virtual hardware don't want to expose this extension to GLX compositors, similarly to GLX_OML_sync_control, since that significantly increases latency. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Timothy Arceri	0c90264da4	ac/radeonsi: add emit_kill to the abi This should fix a regression with Rocket League grass rendering on the NIR backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717	2018-03-08 11:28:37 +11:00
Timothy Arceri	50cc97d98a	radeonsi: add si_llvm_emit_kill() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 11:28:37 +11:00
Timothy Arceri	f4b877631e	spirv: fix autotools builds Fixes: `68a6a3b51a` "spirv: handle AMD_gcn_shader extended instructions" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-08 10:45:56 +11:00
Timothy Arceri	99cdc019bf	ac: make use of if/loop build helpers These helpers insert the basic block in the same order as they appear in NIR making it easier to follow LLVM IR dumps. The helpers also insert more useful labels onto the blocks. TGSI uses the line number of the corresponding opcode in the TGSI dump as the label id, here we use the corresponding block index from NIR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Timothy Arceri	6e1a142863	radeonsi: make use of if/loop build helpers in ac Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Timothy Arceri	42627dabb4	ac: add if/loop build helpers These have been ported over from radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Daniel Schürmann	ffbf75cde4	radv: enable AMD_gcn_shader extension Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	18c7f1e041	ac: implement AMD_gcn_shader extended instructions Co-authored-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	68a6a3b51a	spirv: handle AMD_gcn_shader extended instructions Co-authored-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	a1a2a8dfda	nir: add AMD_gcn_shader extended instructions Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	39437025de	spirv: import AMD extensions header from glslang Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Dylan Baker	cba104ebe3	meson: Fix indent in omx meson.build Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:54 -08:00
Dylan Baker	6f628951af	meson: Use include directory variables instead of traversing Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Dylan Baker	34e852d5b5	meson: Re-add auto option for omx This re-adds the auto option for omx, without it we default to tizonia and the build fails almost immediately, this is especially obnoxious for those building a driver that doesn't support the OMX state tracker to begin with. v2: - Only define OMX_FOO for auto cases if the dependencies are found. This fixes building tizonia with auto (Julien, Eric) CC: Gurkirpal Singh <gurkirpal204@gmail.com> Fixes: `bb5e27fab6` ("st/omx/bellagio: Rename st and target directories") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com> (v1)	2018-03-07 13:30:53 -08:00
Dylan Baker	7598dedfde	meson: fix tizonia compilation It needs to have src/egl in it's includes as well. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Dylan Baker	2d3004ef1c	meson: combine state trackers and target if blocks This is needed later since tizonia requires dri Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Marek Olšák	55376cb31e	st/mesa: expose 0 shader binary formats for compat profiles for Qt Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2018-03-07 15:36:31 -05:00
Roland Scheidegger	8ba3750d3d	draw: fix line stippling with aa lines In contrast to non-aa, where stippling is based on either dx or dy (depending on whether it's an x- or y-major line), stippling is based on actual distance with smooth lines, so adjust for this. (It looks like there are some minor artifacts with mesa demos line-sample and stippling, it looks like the line endpoints aren't quite right with aa + stippling - maybe due to the integer math in the stipple stage, but I can't quite pinpoint it.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-07 21:29:00 +01:00
Roland Scheidegger	dbb2cf388b	draw: simplify (and correct) aaline fallback (v2) The motivation actually was to get rid of the additional tex instruction, since that requires the draw fallback code to intercept all sampler / view calls (even if the fallback is never hit). Basically, the idea is to use coverage of the pixel to calculate the alpha value, and coverage is simply based on the distance to the center of the line (in both line direction, which is useful for wide lines, as well as perpendicular to the line). This is much closer to what hw supporting this natively actually does. It also fixes an issue with line width not quite being correct, as well as endpoints getting stretched too far (in line direction) with wide lines, which is apparent with mesa demo line-sample. (For llvmpipe, it would probably make sense to do something like this directly when drawing lines, since rendering two tris is twice as expensive as a line, but it would need some changes with state management.) Since we're no longer relying on mipmapping to get the alpha value, we also don't need to draw 3 rects (6 tris), one is sufficient. There are still issues (as before): - quite sure it's not correct without half_pixel_center, but can't test this with GL. - aaline + line stipple is incorrect (evident with line-sample demo). Looking at the spec the stipple pattern should actually be based on distance (not just dx or dy for x/y major lines as without aa). - outputs (other than pos + the one used for line aa) should be reinterpolated since we actually increase line length by half a pixel (but there are no tests which would care). v2: simplify the math (should be equivalent), don't need immediate v3: use float versions of atan2,cos,sin, minor cleanups Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-07 21:28:31 +01:00
Bas Nieuwenhuizen	034cce96b4	radv: Don't emit a warning on VI-GFX9. We are conformant: https://www.khronos.org/conformance/adopters/conformant-products#submission_308 v2: Actually not emit it on gfx9. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	04d65d2b76	radv: Enable vulkan 1.1.0 for configurations that can support it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	0168eaaa42	radv: Disable sampler ycbcr conversion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	cce62f4065	radv: Expose that we don't support any VK_KHR_16_bit_storage parts. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b99b9cc864	radv: Implement vkEnumerateInstanceVersion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	5240fddb9d	radv: Add trivial device group implementation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	84e877aa77	radv: Implement vkCmdDispatchBase. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	de5e25898c	radv: Implement vkGetDeviceQueue2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b137e25277	radv: Support VkPhysicalDeviceProtectedMemoryFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	4bcf4d1678	radv: Support VkPhysicalDeviceShaderDrawParameterFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	41d958d073	radv: Implement VK_KHR_maintenance3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	8f9af587a2	radv: Add minimal subgroup support. Deliberately not implementing workgroup scopes as that is not needed for core vulkan. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	89651fba9b	radv: Change client version check. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	5b3979704d	radv: Update MAX_API_VERSION to 1.1.0 v2: Don't bump supported version. v3: Update json files. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	97f10934ed	ac/nir: Add vote_ieq/vote_feq lowering pass. The old vote_eq implementation supported only booleans, but now we have to support arbitrary values, so use the read_first_invocation intrinsic + ballot. I took this as an opportunity to figure out how easy it was to do this in nir instead of in the nir_to_llvm pass, and it actually turned out pretty okay IMO. Only creating the pass is some extra code. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:32 +01:00
Jason Ekstrand	c217607b65	anv: Support version overrides While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	a1ee51309e	vulkan/util: Add a helper to get a version override Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d6b65222df	anv: Enable Vulkan 1.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	03c07ac548	anv: Add support for SPIR-V 1.3 subgroup operations This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8b4a5e641b	intel/fs: Add support for subgroup quad operations NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	2292b20b29	intel/fs: Implement reduce and scan operations Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	4150920b95	intel/fs: Add a helper for emitting scan operations This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b0858c1cc6	intel/fs: Add a couple of simple helper opcodes Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	57bff0a546	spirv: Add support for subgroup arithmetic Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	789221dcfa	nir: Add a helper for getting binop identities Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	82d493a939	nir: Add subgroup arithmetic reduction intrinsics Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b3a5b0f3fc	spirv: Add subgroup quad support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	493a165544	nir: Add quad operations and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	90c9f29518	i965/fs: Add support for nir_intrinsic_shuffle Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8256ee3fa3	spirv: Add subgroup shuffle support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	149b92ccf2	nir: Add subgroup shuffle intrinsics and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7cfece820d	i965/fs: Support nir_intrinsic_vote_feq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	0e893356fe	nir/lower_subgroups: Add scalarizing for vote_eq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d792f3d4cd	spirv: Add subgroup vote support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	44681e4795	nir: Generalize nir_intrinsic_vote_eq The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9812fce60b	spirv: Add subgroup ballot support Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	974daec495	i965/fs: Implement basic SPIR-V subgroup intrinsics Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	adc077797a	spirv: Add initial subgroup support Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	5162a1d884	nir: Add new SPIR-V ballot intrinsics and lowering Someone can make the lowering optional later if they want something different for their hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	752e969703	compiler: Add two new system values for subgroups This will be required for SPIR-V subgroup support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	34c60ea02b	nir: Add new SPIR-V ballot ALU intrinsics and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cc587ee9a7	spirv: Handle the new OpModuleProcessed instruction Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	59b0ea0c74	anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cbab2d1da5	anv: Implement vkEnumerateInstanceVersion Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	605fd7c0da	anv/device: fail to initialize device if we have queues with unsupported flags This is not strictly necessary since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature. However, it is an easy-to-implement sanity check that would help developers realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	b262f17b15	anv/device: GetDeviceQueue2 should only return queues with matching flags From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9c8b40001d	anv: Support querying for protected memory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	773a51e772	anv: Implement GetDeviceQueue2 This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68df93ecbc	anv: Trivially implement VK_KHR_device_group Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	dfe18be09e	anv: Implement vkCmdDispatchBase This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ff9db1a4cc	nir/spirv: Add support for device groups Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ddc4069122	anv: Implement VK_KHR_maintenance3 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	1deb7967c8	anv: Support VkPhysicalDeviceShaderDrawParameterFeatures This advertises the VK_KHR_shader_draw_parameters functionality as a "core optional feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	06719f9d4b	anv/entrypoints: Drop support for protect attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	bd1279bd9f	Get rid of a bunch of KHR suffixes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	af461986db	anv: Add version 1.1.0 but leave it disabled This requires us to rename any Vulkan API entrypoints which became core in 1.1 to no longer have the KHR suffix. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	0128187335	spirv: Update the SPIR-V headers and json to 1.3.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	205c271562	vulkan: Update the XML and headers to 1.1.70 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7fb86fb511	vulkan/enum_to_str: Add support for aliases and new Vulkan versions Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	539a0aec45	vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	eb23ca069f	anv/entrypoints: Generate #ifdef guards from platform attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	05fc377f2e	anv/extensions: Add support for multiple API versions Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8efa173ed2	anv/entrypoints_gen: Add support for aliases in the XML Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	39d9fcea13	anv/entrypoints: Allow an entrypoint to require multiple extensions In this case, we say an entrypoint is supported if ANY of the extensions is supported. This is because, in the XML, entrypoints don't require extensions so much as extensions require entrypoints. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8e8f167c72	anv/entrypoints: Add an is_device_entrypoint helper Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	54b3493fc0	anv/entrypoints_gen: Allow the string map to grow Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d91da06df5	anv/entrypoints_gen: A bit of refactoring Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	a4ca4c99ba	anv/entrypoints: Generalize the string map a bit The original string map assumed that the mapping from strings to entrypoints was a bijection. This will not be true the moment we add entrypoint aliasing. This reworks things to be an arbitrary map from strings to non-negative signed integers. The old one also had a potential bug if we ever had a hash collision because it didn't do the strcmp inside the lookup loop. While we're at it, we break things out into a helpful class. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	3960d0e332	vulkan: Rename multiview from KHX to KHR Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68af9f04a4	spirv: Rework barriers Our previous handling of barriers always used the big hammer and didn't correctly emit memory barriers when specified along with a control barrier. This commit completely reworks the way we emit barriers to make things both more precise and more correct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	de518f38e5	spirv: Add a vtn_constant_value helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Marek Olšák	9779f34326	radeonsi: remove si_llvm_add_attribute	2018-03-07 13:55:49 -05:00
Marek Olšák	2c3f3651c4	radeonsi: fix passing address32_hi to LLVM for high values The old function treats high values as negative, which LLVM interprets as 0.	2018-03-07 13:55:49 -05:00
Marek Olšák	b3b6b00ac8	radeonsi: assume has_virtual_memory == true	2018-03-07 13:55:48 -05:00
Marek Olšák	53db2790c0	radeonsi: add/update assertions for 32-bit address space	2018-03-07 13:55:47 -05:00
Marek Olšák	16856a1ee8	radeonsi: prevent a negative buffer offset in si_upload_descriptors	2018-03-07 13:55:42 -05:00
Marek Olšák	9b55498059	radeonsi: properly extract a buffer address from a descriptor	2018-03-07 13:55:40 -05:00
Marek Olšák	2a47660754	radeonsi: fix vertex buffer address computation with full 64-bit addresses	2018-03-07 13:55:38 -05:00
Marek Olšák	2e30268877	radeonsi: mask out high VM address bits in registers where needed	2018-03-07 13:55:35 -05:00
Bas Nieuwenhuizen	94c9096c83	radv: Add entrypoints generation with the new vk.xml A lot of it is based on intel again. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 15:50:19 +01:00
Simon Hausmann	fb5825e7ce	glsl: Fix memory leak with known glsl_type instances When looking up known glsl_type instances in the various hash tables, we end up leaking the key instances used for the lookup, as the glsl_type constructor allocates memory on the global mem_ctx. This patch changes glsl_type to manage its own memory, which fixes the leak and also allows getting rid of the global mem_ctx and its mutex. v2: remove lambda usage (Tapani) (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Simon Hausmann <simon.hausmann@qt.io> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho	c17808562e	spirv: Add SpvCapabilityShaderViewportIndexLayerEXT This capability allows gl_ViewportIndex and gl_Layer to also be used as outputs in Vertex and Tessellation shaders. v2: Make conditional to the capability, add gl_Layer, add tessellation shaders. (Iago) v3: Don't export to tessellation control shader. v4: Add Reviewed-by tag. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 07:04:20 +01:00
Mauro Rossi	487f8d48c9	android: anv: add libmesa_intel_dev static dependency Fixes the following build errors: external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override' external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name' external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `272bef0601` "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-07 07:55:34 +02:00
Timothy Arceri	1fdb21541e	Revert "nir: bump loop unroll limit to 96." This reverts commit `2d36efdb7f`. This raised limit turns out to be harmful for more complex shaders: it causes excessive spilling in some Bioshock Infinite shaders. The fps for the ssao demo on radv remains unchanged when reverting this. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 15:10:05 +11:00
Dave Airlie	fb077b0728	ac/nir: don't put lod into args if it's zero. If it's zero but we put it in args, we still end up consuming a register for it. This fixes some spilling in the NIR paths in Dirt Rally that isn't seen with TGSI. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-07 03:34:59 +00:00
Christian Gmeiner	38e91e2b81	freedreno: bump required libdrm version Fixes: `26a9321d0a` "freedreno: add global_bindings state" Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-06 21:52:59 +01:00
Ian Romanick	e3ea166a2c	nir: Simplify some comparisons like a+b < a All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14514555 -> 14514547 (<.01%) instructions in affected programs: 1972 -> 1964 (-0.41%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.41% -0.40% Instructions are helped. total cycles in shared programs: 533141444 -> 533136780 (<.01%) cycles in affected programs: 164728 -> 160064 (-2.83%) helped: 181 HURT: 3 helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30 helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80% HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14 HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68% 95% mean confidence interval for cycles value: -27.12 -23.58 95% mean confidence interval for cycles %-change: -3.54% -3.16% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533667 -> 10533539 (<.01%) instructions in affected programs: 10148 -> 10020 (-1.26%) helped: 124 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04% 95% mean confidence interval for instructions value: -1.06 -1.00 95% mean confidence interval for instructions %-change: -2.46% -1.95% Instructions are helped. total cycles in shared programs: 146136887 -> 146132122 (<.01%) cycles in affected programs: 206382 -> 201617 (-2.31%) helped: 171 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30 helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67% 95% mean confidence interval for cycles value: -29.19 -26.54 95% mean confidence interval for cycles %-change: -3.20% -2.76% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886515 -> 7886507 (<.01%) instructions in affected programs: 3016 -> 3008 (-0.27%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.26% Instructions are helped. total cycles in shared programs: 178100396 -> 178100388 (<.01%) cycles in affected programs: 156128 -> 156120 (<.01%) helped: 4 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -3.68 1.68 95% mean confidence interval for cycles %-change: -0.03% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857872 -> 4857868 (<.01%) instructions in affected programs: 1544 -> 1540 (-0.26%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.28% -0.24% Instructions are helped. total cycles in shared programs: 122167654 -> 122167662 (<.01%) cycles in affected programs: 96248 -> 96256 (<.01%) helped: 0 HURT: 4 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: <.01% 0.02% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:30 -08:00
Ian Romanick	d1ed4ffe0b	nir: Use De Morgan's Law on logic compounded comparisons The replacement of the comparison operators must happen during this step. If it does not, the next pass of nir_opt_algebraic will reapply De Morgan's Law in the "opposite direction" before performing dead code elimination. The resulting infinite loop will eventually get OOM killed. Haswell, Broadwell, and Skylake had similar results. (Broadwell shown) total instructions in shared programs: 14808185 -> 14808036 (<.01%) instructions in affected programs: 13758 -> 13609 (-1.08%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3 helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01% 95% mean confidence interval for instructions value: -4.67 -2.97 95% mean confidence interval for instructions %-change: -1.09% -0.88% Instructions are helped. total cycles in shared programs: 559438333 -> 559435832 (<.01%) cycles in affected programs: 199160 -> 196659 (-1.26%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51 helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40% HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40 HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74% 95% mean confidence interval for cycles value: -71.47 -39.69 95% mean confidence interval for cycles %-change: -1.64% -0.93% Cycles are helped. Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11811776 -> 11811553 (<.01%) instructions in affected programs: 15201 -> 14978 (-1.47%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6 helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26% 95% mean confidence interval for instructions value: -7.21 -4.23 95% mean confidence interval for instructions %-change: -1.48% -1.12% Instructions are helped. 
total cycles in shared programs: 257617270 -> 257614589 (<.01%) cycles in affected programs: 212107 -> 209426 (-1.26%) helped: 45 HURT: 0 helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54 helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32% 95% mean confidence interval for cycles value: -74.02 -45.14 95% mean confidence interval for cycles %-change: -1.59% -1.01% Cycles are helped. Iron Lake total instructions in shared programs: 7886648 -> 7886515 (<.01%) instructions in affected programs: 14106 -> 13973 (-0.94%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4 helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81% 95% mean confidence interval for instructions value: -5.65 -3.52 95% mean confidence interval for instructions %-change: -1.03% -0.76% Instructions are helped. total cycles in shared programs: 178100812 -> 178100396 (<.01%) cycles in affected programs: 67970 -> 67554 (-0.61%) helped: 29 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54% 95% mean confidence interval for cycles value: -18.30 -10.39 95% mean confidence interval for cycles %-change: -0.71% -0.45% Cycles are helped. GM45 total instructions in shared programs: 4857939 -> 4857872 (<.01%) instructions in affected programs: 7426 -> 7359 (-0.90%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4 helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77% 95% mean confidence interval for instructions value: -6.06 -2.87 95% mean confidence interval for instructions %-change: -1.06% -0.67% Instructions are helped. 
total cycles in shared programs: 122167930 -> 122167654 (<.01%) cycles in affected programs: 43118 -> 42842 (-0.64%) helped: 15 HURT: 0 helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54% 95% mean confidence interval for cycles value: -25.03 -11.77 95% mean confidence interval for cycles %-change: -0.82% -0.41% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
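As an illustration of the rewrite described in d1ed4ffe0b, the sketch below checks over a small integer range that negating an OR of two comparisons gives the same result as an AND of the inverted comparisons. The function names are hypothetical and are not taken from nir_opt_algebraic. Integers are used so that NaN ordering does not come into play.

```python
from itertools import product

def before(a, b, c, d):
    # Original form: NOT applied to an OR of two comparisons.
    return not ((a < b) or (c < d))

def after(a, b, c, d):
    # De Morgan's Law, with each comparison replaced by its inverse in the
    # same step so that no NOT is left over for a later pass to re-expand.
    return (a >= b) and (c >= d)

def equivalent_everywhere(values):
    # Exhaustively compare both forms over every 4-tuple of the given values.
    return all(before(*t) == after(*t) for t in product(values, repeat=4))

print(equivalent_everywhere(range(-2, 3)))
```

Replacing the comparisons in the same step matters because a form that still contains a NOT could be matched again by the opposite rule, which is the loop the commit message warns about.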
Ian Romanick	52607658ff	nir: Replace fmin(b2f(a), b) with a bcsel All of the affected shaders are HDR mappers from Serious Sam 3. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14516285 -> 14516273 (<.01%) instructions in affected programs: 348 -> 336 (-3.45%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -5.55% -3.06% Instructions are helped. total cycles in shared programs: 533163876 -> 533163808 (<.01%) cycles in affected programs: 1144 -> 1076 (-5.94%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for cycles value: -18.84 -15.16 95% mean confidence interval for cycles %-change: -6.20% -5.68% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533321 -> 10533309 (<.01%) instructions in affected programs: 372 -> 360 (-3.23%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -4.96% -2.86% Instructions are helped. total cycles in shared programs: 146136632 -> 146136428 (<.01%) cycles in affected programs: 11668 -> 11464 (-1.75%) helped: 12 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29% 95% mean confidence interval for cycles value: -17.66 -16.34 95% mean confidence interval for cycles %-change: -2.82% -1.58% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886301 -> 7886277 (<.01%) instructions in affected programs: 576 -> 552 (-4.17%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.30% -3.72% Instructions are helped. total cycles in shared programs: 178113176 -> 178113176 (0.00%) cycles in affected programs: 2116 -> 2116 (0.00%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for cycles value: -3.25 3.25 95% mean confidence interval for cycles %-change: -0.93% 0.94% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857756 -> 4857744 (<.01%) instructions in affected programs: 294 -> 282 (-4.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.71% -3.09% Instructions are helped. total cycles in shared programs: 122178730 -> 122178722 (<.01%) cycles in affected programs: 700 -> 692 (-1.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
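The idea behind 52607658ff can be sketched as follows. Because b2f(a) is either 0.0 or 1.0, the fmin can be split into two fmin operations against constants, selected by a. This is one plausible form of the replacement and an assumption on my part, not a quote of the actual rule. The helper names are hypothetical, and NaN inputs are deliberately left out.

```python
def b2f(a):
    # Boolean to float: True -> 1.0, False -> 0.0.
    return 1.0 if a else 0.0

def bcsel(cond, x, y):
    # Select x when cond is true, otherwise y.
    return x if cond else y

def original(a, b):
    return min(b2f(a), b)

def rewritten(a, b):
    # Each arm is an fmin against a constant.
    return bcsel(a, min(b, 1.0), min(b, 0.0))

samples = [-2.5, -0.0, 0.0, 0.25, 1.0, 3.0]
agree = all(original(a, b) == rewritten(a, b)
            for a in (False, True) for b in samples)
print(agree)
```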
Ian Romanick	b974dfee11	nir: Pull b2f out of bcsel All platforms had similar results. (Skylake shown) total instructions in shared programs: 14516592 -> 14516586 (<.01%) instructions in affected programs: 500 -> 494 (-1.20%) helped: 2 HURT: 0 total cycles in shared programs: 533167044 -> 533166998 (<.01%) cycles in affected programs: 6988 -> 6942 (-0.66%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
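A minimal sketch of what "pull b2f out of bcsel" in b974dfee11 could look like: when both arms of a select are b2f conversions, a single conversion can be applied to the selected boolean instead. The exact pattern used in Mesa is not shown in the commit message, so the form below is an assumption, and the helper names are hypothetical.

```python
from itertools import product

def b2f(a):
    # Boolean to float: True -> 1.0, False -> 0.0.
    return 1.0 if a else 0.0

def bcsel(cond, x, y):
    # Select x when cond is true, otherwise y.
    return x if cond else y

def original(a, b, c):
    # Two conversions, one per arm.
    return bcsel(a, b2f(b), b2f(c))

def rewritten(a, b, c):
    # One conversion applied after the select.
    return b2f(bcsel(a, b, c))

same = all(original(*t) == rewritten(*t)
           for t in product((False, True), repeat=3))
print(same)
```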
Ian Romanick	f50400cc80	nir: Replace an odd comparison involving fmin of -b2f I noticed the fge version while looking at a shader for an unrelated reason. The feq version prevents a regression in a later change that performs strength reduction of some compares. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514808 -> 14514796 (<.01%) instructions in affected programs: 750 -> 738 (-1.60%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.43% -0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533144939 -> 533144853 (<.01%) cycles in affected programs: 8911 -> 8825 (-0.97%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19 helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31% 95% mean confidence interval for cycles value: -32.94 -10.06 95% mean confidence interval for cycles %-change: -2.30% -0.26% Cycles are helped. Haswell total instructions in shared programs: 13093785 -> 13093775 (<.01%) instructions in affected programs: 924 -> 914 (-1.08%) helped: 4 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.53 1.20 95% mean confidence interval for instructions %-change: -2.02% 0.97% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 409580553 -> 409580118 (<.01%) cycles in affected programs: 10909 -> 10474 (-3.99%) helped: 5 HURT: 1 helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18 helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78% HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13 HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39% 95% mean confidence interval for cycles value: -180.68 35.68 95% mean confidence interval for cycles %-change: -19.55% 3.79% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11811851 -> 11811840 (<.01%) instructions in affected programs: 1032 -> 1021 (-1.07%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.17 0.51 95% mean confidence interval for instructions %-change: -1.86% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 257618403 -> 257618168 (<.01%) cycles in affected programs: 10784 -> 10549 (-2.18%) helped: 4 HURT: 2 helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17 helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72% HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11 HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for cycles value: -133.11 54.78 95% mean confidence interval for cycles %-change: -14.79% 5.59% Inconclusive result (value mean confidence interval includes 0). GM45, Iron Lake, and Sandy Bridge had similar results. 
(Sandy Bridge shown) total instructions in shared programs: 10533871 -> 10533859 (<.01%) instructions in affected programs: 865 -> 853 (-1.39%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.16% -0.29% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 146139904 -> 146139852 (<.01%) cycles in affected programs: 15213 -> 15161 (-0.34%) helped: 4 HURT: 0 helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15 helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29% 95% mean confidence interval for cycles value: -23.79 -2.21 95% mean confidence interval for cycles %-change: -0.88% 0.09% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	380136e998	nir: Mark bcsel-to-fmin (or fmax) transformations as inexact These transformations are inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
Ian Romanick	4addd34b04	nir: Recognize some more open-coded fmin / fmax This transformation is inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. v2: Reorder operands and mark as inexact. The latter suggested by Jason. shader-db results: Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514817 -> 14514808 (<.01%) instructions in affected programs: 229 -> 220 (-3.93%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% total cycles in shared programs: 533145211 -> 533144939 (<.01%) cycles in affected programs: 37268 -> 36996 (-0.73%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total cycles in shared programs: 257618409 -> 257618403 (<.01%) cycles in affected programs: 12582 -> 12576 (-0.05%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
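The NaN concern raised in 380136e998 and 4addd34b04 can be illustrated with a small sketch. It compares an open-coded minimum written as a select with a minimum that returns the non-NaN operand, which is one behavior the quoted spec text permits. Both helper names are hypothetical, and `nan_dropping_min` is only a model of such an fmin, not Mesa's implementation.

```python
import math

def open_coded_min(a, b):
    # The select form: bcsel(a < b, a, b).
    return a if a < b else b

def nan_dropping_min(a, b):
    # Model of an fmin that is not required to propagate NaN:
    # if exactly one operand is NaN, the other operand is returned.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)

nan = float("nan")
select_result = open_coded_min(1.0, nan)   # comparison is false, so b (NaN) is chosen
fmin_result = nan_dropping_min(1.0, nan)   # NaN is dropped, so 1.0 is returned
print(math.isnan(select_result), fmin_result)
```

Since the two forms disagree when the second operand is NaN, a rewrite from the select to fmin cannot be treated as exact.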
Gurkirpal Singh	c62cf1f165	st/omx/tizonia/h264d: Add EGLImage support Example Gstreamer pipeline : MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 17:21:11 +00:00
Gurkirpal Singh	b2f2236dc5	st/omx/tizonia: Add H.264 encoder v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 17:20:08 +00:00
Gurkirpal Singh	83d4a5d5ae	st/omx/tizonia: Add H.264 decoder v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	430ccdbcb9	st/omx/tizonia: Add entrypoint Adds base files for adding components Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	e2afa154e9	st/omx/tizonia: Add --enable-omx-tizonia flag and build files Allow only bellagio or tizonia to be used at the same time. Detect tizonia package config file Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir Only compile empty source (target.c) for now. GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328 Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	bb5e27fab6	st/omx/bellagio: Rename st and target directories v2: Refactor out screen functions to st/omx This keeps all the code under st/omx (st/omx/tizonia and st/omx/bellagio). Reverts targets/omx_bellagio to omx as additions to existing files are enough to compile for both bellagio and tizonia. * autotools changes: --enable-omx -> --enable-omx-bellagio * meson changes: -Dgallium-omx=false -> -Dgallium-omx=disabled -Dgallium-omx=true -> -Dgallium-omx=bellagio Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 13:07:03 +00:00
Samuel Pitoiset	e96e6f60f7	radv: report the scratch private memory size with shader stats Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:42 +01:00
Samuel Pitoiset	7f6b91c9c3	ac/nir: count the scratch private memory size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:40 +01:00
Samuel Pitoiset	3b8e7459f2	ac: add ac_count_scratch_private_memory() Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:38 +01:00
Samuel Pitoiset	f3275ca01c	ac/nir: only enable used channels when exporting parameters This allows us to generate, for example, "exp param0 v0, off, off, off" if only the first channel is needed. Not sure if this improves performance but it's worth trying. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:35 +01:00
Samuel Pitoiset	675dde13b2	ac: update enabled channels mask when optimizing PARAM exports When the mask is not 0xf we need to update the number of enabled channels, otherwise the hardware won't emit the components that are combined. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:52 +01:00
Samuel Pitoiset	c24abae9dc	ac/nir: pass the number of enabled channels to si_llvm_init_export_args() Currently, it's always 0xf but an upcoming patch will reduce the number of channels for parameters export. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:50 +01:00
Samuel Pitoiset	5cd34f03c0	ac/shader: scan output usage mask for VS and TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:47 +01:00
Clayton Craft	d1fa30e0f8	intel: Add missing includes for building on Android This adds a missing library to the i965/Android.mk file, and updates intel/Android.mk to include the new library. Without this, mesa does not build on Android. Fixes: `272bef0601` "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-06 00:14:22 -08:00
Tapani Pälli	237c9caa78	vulkan: do not expose surface/swapchain extensions on Android On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit `9f763c1f9b`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 08:02:59 +02:00
Tapani Pälli	85518657a9	anv: Don't expose VK_KHX_multiview on android. Just like commit `2ffe395` does for radv. Fixes following dEQP test on i965: dEQP-VK.api.info.android.no_unknown_extensions v2: make it !ANDROID since this extension is not about surfaces/swapchain Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-06 08:01:20 +02:00
Roland Scheidegger	cf4a92fda2	gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128 Some state trackers require 128. (There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl state tracker it's unlikely more than 32 will be needed, if you need more use bindless.)	2018-03-06 05:18:17 +01:00
Roland Scheidegger	06e724c7b4	tgsi/scan: use wrap-around shift behavior explicitly for file_mask The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably no one uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-06 05:18:17 +01:00
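A rough model of the explicit wrap-around described in 06e724c7b4, assuming a 32-bit mask whose shift count is reduced modulo 32. The function name is hypothetical and the actual expression in tgsi_scan may differ.

```python
def file_mask_bit(index):
    # Reduce the shift count to the range 0..31 explicitly, then keep the
    # result within 32 bits. Register 33 therefore shares bit 1 with register 1.
    return (1 << (index & 31)) & 0xFFFFFFFF

for reg in (0, 1, 33, 127):
    print(reg, hex(file_mask_bit(reg)))
```

Making the masking explicit gives the same result on every target, instead of depending on how the hardware or compiler treats a shift count of 32 or more.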
Aaron Watry	95ae6c0355	clover: Allow overriding platform/device version numbers Useful for testing API, builtin library, and device completeness of not-yet-supported versions. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> v4: Remove redundant std::string wrapper around debug_get_option calls v3: mark CL version overrides as static and const v2: Make version_string in platform const in case	2018-03-05 20:09:46 -06:00
Aaron Watry	106020712f	clover/llvm: Pass device down to compile We'll need to be able to detect device version to define the appropriate __OPENCL_VERSION__ header. v2: Rebase after removing the previous patch (Pierre) - Removed "clover: Add device_clc_version to llvm::create_compiler_instance" Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-05 20:09:46 -06:00
Aaron Watry	fc629e3594	clover: Pass device to llvm::create_compiler_instance We'll be using dev.device_clc_version to select the default language version soon along with the existing ir_target field. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> v4: Pass the device down instead of device_clc_version as a separate field v3: Revise to acknowledge that we now have the device in compile/link_program instead of the string values. v2: (Pierre) Move changes to create_compiler_instance invocation to correct patch to prevent temporary build breakage. (Jan) Use device_clc_version instead of device_version for compile/link	2018-03-05 20:09:46 -06:00
Aaron Watry	dd81ca3883	clover/llvm: Use device in llvm compilation instead of copying fields Copying the individual fields from the device when compiling/linking will lead to an unnecessarily large number of fields getting passed around. v3: Rebase on current master v2: Use device in function args before making additional changes in following patches Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-05 20:09:46 -06:00
Timothy Arceri	71b3d681d8	radeonsi/nir: fix handling of doubles for gs inputs Fixes piglit test: tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Timothy Arceri	20bd0f6a2b	ac: pass the unmodified number of components to load gs inputs Currently both users of this would overflow an array when the input was a dual slot double as they expected the number of components to be a max of 4. Since we pass the type we can just let the functions handle doubles in a way they choose. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Timothy Arceri	2a68c6c6c8	radeonsi: move si_nir_load_input_gs() to si_shader.c All the tess shader and tgsi equivalents are here and it allows us to use llvm_type_is_64bit() in the following patch without exposing it externally. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Boris Brezillon	9ea90ffb98	broadcom/vc4: Add support for HW perfmon The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-05 15:54:04 -08:00
Boris Brezillon	5924379a58	drm-uapi: Update vc4 header with perfmon related definitions v2: Update to the final version with the documentation. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-05 15:53:48 -08:00
Roland Scheidegger	434523cf2a	r600: fix color export mask The r600 code (not the eg one) forgot to copy the ps_color_export_mask in commit `5b14e06d8b` when updating the pixel state, leading to misrenderings (probably with MRT). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262 Tested-by: LoneVVolf <lonewolf@xs4all.nl> Tested-by: Pavel Vinogradov <public@sourcemage.org>	2018-03-05 20:15:05 +01:00
Andres Gomez	72552012c7	travis: keep meson version below 0.45.0 Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is not available in Trusty. Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-05 21:12:37 +02:00
Kenneth Graunke	0472aa3efe	intel: Drop SURFACE_FORMAT enum from genxml. We want people to be using ISL_FORMAT_*, rather than the genxml format enumerations. This patch drops 10 separate copies, and drops a bunch of ugly casting. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [jordan.l.justen@intel.com: Minor changes for rebase] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:08 -08:00
Jordan Justen	755e7e6c20	intel/common: Use isl for decoder surface formats Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:04 -08:00
Jordan Justen	bd3392423d	intel/isl: Add isl_format_is_valid Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:01 -08:00
Jordan Justen	272bef0601	intel: Split gen_device_info out into libintel_dev Split out the device info so isl doesn't depend on intel/common. Now it will depend on the new intel/dev device info lib. This will allow the decoder in intel/common to use isl, allowing us to apply Ken's patch that removes the genxml duplication of surface formats. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:47:37 -08:00
Gert Wollny	9a0d7bb48c	gallium/aux/hud: Avoid possible buffer overflow Limit the length of acceptable cpu names for use in hud_get_num_cpufreq in order to avoid a buffer overflow later in add_object when this name is copied into cpufreq_info::name. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-05 11:38:28 -05:00
Eric Engestrom	b98c905a46	gbm: give a name to rgba fields Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-05 15:14:36 +00:00
Andres Gomez	40abffb295	egl: remove duplicated initialization Found by inspection. The line removed is a duplicate of the line literally just above the 3 lines of context usually printed in a commit log. v2: enhance the commit log (Emil). Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-05 15:55:53 +02:00
Rob Clark	5a5a43078c	freedreno/ir3: start dealing with half-precision Some instructions, assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	175d1b4372	freedreno/ir3: fix fixing-up register footprint It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9a62536108	freedreno: surfaces can be PIPE_BUFFER At least for clover. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	d7af35a7f3	freedreno/a5xx: handle compute resources Not entirely sure why this is a different BIND bit, but it is. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	82c71b09d5	freedreno/ir3: ignore return jump I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	c9b1cc33df	freedreno: add some more compute caps Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9630f4df3b	freedreno/a5xx: don't expose 64b pointers yet Temporary hack, but since we can't do 64b math yet in ir3, pretend that we don't support 64b pointers. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	54988f1e6b	freedreno: steal handy macro for compute caps from nouveau Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	26a9321d0a	freedreno: add global_bindings state Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	8c42f63151	freedreno/ir3: small cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	76687b0c0a	freedreno: add pctx->memory_barrier() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9e4f5966e8	freedreno/ir3: cmdline compiler updates for spv shaders Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Samuel Pitoiset	322a51b549	ac: add ac_build_fsign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:36 +01:00
Samuel Pitoiset	e8bdde2289	ac: add ac_build_isign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:32 +01:00
Samuel Pitoiset	459e33900f	ac: add ac_build_fract() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:30 +01:00
gurchetansingh@chromium.org	fe0647df5a	virgl: add offset alignment values to v2 caps struct glBindBufferRange(..) in vrend_draw_bind_ubo is failing with more than one uniform block. This is due to improper alignment of the start of the second block. Let's query the proper alignment from the driver and pass it back to Mesa. Let's query for the texture alignment too, even though the Virgl renderer doesn't call glTexBufferRange yet. The default values are the widest workable range possible (for example, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256). Fixes: dEQP-GLES3.functional.ubo.* on Nvidia Example test: dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex Note: This is based on "virgl: reduce some default capset limits.", which hasn't landed in Mesa yet but should relatively soon. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:39 +10:00
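To show why the start of the second uniform block matters in fe0647df5a, here is a minimal sketch of rounding a block offset up to a required alignment. The value 256 comes from the commit message; the function name and the 100-byte block size are hypothetical, and the code is not taken from virgl.

```python
def align_up(offset, alignment):
    # Round offset up to the next multiple of alignment.
    # The bit trick below is only valid for power-of-two alignments.
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)

first_block_size = 100  # hypothetical size of the first uniform block
second_block_start = align_up(first_block_size, 256)
print(second_block_start)
```

Placing the second block directly at offset 100 would violate a 256-byte offset alignment, which matches the kind of glBindBufferRange failure the commit describes.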
Dave Airlie	9283cf2ad1	virgl: reduce some default capset limits. Since v2 might take a while to rollout, we should reduce these inside some gathered minimums and then v2 can increase them using host values. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:38 +10:00
Dave Airlie	cd32258ec1	virgl: handle getting new capsets. This checks the kernel api is new enough and asks for the larger caps size since the kernel won't mess it up now. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:38 +10:00
Timothy Arceri	70190a6567	radeonsi/nir: call ac_lower_indirect_derefs() Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Timothy Arceri	561503e3bd	radeonsi: add chip class to compiler_ctx_state This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Timothy Arceri	0f2c7341e8	ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen	eea20d59ab	radv: Fix copying from 3D images starting at non-zero depth. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-05 01:04:54 +01:00
Vinson Lee	bb742b6ebf	swr/rast: Fix macOS macro. Fixes: `a25093de71` ("swr/rast: Implement JIT shader caching to disk") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-03-04 13:23:57 -08:00
Mathias Fröhlich	411aa8c322	vbo: Try to reuse the same VAO more often for successive dlists. The change tries to catch more opportunities to reuse the same set of VAO's when building up display lists. Instead of checking the offset with respect to the beginning of the vertex buffer object the change tries to apply this same optimization with respect to the previous display list node. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-03 05:56:35 +01:00
Ian Romanick	a9eb455e29	mesa: Silence unused parameter warnings from TEXSTORE_PARAMS Reduces my build from 1717 warnings to 1547 warnings by silencing 170 instances of things like In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0, from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31: ../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’: ../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter] mesa_format dstFormat, \ ^ ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’ _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS) ^~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	1049b57bf2	i965: Silence unused parameter warnings in genX_state_upload Reduces my build from 1772 warnings to 1717 warnings by silencing 55 instances of things like ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter] unsigned end_offset, ^~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter] genX(emit_sampler_state_pointers_xs)(struct brw_context brw, ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter] struct brw_stage_state stage_state) ^~~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter] mesa_format format, GLenum base_format, ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter] translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest) ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter] uint32_t batch_offset_for_sampler_state) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	50bf186829	isl: Silence unused parameter warnings in __gen_combine_address implementations Reduces my build from 1808 warnings to 1772 warnings by silencing 36 instances of things like ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’: ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter] __gen_combine_address(void data, void loc, uint64_t addr, uint32_t delta) ^~~~ ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter] __gen_combine_address(void data, void loc, uint64_t addr, uint32_t delta) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	492a472b28	genxml: Silence unused parameter warnings in generated pack code Reduces my build from 1960 warnings to 1808 warnings by silencing 152 instances of things like In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0, from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36: src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’: src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_uint(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’: src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~~~ src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’: src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	f726695cce	i965: Silence unused parameter warnings in blorp Reduces my build from 2023 warnings to 1960 warnings by silencing 63 instances of things like In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params params) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch batch, void *start, size_t size) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	3a944316c4	nir: Silence unused parameter warnings in generated nir_constant_expressions code Reduces my build from 2075 warnings to 2023 warnings by silencing 52 instances of things like src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’: src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter] evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size, ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	ab8f2e30b8	i965: Silence unused parameter warnings in generated OA code Reduces my build from 6301 warnings to 2075 warnings by silencing 4226 instances of things like src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’: src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter] hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	a55dae6ea2	i965: Silence warnings about mixing enum and non-enum in conditional Reduces my build from 6451 warnings to 6301 warnings by silencing 150 instances of ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info, const brw_inst)’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra] unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ BRW_GENERAL_REGISTER_FILE : \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ brw_inst_##reg##_reg_file(devinfo, inst); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’ REG_TYPE(src1) ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	feefb7810e	intel/compiler: Silence unused parameter warnings in release builds Reduces my build from 7005 warnings to 6451 warnings by silencing 554 instances of In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0: ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src0_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src2_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src0_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src2_imm(const struct gen_device_info devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_imm_uq(const struct gen_device_info devinfo, const brw_inst insn) ^~~~~~~ In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0, from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29: ../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’: ../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: ../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	c8a03ab453	i965: Silence unused parameter warnings Reduces my build from 7119 warnings to 7005 warnings by silencing 114 instances of In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0, from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter] static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; } ^~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Kenneth Graunke	9fa95359df	intel: Drop program size pointer from vec4/fs assembly getters. These days, we're just passing a pointer to a prog_data field, which we already have access to. We can just use it directly. (In the past, it was a pointer to a separate value.) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-02 14:20:22 -08:00
Kenneth Graunke	b04cf529f2	i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT. This should have no practical impact. For the default uploader, we don't really care, but for others, we may want to append more data as the GPU is reading existing data, which means we need async and persistent flags. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-02 14:19:33 -08:00
Kenneth Graunke	eb99bf8abe	i965: Generalize intel_upload.c to support multiple uploaders. I'd like to reuse the upload logic for a new program cache, but the buffers will need to have a different lifetime than the default uploader, and also some address space restrictions. So, we can't use a single uploader for both situations - we'll need two of them. This creates a public 'uploader' structure, and adjusts the interface to take an uploader rather than always using brw->upload. It should have no functional change at the moment. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-02 14:19:33 -08:00
Anuj Phogat	56dc9f9f49	intel/compiler: Memory fence commit must always be enabled for gen10+ Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-02 11:45:21 -08:00
Francisco Jerez	4b4838b1ae	Revert "i965/fs: Predicate byte scattered writes if needed" This reverts commit `a4031bdfa9`. It's redundant with the sample mask predication done at this point by the common logical send lowering infrastructure, and rather buggy because it wasn't applying the correct sample mask in shaders using discard, since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS doesn't reflect samples discarded by the shader, so it could have led to data corruption in fragment shader invocations that execute discard based on a non-dynamically uniform condition. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	c063e88909	intel/fs: Handle surface opcode sample masks via predication. The main motivation is to enable HDC surface opcodes on ICL which no longer allows the sample mask to be provided in a message header, but this is enabled all the way back to IVB when possible because it decreases the instruction count of some shaders using HDC messages significantly, e.g. one of the SynMark2 CSDof compute shaders decreases instruction count by about 40% due to the removal of header setup boilerplate which in turn makes a number of send message payloads more easily CSE-able. Shader-db results on SKL: total instructions in shared programs: 15325319 -> 15314384 (-0.07%) instructions in affected programs: 311532 -> 300597 (-3.51%) helped: 491 HURT: 1 Shader-db results on BDW where the optimization needs to be disabled in some cases due to hardware restrictions: total instructions in shared programs: 15604794 -> 15598028 (-0.04%) instructions in affected programs: 220863 -> 214097 (-3.06%) helped: 351 HURT: 0 The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL laptop with this change. According to Eero this improves performance of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2 desktop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>	2018-03-02 11:28:56 -08:00
Francisco Jerez	e7c9adca57	intel/eu: Plumb header present bit to codegen helpers for HDC messages. This makes sure that the header-present bit of the message descriptor is in sync with the IR instruction fields, which gives the optimizer more control to avoid the overhead of setting up a message header when it's possible to do so. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	6edb332b44	intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL. This shouldn't cause any functional change at this point, it changes SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at the IR level instead of the hard-coded f1.0, now that it can be represented in backend_instruction::flag_subreg. This will be necessary for scheduling to behave correctly once more things start making use of f1.0. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	cc0fc8b8ac	intel/ir: Allow representing additional flag subregisters in the IR. This allows representing conditional mods and predicates on f1.0-f1.1 at the IR level by adding an extra bit to the flag_subreg backend_instruction field. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	9ec3362e0b	intel/l3: Don't allocate SLM partition on ICL+. SLM has a chunk of special-purpose memory separate from L3 on ICL+, we shouldn't allocate a partition for it on L3 anymore. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Charmaine Lee	af8877af3b	svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs Since geometry shader also consumes prescale constants, the geometry shader constant buffer will need to be updated when prescale factor is changed. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	dc79b88402	svga: fix blending regression The earlier Mesa commit `3d06c8afb5` ("st/mesa: don't translate blend state when it's disabled for a colorbuffer") subtly changed the details of gallium's per-RT blend state. In particular, when pipe_rt_blend_state[i].blend_enabled is true, we have to get the src/dst blend terms from pipe_rt_blend_state[i], not [0] as before. We now have to scan the blend targets to find the first one that's enabled (if any). We have to use the index of that target for getting the src/dst blend terms. And note that we have to set identical blend terms for all targets. This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug 2063493. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
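The target scan described in the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the driver's C code: the `RTBlend` class, its field names, and both helper functions are hypothetical stand-ins for the per-render-target blend state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RTBlend:
    """Hypothetical stand-in for one render target's blend state."""
    blend_enable: bool
    src_factor: str = "ONE"
    dst_factor: str = "ZERO"


def first_enabled_target(targets: List[RTBlend]) -> Optional[int]:
    """Return the index of the first target with blending enabled, or None."""
    for index, target in enumerate(targets):
        if target.blend_enable:
            return index
    return None


def shared_blend_terms(targets: List[RTBlend]) -> Optional[Tuple[str, str]]:
    """Take the src/dst terms from the first enabled target, not from [0].

    The same pair is then meant to be applied to every target.
    """
    index = first_enabled_target(targets)
    if index is None:
        return None
    return (targets[index].src_factor, targets[index].dst_factor)
```

With target 0 disabled and target 1 enabled, the terms come from target 1, which is the case the commit message says was previously read from index 0.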
Brian Paul	b871a77316	svga: check svga_have_vgpu10() in svga_delete_blend_state() We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not enabled (bs->id==0 by default), resulting in lots of device errors. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	72df3a7a39	svga: if svga_update_state() fails, skip the draw call If svga_update_state() fails, we flush the command buffer and retry. If it fails again, it likely means we were unable to translate a shader for some reason (uses too many resources, for example). In that case, let's just skip the draw call. The alternative, just disabling the shader stage in question, would certainly lead to bad rendering anyway, and probably device errors. Fixes failed assertion running Piglit glsl-1.50/execution/ variable-indexing/gs-output-array-vec4-index-wr.shader_test since it uses too many GS output registers (though the test still fails). VMware bug 2063492. v2: also call pipe_debug_message() so apps or apitrace can be notified when this issue occurs. v3: use svga_update_state_retry(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
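The control flow described in the commit above (update, flush and retry once, then skip the draw and report) can be sketched as below. This is an illustrative Python outline only; `draw_with_retry` and its four callback parameters are hypothetical names and do not correspond to the svga driver's actual functions.

```python
from typing import Callable


def draw_with_retry(
    update_state: Callable[[], bool],
    flush: Callable[[], None],
    draw: Callable[[], None],
    report: Callable[[str], None],
) -> bool:
    """Try to update state; on failure flush and retry once.

    If the second attempt also fails, the draw is skipped and a debug
    message is reported instead. Returns True if the draw was issued.
    """
    if not update_state():
        flush()  # free up command buffer space, then try again
        if not update_state():
            report("state update failed twice; skipping draw call")
            return False
    draw()
    return True
```

Skipping the draw keeps the failure contained to one call, whereas the alternative mentioned in the commit message (disabling the offending shader stage) would still render incorrectly.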
Brian Paul	0a7deaa0d6	svga: let svga_update_state_retry() return a bool This will allow minor simplifications elsewhere. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	35c5cf8959	svga: s/unsigned/boolean/ for a few local vars Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
Dylan Baker	e23192022a	meson: install vulkan_intel.h header Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-02 11:11:20 -08:00
Boyuan Zhang	1ad89fa138	st/omx_bellagio: add picture profile and entry point Profile and entry point were missing in the picture structure. Therefore, add them back. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-03-02 12:04:36 -05:00
Boyuan Zhang	6a62e455f2	radeonsi: fix radeon create encoder return Previous patch missed a "return" when trying to modify the create encoder function, which made the whole logic fail. Therefore, add the return back. Fixes: `b38b208ff8` "radeonsi:create uvd hevc enc entry" Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-02 12:04:36 -05:00
Thierry Reding	f9bc48d41d	loader: Add support for platform and host1x busses ARM SoCs usually have their DRM/KMS devices on the platform bus, so add support for this bus in order to allow use of the DRI_PRIME environment variable with those devices. While at it, also support the host1x bus, which is effectively the same but uses an additional layer in the bus hierarchy. Note that it isn't enough to support the bus that has the rendering GPU because the loader code will also try to construct an ID path tag for a scanout-only device if it is the default that is being opened. The ID path tag for a device can be obtained by running udevadm info on the device node, as shown in this example on NVIDIA Tegra: $ udevadm info /dev/dri/card0 | grep ID_PATH_TAG E: ID_PATH_TAG=platform-50000000_host1x The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is constructed, can be found in the sysfs "uevent" attribute for the card0 device's parent: $ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent OF_FULLNAME=/host1x@50000000 Similarly, /dev/dri/card1 corresponds to the GPU: $ udevadm info /dev/dri/card1 | grep ID_PATH_TAG E: ID_PATH_TAG=platform-57000000_gpu and: $ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent OF_FULLNAME=/gpu@57000000 Changes in v2: - avoid confusing pre-increment in strdup() - add examples of tags to commit message Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 14:40:29 +01:00
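The relationship between OF_FULLNAME and ID_PATH_TAG shown in the commit's two Tegra examples can be sketched as a small string transformation. This Python function is a hypothetical illustration that reproduces only those two examples; it is not the loader's C implementation and makes no claim about other device-tree paths.

```python
def platform_id_path_tag(of_fullname: str) -> str:
    """Build a 'platform-<address>_<name>' tag from an OF_FULLNAME value.

    Assumes the last path component has the form '<name>@<address>',
    as in the examples from the commit message.
    """
    node = of_fullname.rsplit("/", 1)[-1]  # e.g. "host1x@50000000"
    name, _, address = node.partition("@")
    return f"platform-{address}_{name}"
```

The resulting string is the kind of value the commit message pairs with the DRI_PRIME environment variable for selecting a device.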
Thierry Reding	498faea103	disk cache: Link with -latomic if necessary The disk cache implementation uses 64-bit atomic operations. For some architectures, such as 32-bit ARM, GCC will not be able to translate these operations into atomic, lock-free instructions and will instead rely on the external atomics library to provide these operations. Check at configuration time whether or not linking against libatomic is necessary and if so, create a dependency that can be used while linking the mesautil library. This is the meson equivalent of `2ef7f23820` ("configure: check if -latomic is needed for __atomic_*"). For some background information on this, see: https://gcc.gnu.org/wiki/Atomic/GCCMM Changes in v2: - clarify meaning of lock-free in commit message - fix build if -latomic is not necessary Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 11:31:59 +01:00
Samuel Pitoiset	c133a3411b	radv: do not set pending_reset_query in BeginCommandBuffer() This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-02 09:44:12 +01:00
Dave Airlie	bf2af063c3	r600/cayman: fix fragcoord loading recip generation. This fixes some hangs seen where the recip_ieee opcodes would end up split across the wrong slots. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-02 00:33:18 +00:00
Kenneth Graunke	cee9f38903	i965: Allow 48-bit addressing on Gen8+. This allows most GPU objects to use the full 48-bit address space offered by Gen8+ platforms, rather than being stuck with 32-bit. This expands the available GPU memory from 4G to 256TB or so. A few objects - instruction, scratch, and vertex buffers - need to remain pinned in the low 4GB of the address space for various reasons. We default everything to 48-bit but disable it in those cases. Thanks to Jason Ekstrand for blazing this trail in anv first and finding the nasty undocumented hardware issues. This patch simply rips off all of his findings. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 15:46:11 -08:00
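The "4G to 256TB" figure in the commit above follows directly from the address widths. A short Python check of the arithmetic, using binary units (the helper name `address_space_bytes` is only for illustration):

```python
GIB = 2**30  # bytes in one gibibyte
TIB = 2**40  # bytes in one tebibyte


def address_space_bytes(address_bits: int) -> int:
    """Number of bytes addressable with the given pointer width."""
    return 2**address_bits


low_4g = address_space_bytes(32) // GIB     # size of a 32-bit space in GiB
full_48 = address_space_bytes(48) // TIB    # size of a 48-bit space in TiB
```

Objects that must stay pinned in the low 4GB, such as the instruction, scratch, and vertex buffers named in the commit, are still limited to the 32-bit range.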
Kenneth Graunke	6712611735	i965: Shorten the name of the workaround BO. This makes the name shorter in debug printouts. If "workaround_bo" is good enough for the code, it's probably good enough for debugging.	2018-03-01 15:46:11 -08:00
Kenneth Graunke	b04c5cece7	i965: Add debugging code to dump the validation list. When anything goes wrong with this code, dumping the validation list is a useful way to figure out what's happening.	2018-03-01 15:46:11 -08:00
Jason Ekstrand	ff4726077d	intel/fs: Set up sampler message headers in the visitor on gen7+ This gives the scheduler visibility into the headers which should improve scheduling. More importantly, however, it lets the scheduler know that the header gets written. As-is, the scheduler thinks that a texture instruction only reads its payload and is unaware that it may write to the first register so it may reorder it with respect to a read from that register. This is causing issues in a couple of Dota 2 vertex shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-01 15:11:01 -08:00
Timothy Arceri	f5305c1b44	ac: fix nir_intrinsic_shared_atomic_comp_swap handling Following on from `49879f3778` this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Timothy Arceri	13cdf4e590	st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs We only need to check for previously processed location on user defined varyings as they are the only ones that support component packing. Therefore a single instance of processed_locs can be shared by regular varyings and patches. For simplicity we make processed_locs an array in order to handle dual source blending. Fixes the following piglit test on radeonsi: tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Jason Ekstrand	89f78cf333	anv: Enable MSAA fast-clears This speeds up the Sascha Willems multisampling demo by around 25% when using 8x or 16x MSAA. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	00da139477	anv/cmd_buffer: Add support for MCS fast-clears and resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	1805c483b1	anv/cmd_buffer: Add helpers for computing resolve predicates We'll want to re-use the complex resolve predicate computations for MCS resolves so it's nice to have them as helper functions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	a0a319f16e	anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage This doesn't actually do anything because att_state->fast_clear is determined based on the return value of anv_layout_to_fast_clear_type which currently returns NONE for multisampled images. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d0f701d2f1	anv/blorp: Pass the clear address to blorp for subpass MSAA resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	f4f95496cb	anv/blorp: Allow indirect clear colors on blorp sources on gen7 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d85f05bd6f	anv/blorp: Add partial clear support to anv_image_mcs_op Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	c34feaea52	intel/blorp: Add indirect clear color support to mcs_partial_resolve This is a bit complicated because we have to get the indirect clear color in there somehow. In order to not do any more work in the shader than needed, we set it up as its own vertex binding which points directly at the clear color address specified by the client. Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	ca7ab1a6a5	intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE There are enough #ifs in there that it's kind-of pointless to duplicate it for each buffer. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Andriy Khulap	7859701920	i965: Fix RELOC_WRITE typo in brw_store_data_imm64() Fixes: `6c530ad116` ("i965: Reduce passing 2x32b of reloc_domains to 2 bits") Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 11:20:04 -08:00
Jonathan Gray	034bbaa6c0	gallium/util: use sockets on PIPE_OS_UNIX in u_network Instead of listing all the UNIX PIPE_OS platforms just use PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:39 +00:00
Jonathan Gray	7bea40e566	util: use clock_gettime() on PIPE_OS_BSD OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime() so use it when PIPE_OS_BSD is defined. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo	4420d8866c	nir/search: Include 8 and 16-bit support in construct_value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 09:16:03 -08:00
Jason Ekstrand	99ee40fb54	nir/search: Support 8 and 16-bit constants in match_value Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-03-01 09:15:01 -08:00
Andres Gomez	b5b912dfee	travis: make Meson find the proper llvm-config Travis CI has moved to LLVM 5.0, and meson is detecting automatically the available version in /usr/local/bin based on the PATH env variable order preference. As of 0.44.x, Meson cannot receive the path to the llvm-config binary as a configuration parameter. See https://github.com/mesonbuild/meson/issues/2887 and `7c8b6ee3fa` We want to use the custom (APT) installed version. Therefore, let's make Meson find our wanted version sooner than the one at /usr/local/bin Once this is corrected, we would still need a patch similar to: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html v2: Create the link only to the specifically wanted LLVM version (Gert). Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Gert Wollny <gw.fossdev@gmail.com> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-01 12:21:30 +02:00
Andres Gomez	98f7650add	meson: fix LLVM version detection when <= 3.4 3 digits versions in LLVM only started from 3.4.1 on. Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in the system while not needing LLVM at all (auto), when passing through the LLVM version detection code, meson will fail when accessing "_llvm_version[2]" due to: "Index 2 out of bounds of array of size 2." v2: Properly compare LLVM version and set patch version to 0 if < 3.4.1 (Eric). v3: Improve the commit log explanation (Eric). Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-01 12:16:23 +02:00
Iago Toral Quiroga	bc73016703	i965/sbe: fix number of inputs for active components In `16631ca30e` we fixed gen9 active components to account for padded inputs in the URB, which we can have with SSO programs. To do that, instead of going through the bitfield of inputs (which doesn't include padding information), we compute the number of inputs from the size of the URB entry. Unfortunately, there are some special inputs that are not stored in the URB and that we also need to account for. These special inputs are identified and handled during calculate_attr_overrides(). Instead of keeping track of the exact number of inputs, we just program active components for all possible inputs like we do in anvil. This fixes a regression in a WebGL program that uses Point Sprite functionality (specifically, VARYING_SLOT_PNTC). v2: - Add 'Fixes' tag (Mark Janes) - make no_vue_inputs int instead of uint32_t, and add const qualifier to num_inputs variable (Ian) v3: - Do not try to count inputs correctly, just program all input slots like we do in anvil (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224 Fixes: `16631ca30e` (i965/sbe: fix active components for SSO programs with over 16 inputs) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 10:55:12 +01:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously reset using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Alejandro Piñeiro	e72fb4e611	nir/serialize: handle var->name being NULL var->name could be NULL under ARB_gl_spirv for example. And in any case, the code is already handling var name being NULL when reading a variable, so it is consistent to do it when writing a variable too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo	ba642ee3ee	anv: Enable VK_KHR_16bit_storage for PushConstant Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	02266f9ba1	spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned The introduction of 16-bit types with VK_KHR_16bit_storage implies that push constant offsets can be multiples of 2 bytes. Some assertions are updated so that offsets need only be a multiple of the size of the base type, but we cannot always assume this because doubles aren't aligned to 8 bytes in some cases. For 16-bit types, the push constant offset takes into account the internal offset in the 32-bit uniform bucket, adding 2 bytes when we access elements that are not 32-bit aligned. In all 32-bit aligned cases it just becomes 0. v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	23ffb7c2d1	spirv: Calculate properly 16-bit vector sizes Range in 16-bit push constants load was being calculated wrongly using 4-bytes per element instead of 2-bytes as it should be. v2: Use glsl_get_bit_size instead of if statement (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	994d210429	anv: Enable VK_KHR_16bit_storage for SSBO and UBO Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	69be3a82ca	i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout Restrict the use of untyped_surface_write with 16-bit pairs in ssbo to the cases where we can guarantee that offset is multiple of 4. Taking into account that VK_KHR_relaxed_block_layout is available in ANV we can only guarantee that when we have a constant offset that is multiple of 4. For non constant offsets we will always use byte_scattered_write. v2: (Jason Ekstrand) - Assert offset_reg to be multiple of 4 if it is immediate. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	8dd8be0323	i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout 16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't guarantee that offsets are multiple of 4-bytes as required by untyped_read message. This happens for example in the case of f16mat3x3 when VK_KHR_relaxed_block_layout is enabled. Vector reads with non-constant offsets are implemented with multiple byte_scattered_read messages that do not require 32-bit aligned offsets. Now for all constant offsets we can use the untyped_read_surface message. In the case of constant offsets not aligned to 32-bits, we calculate a start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data function and the first_component parameter to skip the copy of the unneeded component. v2: (Jason Ekstrand) Use untyped_read_surface messages whenever we have constant offsets. v3: (Jason Ekstrand) Simplify loop for reads with non constant offsets. Use end - start to calculate the number of 32-bit components to read with constant offsets. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	2dd94f462b	i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components This helper, used to load 16-bit components from 32-bit reads, now allows skipping components with the new parameter first_component. The semantics now skip components until we reach the first_component, and then read the number of components passed to the function. All previous uses of the helper are updated to use 0 as first_component. This will allow reading 16-bit components when the first one is not 32-bit aligned, enabling more uses of untyped_reads with 16-bit types. v2: (Jason Ekstrand) Change parameters order to first_component, num_components Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	67d7dd594e	isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit The surfaces that back the GPU buffers have a boundary check that treats accesses to partial dwords as out-of-bounds. For example, buffers with 1 or 3 16-bit elements have size 2 or 6, and the last two bytes would always be read as 0 or writes to them ignored. The introduction of 16-bit types implies that we need to align the size to 4-byte multiples so that partial dwords can be read/written. Adding an unconditional +2 size to buffers not being multiple of 2 solves this issue for the general cases of UBO or SSBO. But when unsized arrays of 16-bit elements are used, it is not possible to know if the size was padded or not. To solve this issue the implementation calculates the needed size of the buffer surfaces, as suggested by Jason: surface_size = isl_align(buffer_size, 4) + (isl_align(buffer_size, 4) - buffer_size) So when we calculate backwards the buffer_size in the backend we update the resinfo return value with: buffer_size = (surface_size & ~3) - (surface_size & 3) These buffer requirements are also exposed when robust buffer access is enabled, so these buffer sizes are recommended to be a multiple of 4. v2: (Jason Ekstrand) Move padding logic from anv to isl_surface_state. Move calculation of original size from spirv to driver backend. v3: (Jason Ekstrand) Rename some variables and use a similar expression when calculating padding as when obtaining the original buffer size. Avoid use of unnecessary component call at brw_fs_nir. v4: (Jason Ekstrand) Complete comment with buffer size calculation explanation in brw_fs_nir. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Mathias Fröhlich	4c232dc721	vbo: Remove vbo_save_vertex_list::vertex_size. Like before use local variables from compile_vertex_list instead. Remove vertex_size from struct vbo_save_vertex_list. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	478a9bc7bb	vbo: Remove vbo_save_vertex_list::buffer_offset. The buffer_offset is used in aligned_vertex_buffer_offset. But now that most of these decisions are done in compile_vertex_list we can work on local variables instead of struct members in the display list code. Clean that up and remove buffer_offset. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	bfa8d8e5bf	vbo: Remove vbo_save_vertex_list::start_vertex. Replace last use on replay with _vbo_save_get_{min,max}_index. Apart from that it is not used anymore. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	6dd3e98c21	vbo: Remove vbo_save_vertex_list::attrsz. Is not used anymore on replay, move the last use in display list compilation to the original array in the display list compiler. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	95b4be4f29	vbo: Remove vbo_save_vertex_list::attrtype. Is not used anymore on replay, move the last use in display list compilation to the original array in the display list compiler. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	77df52cc4f	vbo: Remove vbo_save_vertex_list::enabled. Is not used anymore on replay. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	19a0f27a49	vbo: Remove reference to the vertex_store from the dlist node. Since we now store a set of VAOs in the display list, use these objects to get the reference to the VBO in several places. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	6e410270ee	vbo: Implement current values update in terms of the VAO. Use the information already present in the VAO to update the current values after display list replay. Set GL_OUT_OF_MEMORY on allocation failure for the current value update storage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	08aa0d9bf4	vbo: Implement vbo_loopback_vertex_list in terms of the VAO. Use the information already present in the VAO to replay a display list node using immediate mode draw commands. Use a handful of helper methods that will also be useful for the next patches. v2: Insert asserts, constify local variables. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	f7178d677c	vbo: Use a local variable for the dlist offsets. The master value is now stored inside the VAO already present in struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	1cc3516a11	vbo: Remove unused vbo_save_context::wrap_count. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	07915020f0	vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Jason Ekstrand	6d3edbea16	anv: Always set has_context_priority We don't zalloc the physical device so we need to unconditionally set everything. Crucible helpfully initializes all allocations to 139 so it was getting true regardless of whether or not the kernel actually supports context priorities. Fixes: `6d8ab53303` "anv: implement VK_EXT_global_priority extension" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 17:31:20 -08:00
Mark Janes	0fc009b8c7	Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+" This reverts commit `a2c1e48f15`. On BDWGT3e and KBLGT3e systems, this commit regressed the following tests: piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil	2018-02-28 17:26:08 -08:00
Dave Airlie	6c1b5a40fd	radeonsi/nir: increase values to 8 for gs fetch. This stops a crash when running (still fails): tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:35:09 +10:00
Bas Nieuwenhuizen	f9898b211e	radv: Use the syncobj wait ioctl to wait on fences if possible. Handles the !waitAll and signal after the start of the wait cases correctly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	34bd5e2e2e	radv: Implement more efficient !waitAll fence waiting. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	6968d782d3	radv: Implement waiting on non-submitted fences. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	2a404c6f92	radv: Implement WaitForFences with !waitAll. Nothing to do except using a busy wait loop. At least for old kernels. A better implementation for newer kernels to come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255 Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Dave Airlie	49879f3778	ac/nir: fix shared atomic operations. The nir->llvm conversion was using the wrong srcs. Fixes: tests/spec/arb_compute_shader/execution/shared-atomics.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:06:06 +10:00
Dave Airlie	69495b30a3	ac/nir: don't apply slice rounding on txf_ms This matches the tgsi code. Fixes arb_texture_multisample texelFetch piglit tests. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` (radv: add initial non-conformant radv vulkan driver) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:04:34 +10:00
Timothy Arceri	f383fec903	radeonsi: set some context vars for nir path Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-01 10:51:56 +11:00
Timothy Arceri	7e46214f87	gallium: remove llvm from ir struct This was added in `425dc4c4b3` but never used. Also since `100796c15c` native has superseded llvm. Acked-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:51:56 +11:00
Kenneth Graunke	e51b0664e0	i965: Don't emit MOVs with undefined registers for Gen4 point clipping. Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0, which means that c->reg.vertex[] isn't initialized. It then emits MOVs to stomp components of those uninitialized registers to 0. This started causing assertions after Matt's recent series, when those uninitialized registers started getting BRW_REGISTER_TYPE_NF, which definitely doesn't exist on Gen4-5. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-28 15:03:51 -08:00
Eric Anholt	e4e79a02da	broadcom/vc5: Fix regression in the page-cache slice size alignment. We need to align the size of the slice, not the offset of the next slice. Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge. Fixes: `b4b4ada761` ("broadcom/vc5: Fix layout of 3D textures.")	2018-02-28 13:59:50 -08:00
Jason Ekstrand	a2c1e48f15	i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+ Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 13:31:42 -08:00
Jason Ekstrand	67da59e320	i965: Be more clever about setting up our viewport clip Before, we were trusting in the hardware to take the intersection of the viewport clip with the drawing rectangle. Unfortunately, 3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly does a full pipeline stall. If we're a bit more careful with our viewport clipping, we can just re-emit it once at context creation time. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 13:31:42 -08:00
Matt Turner	debaa822ef	intel/compiler: Re-add .vs_inputs_dual_locations = true Looks like a rebase mistake. Fixes: `89fe5190a2` ("intel/compiler: Lower flrp32 on Gen11+")	2018-02-28 13:25:21 -08:00
Dave Airlie	7cb9353de3	r600/shader: when using images always load thread id gpr at start (v2) The delayed loading code was failing if we had control flow. This fixes: tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test v2: don't use temp_reg before setting temp_reg up. Tested-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 20:16:19 +00:00
Dave Airlie	8369fdee8b	r600: fix whitespace in recent 1d texture commit. trivial fix.	2018-02-28 20:16:19 +00:00
Matt Turner	6f00bf519d	intel/compiler: Add ICL to test_eu_validate.cpp With the Align16 tests now disabled, we can run the rest of the tests in ICL mode (and see them pass!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	ff4b41dd1d	intel/compiler: Disable Align16 tests on Gen11+ Align16 is no more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	c31d77ac22	intel/compiler: Add instruction compaction support on Gen11 Gen11 only differs from SKL+ in that it uses a new datatype index table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	d5bf093cf9	intel/compiler: Mark line, pln, and lrp as removed on Gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	89fe5190a2	intel/compiler: Lower flrp32 on Gen11+ The LRP instruction is no more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	2134ea3800	intel/compiler/fs: Implement ddy without using align16 for Gen11+ Align16 is no more. We previously generated an align16 ADD instruction to calculate DDY: add(16) g25<1>F -g23<4>.xyxyF g23<4>.zwzwF { align16 1H }; Without align16, we now implement it as: add(4) g25<1>F -g23<0,2,1>F g23.2<0,2,1>F { align1 1N }; add(4) g25.4<1>F -g23.4<0,2,1>F g23.6<0,2,1>F { align1 1N }; add(4) g26<1>F -g24<0,2,1>F g24.2<0,2,1>F { align1 1N }; add(4) g26.4<1>F -g24.4<0,2,1>F g24.6<0,2,1>F { align1 1N }; where only the first two instructions are needed in SIMD8 mode. Note: an earlier version of the patch implemented this in two instructions in SIMD16: add(8) g25<2>F -g23<4,2,0>F g23.2<4,2,0>F { align1 1N }; add(8) g25.1<2>F -g23.1<4,2,0>F g23.3<4,2,0>F { align1 1N }; but I realized that the channel enable bits will not be correct. If we knew we were under uniform control flow, we could emit only those two instructions however. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	62cfd4c656	intel/compiler/fs: Simplify ddx/ddy code generation The brw_reg() constructor just obfuscates things here, in my opinion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	bed0267ff6	intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode In a future patch, generate_ddy will want to inspect inst->exec_size. Change generate_ddx as well for consistency. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	3a584a15c0	intel/compiler/fs: Don't generate integer DWord multiply on Gen11 Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer multiplies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	432674ce93	intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+ The PLN instruction is no more. Its functionality is now implemented using two MAD instructions with the new native-float type. Instead of pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F we now have mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F ... and in the case of SIMD8 only the first pair of MAD instructions is used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	b5d8781e19	intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp If multiple instructions are emitted, special handling of things like conditional mod and NoDDClr/NoDDChk need to be performed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	b1afdf9fc1	intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair This isn't technically broken, but the next patch will make this function report whether it generated multiple instructions, and that information will be used to disable the application of conditional mod by the generic code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	2cff324210	intel/compiler: Add Gen11+ native float type This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	58611ff913	intel/compiler: Add Gen11 register types The hardware register types' encodings have changed on Gen11. Good thing we have that superfluous looking brw_reg_type abstraction lying around! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	bb428454a9	intel: Disable 64-bit extensions on platforms without 64-bit types Gen11 does not support DF, Q, UQ types in hardware. As a result, we have to disable some GL extensions until they can be reimplemented. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-28 11:15:47 -08:00
Anuj Phogat	5e42103f3b	intel: Add icl pci id for INTEL_DEVID_OVERRIDE Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-02-28 11:15:47 -08:00
Matt Turner	35bfe20995	i965: Warn about preliminary support for Gen11 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:14:03 -08:00
Anuj Phogat	5ac804bd9a	intel: Add a preliminary device for Ice Lake Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@intel.com>	2018-02-28 11:14:03 -08:00
Tapani Pälli	0c983b9094	anv: remove anv_gem_set_context_priority helper anv_gem_set_context_param is to be used directly instead! Fixes: `6d8ab53303` "anv: implement VK_EXT_global_priority extension" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 19:50:54 +02:00
George Kyriazis	a01d5e3712	swr/rast: revert clip distance precision Fixes piglit tests that broke with `8a64593bde` Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:50 -06:00
George Kyriazis	7e813f6214	swr/rast: Faster frustum prim culling Fix clipper validMask setting. We don't need to run frustum rejected primitives through the clipper. Perform frustum culling with only frustum clip codes. Guardband clip codes cannot be used because they overlap frustum codes. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:46 -06:00
George Kyriazis	1c73f42e6e	swr/rast: Consolidate TRANSLATE_ADDRESS Translate is now part of an overloaded LOAD call which required a change to the code gen to skip the load functions in order to handle them manually to make them virtual. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:41 -06:00
George Kyriazis	e2a4fd0761	swr/rast: Code generation cleanup Generate more compact code from gen_llvm.hpp. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:37 -06:00
George Kyriazis	190ead3d79	swr/rast: Remove draw type from event definitions - Have the draw type sent to DrawInfoEvent in handlers created in archrast.cpp. The draw type no longer needs to be sent during the AR_API_EVENT() call in api.cpp. - Remove draw type from event definitions in events_private.proto, no longer needed Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:32 -06:00
George Kyriazis	90e3e23f63	swr/rast: whitespace change Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:28 -06:00
George Kyriazis	539de78633	swr/rast: Fix index buffer overfetch issue for non-indexed draws Populate pLastIndex, even for the non-indexed case. A zero pLastIndex can cause the index offsets inside the fetcher to have nonsensical values that can be either very large positive or very large negative numbers. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:19 -06:00
Roland Scheidegger	26103487b5	softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS We were setting view to NULL if the iteration was larger than i. But in fact if the view is NULL the code did nothing anyway... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-28 18:22:28 +01:00
Roland Scheidegger	b923f21eaa	cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy There's no point, we know the highest non-null one. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-28 18:22:28 +01:00
Roland Scheidegger	89ae5def8c	draw: don't needlessly iterate through all sampler view slots We already stored the highest (potentially) used number. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-28 18:22:28 +01:00
Tapani Pälli	6d8ab53303	anv: implement VK_EXT_global_priority extension v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris) use unreachable with unknown priority (Samuel) v3: add stubs in gem_stubs.c (Emil) use priority defines from gen_defines.h v4: cleanup, add anv_gem_set_context_param (Jason) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (v2) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v3) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 14:36:57 +02:00
Tapani Pälli	5960023cf4	i965: use context priority definitions from gen_defines.h Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-28 14:36:57 +02:00
Tapani Pälli	4449a1f80d	intel: add new common header gen_defines.h Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-28 14:36:57 +02:00
Christian König	33633690aa	winsys/amdgpu: request high addresses We now have hopefully fixed all bugs regarding high addresses on Vega10 and Raven. Start to use the high range to make room for SVM in the low range. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 13:30:32 +01:00
Samuel Pitoiset	639c4f2b54	ac/shader: move scanning some info about input PS declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-28 10:14:26 +01:00
Samuel Iglesias Gonsálvez	e207b2e2c8	glsl/linker: fix bug when checking precision qualifier According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders" section, the precision qualifier should match for uniform variables. This also applies to previous GLSL ES 3.x specs. This 'if' checks the condition for uniform variables, while for UBOs it is checked in link_interface_blocks.cpp. Fixes: `b50b82b8a5` ("glsl/es31: precision qualifier doesn't need to match in shader interface block members") Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-28 07:04:13 +01:00
Samuel Iglesias Gonsálvez	c757c9dc03	anv: set maxResourceSize to the respective value for each generation v2: - Add the proper values to gen9+ (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 06:54:48 +01:00
Dave Airlie	a5853a3333	r600: partly revert disabling tiling for 1d texture. Previously we had a check for 1d or narrow 2D textures; however, narrow 2d textures caused gpu hangs, but it was correct for 1d textures. This fixes a bunch of 1D image piglits for me. Fixes: `7b8e1c089d` (r600/texture: drop lowering 1d/2d images to linear.) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 04:59:37 +00:00
Timothy Arceri	0c1f37cc2d	nir: fix integer divide by zero crash during constant folding From the GLSL 4.60 spec Section 5.9 (Expressions): "Dividing by zero does not cause an exception but does result in an unspecified value." Fixes: `89285e4d47` "nir: add new constant folding infrastructure" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271	2018-02-28 15:55:39 +11:00
Ilia Mirkin	086c88551d	st/mesa: ensure that images don't try to reference non-existent levels Ideally the st_finalize_texture call would take care of that, but it doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This assertion makes sure that no such values are passed to the driver. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-27 22:38:33 -05:00
Dave Airlie	c7b25005a1	ac/radv: move load base vertex abi setup to vertex shader. This was segfaulting: dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024 Fixes: `8de6f79707` (ac/radeonsi: add load_base_vertex() to the abi) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:58:12 +10:00
Dave Airlie	3401b028df	ac/shader: fix vertex input with components. This fixes: dEQP-VK.glsl.440.linkage.varying.component.* Fixes: `1c57a6da5e` (ac/shader: scan vertex inputs usage mask) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:04:46 +10:00
Dave Airlie	6bafd4f4dd	radv: remove device pointer from buffer. This is never used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:03:26 +10:00
Timothy Arceri	a050ea60ee	nir: add lower_ldexp to nir compiler options Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	08fa84bb9a	ac: implement nir_op_ldexp Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	9790921ff5	ac: fix nir_op_fdd{x,y} handling radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation decide how it should be handled and radv was previously going for the higher quality option. Here we change the shared amd code to match how nir_op_fdd{x,y} is expected to be handled by the other NIR drivers. Fixes piglit test: ./bin/arb_shader_texture_lod-texgrad -auto Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	8de6f79707	ac/radeonsi: add load_base_vertex() to the abi Fixes the following piglit tests: ./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo ./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	7f91473414	radeonsi: create get_base_vertex() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	ae47af50d6	radeonsi/nir: disable vertex_id_zero_based lowering The lowering is incompatible with how the radeonsi backend works. Fixes piglit test: ./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	5504bebfc4	ac: add support for handling nir_intrinsic_load_vertex_id This will be used by radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	3a0b4187dd	ac: fix f2b and i2b for doubles Without this llvm was asserting in debug builds. V2: use LLVMConstNull() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-28 09:23:49 +11:00
Francisco Jerez	cb309d27c5	intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact. test_fuzz_compact_instruction() was attempting to modify the uint64_t data array of a brw_inst through a pointer to uint32_t, which has undefined behavior. This was causing the test_eu_compact unit test to fail mysteriously for me on GCC 7 with some additional harmless-looking changes I had applied to my tree, which happened to affect the order instructions are emitted by GCC causing the bit twiddling to be done after the clear_pad_bits() call which is supposed to overwrite the same data through a pointer of different type, leading to data corruption. A similar failure has been reported by Vinson Lee on the master branch built with GCC 8. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-27 11:42:39 -08:00
Francisco Jerez	69b4a9d21d	util/bitset: Make C++ wrapper trivially constructible. In order to fix a build failure on compilers not implementing unrestricted unions, which is a C++11 feature. v2: Provide signed integer comparison and assignment operators instead of BITSET_WORD ones to avoid spurious ambiguity warnings on comparisons with a signed integer literal. Fixes: `ba79a90fb5` "glsl: Switch ast_type_qualifier to a 128-bit bitset." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238 Tested-by: Roland Scheidegger <sroland@vmware.com> Tested-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-27 11:38:18 -08:00
Jordan Justen	9f223d860b	intel/tools: Use gen_device_name_to_pci_device_id in aubinator Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	8ff89250ff	intel/common: Add gen_device_name_to_pci_device_id Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	c2134f94c8	intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	843f6d187a	i965: Use gen_get_pci_device_id_override Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	e560bb9dc2	intel/common: Add gen_get_pci_device_id_override Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	6b274d5cc6	intel/vulkan: Support INTEL_NO_HW environment variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Harish Krupo	b9af043716	android: fix source files path for libmesa_anv_gen11 Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-27 14:16:08 +02:00
Eric Engestrom	248c593132	meson: avoid changing types for the dri3 option Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-27 11:21:20 +00:00
Eric Engestrom	76e8d61999	meson: simplify the gbm option code, and avoid changing types v2: drop gallium comment (Dylan) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-27 11:21:20 +00:00
Samuel Pitoiset	a549da877b	ac/nir: clean up a hack about rounding 2nd coord component It's basically just the opposite, and it only makes sense to round the layer for 2D texture arrays. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-27 10:09:27 +01:00
Ilia Mirkin	e683a797c6	nvc0: collapse output slots to have adjacent registers The hardware skips over unallocated slots, so we have to make sure those registers are packed together. Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-27 00:10:39 -05:00
Dave Airlie	250468f6b7	radv: expose async compute on SI It looks like we had all the pieces in place for this, just never tested it and turned it on. I don't see any CTS regressions and the computeshader demo runs. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Dave Airlie	1fc19a0f27	radv: merge tess rings into a single bo Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Emil Velikov	784d81e97e	docs: update calendar, add news and link release notes to 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-27 00:32:14 +00:00
Emil Velikov	d9391014de	docs: add sha256 checksums for 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b00880973e`)	2018-02-27 00:29:44 +00:00
Emil Velikov	676c58fbdb	docs: add release notes for 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b3e5a3f35b`)	2018-02-27 00:29:43 +00:00
Dylan Baker	b9636fe38a	meson: fix building without GL libgl will be undefined _glx, so move that check inside the `if with_glx != 'disabled'` block. v2: - Simplify commit message (Eric, Emil) Fixes: `5c460337fd` ("meson: Fix GL and EGL pkg-config files with glvnd") Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> CC: Daniel Stone <daniels@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Untested-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-26 09:32:14 -08:00
Lionel Landwerlin	fca9f5b585	intel: aubinator_error_decode: fix segfault on missing register Some registers might be missing in our genxmls. Don't try to decode them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-26 16:54:48 +00:00
Eric Engestrom	11d45304fd	*-symbol-check: use correct `nm` path when cross-compiling Inspired-by: a similar patch for libdrm by Heiko Becker Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-26 13:50:59 +00:00
Karol Herbst	ef308d4007	nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107 Currently, while inserting barriers, writes and reads to FILE_FLAGS aren't considered. This can lead to WaR hazards in some situations. Together with the previous commit, this fixes shaders with instructions like this: mad u32 $r2 $r4 $r11 $r2 mad u32 { $r5 $c0 } $r4 $r10 $r6 mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0 Affects OpenCL CTS tests on Maxwell+: basic/test_basic intmath_long basic/test_basic intmath_long2 basic/test_basic intmath_long4 v2: only put barriers on instructions which actually read flags Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Karol Herbst	2f07f823c9	nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse In the sched data calculator we have to track first use of defs by iterating over all defs of an instruction, not just the first one. v2: fix minGRP and maxGRP values Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Samuel Pitoiset	e05507a427	ac/nir: use ordered float comparisons except for not equal Original patch from Timothy Arceri, I have just fixed the not equal case locally. This fixes one important rendering issue in Wolfenstein 2 (the cutscene transition issue). RadeonSI uses the same ordered comparisons, so I guess that's what we should do as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-02-26 13:59:04 +01:00
Mauro Rossi	6451b0703f	android: vulkan/util: add dependency on libnativewindow for O and later Similar to `90dd6e5` ("Android: egl: add dependency on libnativewindow") Fixes the following building error: In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-02-26 14:50:24 +02:00
Mauro Rossi	d448954228	android: anv: add dependency on libnativewindow for O and later Similar to `90dd6e5` ("Android: egl: add dependency on libnativewindow") Fixes the following building errors: In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. ... In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-02-26 14:49:06 +02:00
Mauro Rossi	9a508b719b	android: anv/extensions: fix generated sources build Building rules are aligned to automake ones. The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py. Generation rules for anv_extensions.c require the --out-c option. Generation rules for anv_extensions.h were missing. Necessary include paths are added to avoid the following build errors: cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c': No such file or directory In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-26 14:37:33 +02:00
Marek Olšák	8799eaed99	radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers The effect of the last 13 commits on user SGPR counts: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:19 +01:00
Marek Olšák	3fa7a59d69	radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors. This also removes the pointer from merged TES-GS where it's useless. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:08 +01:00
Marek Olšák	c78640ce31	radeonsi: set correct num_input_sgprs for VS prolog in merged shaders We need to take num_input_sgprs from VS, not the second shader. No apps suffered from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:05 +01:00
Marek Olšák	f852b24ce0	radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:03 +01:00
Marek Olšák	8d6e6b1d7c	radeonsi: don't use struct si_descriptors for vertex buffer descriptors VBO descriptor code will change a lot one day. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:00 +01:00
Daniel Stone	61d6ff3ba3	build: Move wayland-scanner check into platform Also only check for wayland-scanner if building for the Wayland platform. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:19 +00:00
Daniel Stone	d33cd875e8	build: Move wayland-protocols check into platform In line with wayland-client and wayland-server, move the check for wayland-protocols into the wayland platform branch. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:16 +00:00
Daniel Stone	d8f19d9aa0	vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES autotools wants to have the BUILT_SOURCES ready as soon as it enters the directory, even if they are not used. This meant the build failed if wayland-protocols was not available on the system, even if it was not enabled. As BUILT_SOURCES cannot be used in a conditional (cf. `166852ee95`), do the same thing as EGL and manually encode the dependencies in the Makefile. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:12 +00:00
Dave Airlie	0cc5be7741	r600: fix tgsi clock last setting On cayman this was hitting an assert later, which probably wasn't seen on non-cayman due to having the t slot. Fixes: `9041730d1` (r600: add support for ARB_shader_clock.)	2018-02-26 11:05:45 +10:00
Dave Airlie	4d72a1efea	r600: add time lo/hi debugging output. This just adds these to the debug prints.	2018-02-26 11:05:26 +10:00
Timothy Arceri	22430224fe	radeonsi/nir: enable lowering of fpow Lowering fpow in NIR rather than LLVM can be beneficial. Polaris results: Totals from affected shaders: SGPRS: 124928 -> 124896 (-0.03 %) VGPRS: 68616 -> 68332 (-0.41 %) Spilled SGPRs: 394 -> 413 (4.82 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3668912 -> 3658368 (-0.29 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 18575 -> 18593 (0.10 %) Wait states: 0 -> 0 (0.00 %) Fixes: `d6b7539206` "ac/nir: remove emission of nir_op_fpow" Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	9873bd9dcd	ac: make use of ac_get_llvm_num_components() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	1a757c9c97	gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info Seems to have not been used since `16be87c904` Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	9f7c940840	radeonsi/nir: fix loading of doubles for tess varyings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	81f9d03807	radeonsi/nir: fix lds store in tcs outputs handling We were ignoring the channel offset. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Gert Wollny	c7cadcbda4	r600: Take ALU_EXTENDED into account when evaluating jump offsets ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU clause within an IF-JUMP or ELSE branch is ALU_EXTENDED the target jump offset needs to be adjusted accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-26 10:29:48 +10:00
Francisco Jerez	51562ea7a0	mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	c6c64d4d6a	glsl: Silence warnings when reading from a framebuffer fetch output. Framebuffer fetch outputs are implicitly initialized upon entry to the fragment shader. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	537bb1da98	glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced(). This requires passing an extra argument to the lowering pass because the KHR_blend_equation_advanced specification doesn't seem to define any mechanism for the implementation to determine at compile-time whether coherent blending can ever be used (not even an "#extension KHR_blend_equation_advanced_coherent" directive seems to be required in the shader source AFAICT). In the long run we'll probably want to do state-dependent recompiles based on the value of ctx->Color.BlendCoherent, but right now there would be no benefit from that because the only driver that supports coherent framebuffer fetch is i965 on SKL+ hardware, which are unable to support the non-coherent path for the moment because of texture layout issues, so framebuffer fetch coherency is always enabled for them. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ef9e3f63ca	glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier. This allows the application to request framebuffer fetch coherency with per-fragment output granularity. Coherent framebuffer fetch outputs (which is the default if no qualifier is present for compatibility with older versions of the EXT_shader_framebuffer_fetch extension) will have ir_variable_data::memory_coherent set to true. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	0aeec504b4	glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent. EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers even on GL(ES) 2. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	1bc01db95f	glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2. At the same point where it is initialized on GL(ES) 3.0+ so we can implement some common layout qualifier handling in a future commit. Until now the fb_fetch_output flag would be inherited from the original implicit gl_LastFragData declaration at a later point in the AST to GLSL IR translation. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6ebefb0fd5	glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ba79a90fb5	glsl: Switch ast_type_qualifier to a 128-bit bitset. This should end the drought of bits in the ast_type_qualifier object. The bitset_t type works pretty much as a drop-in replacement for the current uint64_t bitset. The only catch is that the bitset_t type as defined in the previous commit doesn't have a trivial constructor (because it has a user-defined constructor), so it cannot be used as union member without providing a user-defined constructor for the union (which causes it in turn to be non-trivially constructible). This annoyance could be easily addressed in C++11 by declaring the default constructor of bitset_t to be the implicitly defined one -- IMO one more reason to drop support for GCC 4.2-4.3. The other minor change was required because glsl_parser_extras.cpp was hard-coding the type of bitset temporaries as uint64_t, which (unlike would have been the case if the uint64_t had been replaced with e.g. an __int128) would otherwise have caused a build failure, because the boolean conversion operator of bitset_t is marked explicit (if C++11 is available), so the bitset won't be silently truncated down to 1 bit in order to use it to initialize the uint64_t temporaries (yikes). Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	bdbc2ffa42	util/bitset: Add C++ wrapper for static-size bitsets. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	8d1f1ce412	util: Add EXPLICIT_CONVERSION macro. This can be used to specify that a C++ conversion operator is not meant to be used for implicit conversions, which can lead to unintended loss of information in some cases. Implemented as a macro in order to keep old GCC versions happy. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	378e918e28	mesa: Implement glFramebufferFetchBarrierEXT entry point. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	e4124f9bc1	glapi: Update XML for last revision of EXT_shader_framebuffer_fetch. Desktop GL is now supported, and there is an additional entry-point for EXT_shader_framebuffer_fetch_non_coherent. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6a8ec78c2a	mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT. The changes I had originally planned for the MESA_shader_framebuffer_fetch extension have been merged into the EXT spec, there's no point in keeping MESA_shader_framebuffer_fetch extension enables. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	d0bef79f12	mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec. This GL entry point was renamed to glFramebufferFetchBarrier() in the EXT extension on request from Khronos members. Update the Mesa codebase to match the latest spec. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	27c829da28	i965: Fix KHR_blend_equation_advanced with some render targets. This reverts two bogus and seemingly useless changes from the commits referenced below, which broke KHR_blend_equation_advanced (and EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet) for any kind of render target surface that would cause the get_isl_surf() call in brw_emit_surface_state() to do anything useful (notice how the result of get_isl_surf() is completely ignored by the caller right now), as was the case while using those extensions with 1D array or 3D framebuffers in particular. Fixes: `f5859b45b1` "i965/miptree: Switch remaining surfaces to isl" Fixes: `bf24c3539e` "i965/miptree: Clean-up unused" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Marek Olšák	fb410ae392	radeonsi: remove si_descriptors parameter from emit_shader_pointer functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	63ea0a00a3	radeonsi: preload the tess offchip ring in TES so that it's not done multiple times in branches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	2d03c4cac8	radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	190e064e63	radeonsi: move 2nd-shader descriptor pointers into s[0:1] If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS. If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	1d1df76d2b	radeonsi: change si_descriptors::shader_userdata_offset type to short We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	41895c26d3	radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits For a later patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Karol Herbst	f0b39779a0	nvir: don't optimize mad with subops to shladd Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-24 18:48:13 +01:00
James Legg	afd8fd0656	radv: Really use correct HTILE expanded words. When transitioning to an htile compressed depth format, set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] `5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)` CC: Dave Airlie <airlied@redhat.com> CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `5158603182` "radv: Use correct HTILE expanded words."	2018-02-24 02:16:22 +01:00
Mauro Rossi	8eed942136	radv/extensions: fix c_vk_version for patch == None Similar to `cb0d1ba156` ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) \| ((minor) << 12) \| (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `e72ad05c1d` ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-24 00:31:31 +01:00
Eric Anholt	b4b4ada761	broadcom/vc5: Fix layout of 3D textures. Cube maps are entire miptrees repeated, while 3D textures have all the layers of each level next to each other. Fixes tex3d and tex-miplevel-selection GL2:texture() 3D.	2018-02-23 15:07:26 -08:00
Eric Anholt	97dc077303	broadcom/vc5: Ignore unused usage flags in is_format_supported. Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to match. Just drop the whole retval == usage thing and return early when we hit a known unsupported case. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 15:07:18 -08:00
Eric Anholt	880573e737	gbm: Fix the alpha masks in the GBM format table. Once GBM started looking at the values of the alpha masks, ARGB/ABGR wouldn't match any more because we had both A and R in the low bits. Fixes: `2ed344645d` ("gbm/dri: Add RGBA masks to GBM format table") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 15:03:36 -08:00
Mathias Fröhlich	b54bf0e3e3	mesa: Update vertex processing mode on _mesa_UseProgram. The change is a bug fix for `92d76a169`: mesa: Provide an alternative to get_vp_mode() that actually got exposed through `4562a7b0`: vbo: Make use of _DrawVAO from the dlist code. Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229 Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 21:08:35 +01:00
Marek Olšák	d169438d8e	mesa: rename has_core_gs -> has_gs in get_programiv This is also true for GLES. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:23 +01:00
Marek Olšák	1881f41b6c	mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl This is more accurate with respect to the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:22 +01:00
Marek Olšák	1defc973db	mesa: add some of missing compatibility support for ARB_bindless_texture The extension is exposed in the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:20 +01:00
Marek Olšák	b8e2e9e1a1	mesa: expose ARB_enhanced_layouts in the compatibility profile GLSL 1.40 is required. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:19 +01:00
Marek Olšák	a0c8b49284	mesa: enable OpenGL 3.1 with ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:17 +01:00
Marek Olšák	605a7f6db5	mesa: implement ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:15 +01:00
Emil Velikov	14a2c87c41	swr: remove dead LLVM code paths LLVM requirement was bumped to 4.0.0 with earlier commit. Hence any code tailored for older versions is now unreachable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-02-23 19:17:31 +00:00
Eric Anholt	5980a41c0f	broadcom/vc4: Remove the retval==usage check in is_format_supported(). This got us into trouble recently, so just remove it entirely.	2018-02-23 08:42:13 -08:00
Eric Anholt	bc3d16e633	broadcom/vc4: Add support for YUV textures using unaccelerated blits. Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.	2018-02-23 08:42:13 -08:00
Eric Anholt	c824a045ea	broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows. When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.	2018-02-23 08:42:13 -08:00
Eric Anholt	6deb158ec1	broadcom/vc4: Add pipe_reference debugging for vc4_bos. Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.	2018-02-23 08:42:13 -08:00
Eric Anholt	34ea1aca92	broadcom/vc4: Remove dead vc4_bo_set_reference(). It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.	2018-02-23 08:42:13 -08:00
Eric Anholt	a49738290c	broadcom/vc4: Use pipe_resource_reference in sampler views. Improves u_debug_refcount output.	2018-02-23 08:42:13 -08:00
Eric Anholt	0c1dd9dee0	broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride. This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.	2018-02-23 08:42:13 -08:00
Eric Anholt	978b884afc	broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported(). We were failing the retval == usage check at the end. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 08:42:13 -08:00
Lucas Stach	8df11f3fad	etnaviv: fix in-place resolve tile count TS tiles map to a fixed amount of bytes in the color/depth surface, so the blocksize of the format needs to be taken into account when calculating the number of tiles to fill. The simplest fix is to just use the layer stride, which is the surface size in bytes. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	add23b59c9	etnaviv: switch magic single buffer state to "3" Some of the 16bit formats misrender with missing tiles with the current "2" state. As all the previously working formats also work with the "3" state, just always use that one. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	8befc11186	etnaviv: add debug switch to disable single buffer feature This feature has caused some trouble already. Add a debug switch to allow users to quickly check if a specific issue is caused by this feature. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-23 15:34:31 +01:00
Dylan Baker	5c460337fd	meson: Fix GL and EGL pkg-config files with glvnd Currently meson will generate a pkg-config that links to EGL_mesa (or GLX_mesa), but this isn't correct, it should always link to EGL or GL. Probably the "right" solution is to have glvnd itself provide the pkg config files for GL and EGL, but that also means that glvnd needs to provide many of the header files, which makes it a more involved job. Fixes: `a47c525f32` ("meson: build glx") Fixes: `035ec7a2bb` ("meson: Add support for EGL glvnd") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 13:30:28 +00:00
Frank Binns	6160bf97db	egl/dri2: fix segfault when display initialisation fails dri2_display_destroy() is called when platform specific display initialisation fails. However, this would typically lead to a segfault due to the dri2_egl_display vtbl not having been set up. Fixes: `2db9548296` ("loader_dri3/glx/egl: Optionally use a blit context for blitting operations") Signed-off-by: Frank Binns <francisbinns@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero	e1623b303c	mesa: add missing RGB9_E5 format in _mesa_base_fbo_format RGB9_E5 should be accepted by RenderbufferStorage if the EXT_texture_shared_exponent is exposed. It is left to the implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT when checking the framebuffer completeness if they do not support rendering in this format. Discussed in: https://github.com/KhronosGroup/OpenGL-API/issues/32 This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5 v2: Added more info to the commit message (Antia) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2018-02-23 10:12:06 +01:00
Christian Gmeiner	e72062b66d	etnaviv: npot_tex_any_wrap needs one bit only Reduces size of struct etna_specs from 100 to 94 bytes. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 09:38:16 +01:00
Mathias Fröhlich	4562a7b0e8	vbo: Make use of _DrawVAO from the dlist code. Finally use an internal VAO to execute display list draws. Avoid duplicate state validation for display list draws. Remove client arrays previously used exclusively for display lists. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:14 +01:00
Mathias Fröhlich	2f35140846	mesa: Use atomics for shared VAO reference counts. VAOs will be used in the next change as immutable object across multiple contexts. Only reference counting may write concurrently on the VAO. So, make the reference count thread safe for those and only those VAO objects. v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:11 +01:00
Mathias Fröhlich	8a3a4b6fae	vbo: Make use of _DrawVAO from immediate mode draw Finally use an internal VAO to execute immediate mode draws. Avoid duplicate state validation for immediate mode draws. Remove client arrays previously used exclusively for immediate mode draws. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:07 +01:00
Mathias Fröhlich	c757e416ce	vbo: Implement tool functions for vbo specific VAO setup. Correct VBO_MATERIAL_SHIFT value. The functions will be used next in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:04 +01:00
Mathias Fröhlich	ef8028017d	mesa: Add flush_vertices to _mesa_bind_vertex_buffer. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:01 +01:00
Mathias Fröhlich	354b76ad20	mesa: Make _mesa_vertex_attrib_binding public. Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a flush_vertices argument, and make it publicly available. The function will be needed later in the series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:58 +01:00
Mathias Fröhlich	4331969ac4	mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:55 +01:00
Mathias Fröhlich	195bb990ed	vbo: Use _DrawVAO for array type draw commands. Switch over to use the _DrawVAO for all the array type draws. The _DrawVAO needs to be set before we enter _mesa_update_state, so move setting the draw method in front of the first call to _mesa_update_state which is in turn called from the validateDraw* calls. Using the gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode and gl_vertex_array_object::_AttributeMapMode we can already set varying_vp_inputs before we call _mesa_update_state the first time. Thus remove duplicate state validation. v2: Update comments. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:50 +01:00
Mathias Fröhlich	6002ab564b	vbo: Implement method to track the inputs array. Provided the _DrawVAO and the derived state that is maintained if we have the _DrawVAO set, implement a method to incrementally update the array of gl_vertex_array input pointers. v2: Add some more comments. Rename _vbo_array_init to _vbo_init_inputs. Rename vbo_context::arrays to vbo_context::draw_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:46 +01:00
Mathias Fröhlich	08c7474189	mesa: Introduce a yet unused _DrawVAO. During the patch series this VAO gets populated with either the currently bound VAO or an internal VAO that will be used for immediate mode and dlist rendering. v2: More comments about the _DrawVAO, filter and enabled mask. Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs. v3: Fix and move comment. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:43 +01:00
Mathias Fröhlich	ce3d2421a0	vbo: Remove get_vp_mode() and enum vp_mode. Is now unused. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:40 +01:00
Mathias Fröhlich	60c3ca1b23	vbo: Use _VPMode instead of get_vp_mode(). At those places where we used get_vp_mode() use gl_vertex_program_state::_VPMode instead. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:36 +01:00
Mathias Fröhlich	92d76a1691	mesa: Provide an alternative to get_vp_mode() To get information equivalent to get_vp_mode(), track the vertex processing mode in a per context variable at gl_vertex_program_state::_VPMode. This aims to replace get_vp_mode() as seen in the vbo module. But unlike the get_vp_mode() implementation, which only gives correct answers after calling _mesa_update_state(), this context variable is tracked immediately when the vertex processing state is modified. The correctness of this value is asserted on state validation. With this in place we should be able to untangle the dependency with varying_vp_inputs and state invalidation. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:30 +01:00
Ilia Mirkin	d73f1f2ad8	nv50,nvc0: fix integer MS resolves using 2d engine We don't want filtering for integer textures, same as depth/stencil. Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	33ce3569c5	nvc0: fix writing query results into buffer We need to mark the range as valid, and validate the resource using a helper to ensure that the buffer status is marked properly. Fixes some CTS pipeline stats query tests, and KHR-GL45.direct_state_access.queries_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	f6e4f95668	nv50,nvc0: fix clear buffer acceleration Two things were off: - valid range was not updated, which could affect waiting for future maps - fencing was done manually instead of using the *_resource_validate helper, which resulted in a missed dirty buffer flag being set Fixes: KHR-GL45.direct_state_access.buffers_clear Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Lionel Landwerlin	bd9672695b	i965: perf: ensure reading config IDs from sysfs isn't interrupted Fixes: `458468c136` "i965: Expose OA counters via INTEL_performance_query" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen	032870beda	radv: Fix autotools build. Somewhere along the way the Makefile changes got lost ... Fixes: `4db78f3a6b` "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <airlied@redhat.com>	2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	414f5e0e14	radv: Reword radv_entrypoints_gen.py With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Jose Fonseca	1f5618e81c	appveyor: Build with MSVC 2015. The MSVC version we (at VMware) primarily care about from now on is 2015. See https://ci.appveyor.com/project/jrfonseca/mesa/build/46 We can drop support for building with 2013 in a future commit. I'm not aware of significant changes in C99/C11 support from MSVC 2013 to 2015, but there's no point in continuing supporting old MSVC versions when nobody cares. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-22 21:10:20 +00:00
Samuel Pitoiset	d6b7539206	ac/nir: remove emission of nir_op_fpow fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:44:46 +01:00
Samuel Pitoiset	7aa008d1d7	radv: enable lowering of fpow to fexp2 and flog2 There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:47 +01:00
Samuel Pitoiset	63fb30c674	nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a) Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:45 +01:00
Samuel Pitoiset	b18997876f	nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b)) Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:43 +01:00
Samuel Pitoiset	a01e9996b5	ac/nir: set GLC=1 for load/store of coherent/volatile images This disables persistence across wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:55 +01:00
Samuel Pitoiset	3c40be126f	spirv: apply memory qualifiers to images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:53 +01:00
Chuck Atkins	540e49e105	glx: Properly handle cases where screen creation fails This fixes a segfault exposed by `a29d63ecf7` which occurs when swr is used on an unsupported architecture. v2: re-work to place logic in xmesa_init_display Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: mesa-stable@lists.freedesktop.org Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-22 10:20:32 -05:00
Iago Toral Quiroga	7668b594e6	anv/blorp: multisample resolve all attachment layers We were only resolving the first. v2: - Do not require that the number of layers on dst and src are an exact match, it is okay if the dst has more layers so long as it has at least the same that we are going to resolve. - Do not always resolve array_len layers, we should resolve only from base_array_layer to array_len. v3: - v2 was assuming that array_len represented the total number of layers in the image, but it represents the number of layers starting at the base array layer. v4: - The number of layers to resolve should be taken from the framebuffer (Nanley). Fixes new CTS tests for multisampled layered rendering: dEQP-VK.renderpass.multisample_resolve.layers_* Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-22 08:23:39 +01:00
Jason Ekstrand	2dce4ac6ac	intel/isl: Improve the documentation on get_default_aux_state Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	24952160fd	i965: Use finish_external instead of make_shareable in setTexBuffer2 The setTexBuffer2 hook from GLX is used to implement glXBindTexImageEXT which has tighter restrictions than just "it's shared". In particular, it says that any rendering to the image while it is bound causes the contents to become undefined. The GLX_EXT_texture_from_pixmap extension provides us with an acquire and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT. The extension spec says, "Rendering to the drawable while it is bound to a texture will leave the contents of the texture in an undefined state. However, no synchronization between rendering and texturing is done by GLX. It is the application's responsibility to implement any synchronization required." From the EGL 1.4 spec for eglBindTexImage: "After eglBindTexImage is called, the specified surface is no longer available for reading or writing. Any read operation, such as glReadPixels or eglCopyBuffers, which reads values from any of the surface’s color buffers or ancillary buffers will produce indeterminate results. In addition, draw operations that are done to the surface before its color buffer is released from the texture produce indeterminate results." In other words, between the bind and release calls, we effectively own those pixels and can assume, so long as we don't crash, that no one else is reading from/writing to the surface. The GLX and EGL implementations call the setTexBuffer2 and releaseTexBuffer function pointers that the driver can hook. In theory, this means that, between BindTexImage and ReleaseTexImage, we own the pixels and it should be safe to track aux usage so we can avoid redundant resolves so long as we start off with the right assumption at the start of the bind/release pair. In practice, however, X11 has slightly different expectations. It's expected that the server may be drawing to the image at the same time as the compositor is texturing from it. In that case, the worst expected outcome should be tearing or partial rendering and not random corruption like we see when rendering races with scanout with CCS. Fortunately, the GEM rules about texture/render dependencies save us here. If X11 submits work to write to a pixmap after the compositor has submitted work to texture from it, GEM inserts a dependency between the compositor and X11. If X11 is using a high-priority context, this will cause the compositor to get a temporarily boosted priority while the batch from X11 is waiting on it. This means that we will never have an actual race between X11 and the compositor so no corruption can happen. Unfortunately, however, this means that X11 will likely be rendering to it between the compositor's BindTexImage and ReleaseTexImage calls. If we want to avoid strange issues, we need to be a bit careful about resolves because we can't really transition it away from the "default" aux usage. The only case where this would practically be a problem is with image_load_store where we have to do a full resolve in order to use the image via the data port. Even there it would only be a problem if batches were split such that X11's rendering happens between the resolve and the use of it as a storage image. However, the chances of this happening are very slim so we just emit a warning and hope for the best. This commit adds a new helper intel_miptree_finish_external which resets all aux state to whatever ISL says is the right worst-case "default" for the given modifier. It feels a little awkward to call it "finish" because it's actually an acquire from the perspective of the driver, but it matches the semantics of the other prepare/finish functions. This new helper gets called in intelSetTexBuffer2 instead of make_shareable. We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer before) and call intel_miptree_prepare_external in it. This probably does nothing most of the time but it means that the prepare/finish calls are properly matched. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	00926a2730	i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2 The old code made a new miptree that referenced the same BO as the renderbuffer and just trusted in the memory aliasing to work. There are only two ways in which the new miptree is liable to differ from the one in the renderbuffer and neither of them matter: 1) It may have a different target. The only targets that we can ever see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE and the difference between the two doesn't matter as far as the miptree is concerned; genX(update_sampler_state) only looks at the gl_texture_object and not the miptree when determining whether or not to use normalized coordinates. 2) It may have a very slightly different format. Again, this doesn't matter because we've supported texture views for quite some time so we always look at the gl_texture_object format instead of the miptree format for hardware setup anyway. On the other hand, because we were recreating the miptree, we were using intel_miptree_create_for_bo which doesn't understand modifiers. We really want this function to work without doing a resolve so long as you have modifiers so we need to fix that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	41d45eb21e	i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	344b57b10b	i965/miptree: Loosen the format check in miptree_match_image This function is used to determine when we need to re-allocate a miptree. Since we do nothing different in miptree allocation for sRGB vs. linear, loosening this should be safe and may lead to less copying and reallocating in some odd cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	5b1b710e6f	i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2 We're about to start letting the intel_obj->_Format be the "real" texture format. For depth/stencil textures, this may be a combined depth stencil format. For ETC2 on gen7 and earlier, this will be the actual ETC2 format. This makes a bit more GL sense but means we have to be careful in state upload. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Kenneth Graunke	183ce5e629	glsl: Parse 'layout' as a token with advanced blending or bindless Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-21 17:50:57 -08:00
Daniel Stone	c7e22483fe	vulkan/wsi/x11: Consistently update and return swapchain status Use a helper function for updating the swapchain status. This will be used later to handle VK_SUBOPTIMAL_KHR, where we need to make a non-error status stick to the swapchain until recreation. Instead of direct comparisons to VK_SUCCESS to check for error, test for negative numbers meaning an error status, and positive numbers indicating non-error statuses. v2 (Jason Ekstrand): - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)" - Handle wsi_queue_pull returning VK_TIMEOUT - Call x11_swapchain_result in x11_present_to_x11 Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	6937c61324	vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails This most likely means we lost our connection to the X server so OUT_OF_DATE is reasonable. This was also the one case where we pushed a UINT32_MAX into the queue without setting an error condition. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	bfa22266cd	vulkan/wsi/wayland: Add support for zwp_dmabuf zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer modifiers. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	c757fd2852	anv/image: Add support for modifiers for WSI This adds support for the modifiers portion of the WSI "extension". Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	adca1e4a92	anv/image: Separate modifiers from legacy scanout For a bit there, we had a bug in i965 where it ignored the tiling of the modifier and used the one from the BO instead. At one point, we thought this was best fixed by setting a tiling from Vulkan. However, we've decided that i965 was just doing the wrong thing and have fixed it as of `5048572352`. The old assumptions also affected the solution we used for legacy scanout in Vulkan. Instead of treating it specially, we just treated it like a modifier like we do in GL. This commit goes back to making it its own thing so that it's clear in the driver when we're using modifiers and when we're using legacy paths. v2 (Jason Ekstrand): - Rename legacy_scanout to needs_set_tiling Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	f5433e4d6c	vulkan/wsi: Add modifiers support to wsi_create_native_image This involves extending our fake extension a bit to allow for additional querying and passing of modifier information. The added bits are intended to look a lot like the draft of VK_EXT_image_drm_format_modifier. Once the extension gets finalized, we'll simply transition all of the structs used in wsi_common to the real extension structs. Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	55b27e1e5f	vulkan/wsi: Add drm_modifier member to wsi_image Not yet used anywhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Daniel Stone	61c3feb38d	vulkan/wsi: Add multiple planes to wsi_image Not currently used. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Timothy Arceri	cdeac00267	nir: remove old assert This was originally intended to make sure the remap location was not -1. However the code has changed a lot since then, the location is now never set to -1 and we also handle components meaning this old assert has been doing comparisons with the pointer to the array of component data. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183	2018-02-22 09:31:00 +11:00
Timothy Arceri	86098696fc	radeonsi/nir: collect more accurate output_usagemask Fixes assert in the glsl-1.50-gs-max-output-components piglit test. Note that the double handling will only work for doubles that don't take up multiple slots i.e. double and dvec2. However dual slot double handling is an existing bug which is made no worse by this patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	79dc94828a	radeonsi/nir: disable GLSL IR loop unrolling Delaying unrolling and allowing NIR to do it instead has been shown to result in better code in drivers such as i965. shader-db results appear to show the same is true for radeonsi. The other advantage is that using NIR unrolling improves compile times significantly. Totals from affected shaders: SGPRS: 9624 -> 10016 (4.07 %) VGPRS: 6800 -> 6464 (-4.94 %) Spilled SGPRs: 0 -> 2 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 359176 -> 332264 (-7.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1355 -> 1432 (5.68 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	e6269ffc2e	radeonsi/nir: fix tess varying loads for doubles Fixes the following piglit tests: tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Daniel Stone	eef890b7b1	x11/dri3: Store raw present completion mode The DRI3 drawable info struct currently stores a boolean for whether the last completed operation was a flip or not. As we need to track the full completion mode for handling suboptimal returns, change the 'flipping' field to the raw present completion mode from the server. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-21 21:57:38 +00:00
Daniel Stone	a6f1952814	x11/dri3: Don't open-code ARRAY_SIZE Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-21 21:57:38 +00:00
Jason Ekstrand	52056206e1	anv: Don't assert that stencil HiZ clears are single-slice It's true for depth HiZ clears because we only have HiZ on single-slice images right now. However, for stencil-only clears there is no such restriction. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-21 13:54:11 -08:00
Jason Ekstrand	7dd0f73fe1	anv: Only copy clear dwords if we're rendering to the first slice Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-02-21 12:47:17 -08:00
Marek Olšák	b494ed168c	radeonsi: don't flush when si_eliminate_fast_color_clear is no-op	2018-02-21 20:03:11 +01:00
Marek Olšák	5f55f4c59f	radeonsi: make texture_discard_cmask/eliminate functions non-static	2018-02-21 20:03:11 +01:00
James Zhu	81dd4a7637	radeonsi: enable uvd encode for HEVC main Enable UVD encode for HEVC main profile Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	b38b208ff8	radeonsi:create uvd hevc enc entry Add UVD hevc encode pipe video codec creation entry Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	e7d51e27ed	radeon/uvd:add uvd hevc enc functions Implement UVD hevc encode functions Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	2b86f5fa0b	radeon/uvd:add uvd hevc enc hw ib implementation Implement required IBs for UVD HEVC encode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	461508c15c	radeon/uvd:add uvd hevc enc hw interface header Add hevc encode hardware interface for UVD Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	c6acae22c8	winsys/amdgpu:add uvd hevc enc support in amdgpu cs Support UVD HEVC encode in amdgpu cs Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	f0ad908e79	amd/common:add uvd hevc enc support check in hw query Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-21 13:53:38 -05:00
Karol Herbst	7319311a50	nvir/nvc0: fix legalizing of ld unlock c0[0x10000] We have to increase the file index also for 0x10000 not just for values greater than 0x10000. Fixes: `37b67db6ae` Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-21 11:12:45 +01:00
Samuel Pitoiset	a6accad68f	ac/nir: add glsl_is_array_image() helper For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:51 +01:00
Samuel Pitoiset	ff83dfb364	ac/nir: set the DA field when performing atomics on 3D images This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:49 +01:00
Eric Anholt	afa7b2f199	i965: Fix compiler warning about write being undefined. This looks like it should be protected by the assume() about nr_color_regions, but my compiler warns anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	4636ce362d	glsl/tests: Fix a compiler warning about signed/unsigned loop comparison. Fixes: `d32956935e` ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	7075c084fc	loader: Fix compiler warnings about truncating the PCI ID path. My build was producing: ../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=] and we can avoid this careful calculation by just using asprintf (as we do elsewhere in the file). Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	1b313eedb5	glsl: Silence warnings in the uniform initializer test about 16-bit types They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: `59f458cd87` ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Jordan Justen	96fe36f7ac	i965: Enable disk shader cache by default Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-20 18:49:43 -08:00
Dave Airlie	baa0feb73d	radv: don't send num_tcs_input_cp to sgprs. We never use it in the shaders. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:36 +00:00
Dave Airlie	952222ddd4	radv/tess: don't need to look in constant for vertices_per_patch This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:28 +00:00
Dave Airlie	77fd1b9187	ac/radv: cleanup some tcs output values access Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:23 +00:00
Dave Airlie	0e6f0d400b	ac/radv: remove total_vertices variable This just removes an unneeded variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:19 +00:00
Dave Airlie	e9b9fb3616	ac/radv: don't mark tess inner as used if we don't use it. This just avoids marking it as a used output if we don't actually use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:15 +00:00
Dave Airlie	d5b2d7ed67	ac/nir: to integer the args to bcsel. dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw was hitting an llvm assert due to one value being an int and the other a float. This just casts both values to integer and fixes the test. Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-20 23:15:18 +00:00
Jason Ekstrand	c66fb12117	anv/blorp: Use layout_to_aux_usage when a layout is provided Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me something reasonable" we now use anv_layout_to_aux_usage whenever a layout is available. If a layout is available, we ignore the aux_usage parameter. For the cases where we have an explicit aux usage such as clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:17 -08:00
Jason Ekstrand	0fa040e6f5	anv/cmd_buffer: Delete some assert-only variables Checking the sample count is almost as good as aux usage in this case. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:16 -08:00
Jason Ekstrand	e10a62662b	anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:14 -08:00
Jason Ekstrand	7ea8131aa0	anv/cmd_buffer: Simplify transition_depth_buffer If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for both layouts. If the two layouts are the same, they will get the aux usage. In either case, the code below will give us ISL_AUX_OP_NONE and we'll return without doing anything. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:09 -08:00
Jason Ekstrand	87e86ee2e6	anv/cmd_buffer: Do subpass image transitions in begin/end_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	7d5f6b6088	anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	8a3f086a42	anv/cmd_buffer: Sync clear values in begin_subpass This is quite a bit cleaner because we now sync the clear values at the same time as we do the fast clear. For loading the clear values into the surface state, we now do it once when we handle the LOAD_OP_LOAD instead of every subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	a4136b8c1a	anv/pass: Store usage in each subpass attachment This requires us to ditch the VkAttachmentReference struct in favor of an anv-specific struct. However, we can now easily identify from just the subpass attachment what kind of an attachment it is. This will make iteration over anv_subpass::attachments a little easier in some cases. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	bd356e1bcf	anv/cmd_buffer: Add a concept of pending load aspects These are the same as pending clear aspects only for the "load" operation. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	e526d49edd	anv/cmd_buffer: Iterate all subpass attachments when clearing This unifies things a bit because we now handle depth and stencil at the same time. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	2cc3445eb2	anv/cmd_buffer: Decide whether or not to HiZ clear up-front This moves the decision out of begin_subpass and into BeginRenderPass like the decision for color clears. We use a similar name for the function for depth/stencil as for color even though no aux usage is really getting computed. v2 (Jason Ekstrand): - Don't always disable HiZ clears by accident - Use the initial layout to decide whether to do fast clears Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fc8555610	anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	7991838973	intel/blorp: Add a blorp_hiz_clear_depth_stencil helper This is similar to blorp_gen8_hiz_clear_attachments except that it takes actual images instead of trusting in the already set depth state. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	1900dd76d0	anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass This doesn't really change much now but it will give us more/better control over clears in the future. The one interesting functional change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and friends for each clear. However, this only happens at begin_subpass time so it shouldn't be substantially more expensive. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fb9d6c6f5	anv/cmd_buffer: Pass a subpass id into begin_subpass This is a bit less awkward than passing in the subpass because it means we don't have to extract the subpass id from the subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	01223b8199	anv/cmd_buffer: Add begin/end_subpass helpers Having begin/end_subpass is a bit nicer than the begin/next/end hooks that Vulkan gives us. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	b5bd3fb4e4	anv/cmd_buffer: Apply subpass flushes before set_subpass This seems slightly more correct because it means that the flushes happen before any clears or resolves implied by the subpass transition. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	869448a8ab	anv: Use framebuffer layers for implicit subpass transitions Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	85d0bec961	anv: Be more careful about fast-clear colors Previously, we just used all the channels regardless of the format. This is less than ideal because some channels may have undefined values and this should be ok from the client's perspective. Even though the driver should do the correct thing regardless of what is in the undefined value, it makes things less deterministic. In particular, the driver may choose to fast-clear or not based on undefined values. This level of nondeterminism is bad. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	4796025ba5	intel/isl: Add an isl_color_value_is_zero helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	116e818ef1	anv/gpu_memcpy: CS Stall before a MI memcpy on gen7 This fixes a pile of hangs caused by the recent shuffling of resolves and transitions. The particularly problematic case is when you have at least three attachments with load ops of CLEAR, LOAD, CLEAR. In this case, we execute the first CLEAR followed by a MI memcpy to copy the clear values over for the LOAD followed by a second CLEAR. The MI commands cause the first CLEAR to hang which causes us to get stuck on the 3DSTATE_MULTISAMPLE in the second CLEAR. We also add guards for BLORP to fix the same issue. These shouldn't actually do anything right now because the only use of indirect clears in BLORP today is for resolves which are already guarded by a render cache flush and CS stall. However, this will guard us against potential issues in the future. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:19 -08:00
Guillaume Charifi	a572ec2efe	st/mesa: Factorize duplicate code for atomic buffer binding Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Guillaume Charifi	56bfcd50f7	st/mesa: Factorize duplicate code in st_update_framebuffer_state() Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Rob Clark	4c4e6232ee	freedreno/ir3: fix use_count refcnt'ing issue Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test When eliminating a copy, we were dropping the use_count of the mov that is skipped, but not increasing the use_count of its src instruction. Fixes: `76440fcca9` freedreno/ir3: clean up dangling false-dep's Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-20 13:43:42 -05:00
Eric Engestrom	ac731531a1	docs: fix patent url Reported-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 15:14:34 +00:00
Brian Paul	e7d1a93723	svga: replaced 'unsigned' with proper enum types in shader code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-20 08:11:06 -07:00
Jonathan Gray	9401d90a53	configure.ac: pthread-stubs not present on OpenBSD pthread-stubs is no longer required on OpenBSD and has been removed. libpthread parts involved moved to libc. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 15:08:47 +00:00
Andres Gomez	36ac485bd1	swr: bump minimum supported LLVM version to 4.0 Since radv and radeonsi removed support for LLVM 3.9 the distcheck target got broken because SWR distribution needed 3.9.x. After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0 and above, which will solve this problem. Fixes: `3bf1e036e8` ("amd: remove support for LLVM 3.9") Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2018-02-20 17:03:06 +02:00
Andres Gomez	b39f6d5fc7	travis: radeonsi and radv need LLVM 4.0 Fixes: `3bf1e036e8` ("amd: remove support for LLVM 3.9") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 16:58:30 +02:00
Samuel Pitoiset	1ac741d690	ac/nir: move ac_declare_lds_as_pointer() outside of the switch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-20 10:44:59 +01:00
Samuel Pitoiset	b5d111ae76	radv: allow to force family using RADV_FORCE_FAMILY Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-20 10:44:47 +01:00
Thomas Hellstrom	f386776ea5	loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback Removing this callback caused rendering corruption in some multi-screen cases, so it is reinstated but without the drawable argument which was never used by implementations and was confusing since the drawable could have been created with another screen. Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org Fixes: `5198e48a0d` (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013 Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com> Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:36:53 +01:00
Thomas Hellstrom	80c31f7837	svga: Fix a leftover debug hack Fix what appears to be a leftover debug hack. The hack would force the driver to take a different blit path; possibly, although unverified, reverting to software blits. Tested using piglit tests/quick. No related regressions. Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org> Fixes: `9d81ab7376` (svga: Relax the format checks for copy_region_vgpu10 somewhat) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625 Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:12:19 +01:00
Iago Toral Quiroga	af5f2322d0	anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-20 08:12:32 +01:00
Ilia Mirkin	e1a70aed10	nv50,nvc0: mark ABGR format as displayable instead of ARGB format This matches the hardware's capabilities. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	f7604d8af5	st/dri: only expose config formats that are display targets In the case of NVIDIA hardware, ABGR is displayable but ARGB is not. Only advertise the one set in the visuals list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	ebdc4c31e2	mesa: add xbgr support adjacent to xrgb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Timothy Arceri	d88a2906f8	st/shader_cache: copy nir pointer to gl_program after deserializing This fixes a crash when running the arb_get_program_binary-api-errors piglit test twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	691c320de0	radeonsi: add nir shader cache support In future we might want to try to avoid calling nir_serialize() but this works for now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	2b431808ab	radeonsi: rename variables tgsi_binary -> ir_binary This better represents that the ir could be either tgsi or nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Emil Velikov	1270990438	docs: update calendar, add news and link release notes to 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-19 22:10:18 +00:00
Emil Velikov	be5a996039	docs: add sha256 checksums for 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `164a993112`)	2018-02-19 22:08:14 +00:00
Emil Velikov	ca614d40cd	docs: add release notes for 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2529d77179`)	2018-02-19 22:08:12 +00:00
Marek Olšák	f78fe98fff	radeonsi: fix regression from 32-bit pointers on CI Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-02-19 17:56:23 +01:00
Samuel Pitoiset	549c7f3724	radv: compact varyings after removing unused ones It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-19 12:19:17 +01:00
Timothy Arceri	51e745cf77	radeonsi/nir: fix gl_FragCoord for pixel_center_integer Fixes piglit test glsl-arb-fragment-coord-conventions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Timothy Arceri	347038baa9	glsl/nir: add pixel_center_integer to shader info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Ilia Mirkin	fe76fc11b1	gm107/ir: avoid using kepler instruction capabilities Split up the op properties table into generation-specific bits, and only use the kepler ones on kepler. Fixes some CTS images tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	f08fd676bf	nvc0: add support for bindless on maxwell+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	0255550eb1	gm107/ir: change how SUQ works in preparation for bindless All this information can be retrieved from the TIC directly. Avoid having to dip into the constbuf information about the image. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Kenneth Graunke	fa8a764b62	i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+. By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs. There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility. This lets us push up to 4 UBO ranges. We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance. We also need a brand new kernel that supports context isolation - on older kernels, newly created contexts inherit register state from whatever happened to be running. So, setting this would have catastrophic impact on other drivers such as libva, Beignet, or older Mesa. See commit `8ec5a4e4a4` where we did this once before, but had to revert it in commit `013d331220`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:31 -08:00
Kenneth Graunke	a63c74be85	i965: Stop restoring the default L3 configuration on Kernel 4.16+. Kernel 4.16 has proper context isolation, which means we can change the L3 configuration without worrying about that leaking to other newly created contexts, breaking the assumptions of other userspace. So, disable our workaround to reprogram it back to the default. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:18 -08:00
Mikko Perttunen	5a1606c51f	nvc0: Use GP100_COMPUTE_CLASS on GP10B GP10B requires the use of GP100_COMPUTE_CLASS instead of GP104_COMPUTE_CLASS as is used for other non-GP100 chips. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 14:16:10 -05:00
Daniel Stone	9d21dbeb88	i965: Fix aux-surface size check The previous commit reworked the checks in intel_from_planar() to check the right individual cases for regular/planar/aux buffers, and do size checks in all cases. Unfortunately, the aux size check was broken, and required the aux surface to be allocated with the correct aux stride, but full image height (!). As the ISL aux surface is not recorded in the DRIimage, we cannot easily access it to check. Instead, store the aux size from when we do have the ISL surface to hand, and check against that later when we go to access the aux surface. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `c2c4e5bae3` ("i965: Fix bugs in intel_from_planar") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-17 10:22:35 +00:00
Marek Olšák	931ec80eeb	radeonsi: implement 32-bit pointers in user data SGPRs (v2) User SGPRs changes: VS: 14 -> 9 TCS: 14 -> 10 TES: 10 -> 6 GS: 8 -> 4 GSCOPY: 2 -> 1 PS: 9 -> 5 Merged VS-TCS: 24 -> 16 Merged VS-GS: 18 -> 11 Merged TES-GS: 18 -> 11 SGPRS: 2170102 -> 2158430 (-0.54 %) VGPRS: 1645656 -> 1641516 (-0.25 %) Spilled SGPRs: 9078 -> 8810 (-2.95 %) Spilled VGPRs: 130 -> 114 (-12.31 %) Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread Code Size: 52094872 -> 52692540 (1.15 %) bytes Max Waves: 371848 -> 372723 (0.24 %) v2: - the shader cache needs to take address32_hi into account - set amdgpu-32bit-address-high-bits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2018-02-17 04:52:17 +01:00
Marek Olšák	5722cd4084	radeonsi: disallow constant buffers with a 64-bit address in slot 0 State trackers must use a user buffer or const_uploader, or set pipe_resource::flags same as const_uploader->flags. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	d790b6cece	radeonsi: move const_uploader allocations to 32-bit address space Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	50581549b7	winsys/radeon: implement and enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	1104d1e9d3	winsys/radeon: add struct radeon_vm_heap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	48ecacfefa	winsys/amdgpu: enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	c2da45be86	gallium/radeon: add 32-bit address space heaps Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	0977b7f7b3	ac: query high bits of 32-bit address space	2018-02-17 04:51:58 +01:00
Marek Olšák	16be55da94	gallium: use PIPE_CAP_CONSTBUF0_FLAGS	2018-02-17 04:20:55 +01:00
Marek Olšák	8e7222f4e5	gallium: allow drivers to impose BO flags restrictions on constant buffer 0 Required by radeonsi for optimal behavior.	2018-02-17 04:20:55 +01:00
Alexander von Gluck IV	834d221512	meson: Add Haiku platform support v4 Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-16 16:56:34 -06:00
Anuj Phogat	7b283544dc	anv/icl: Add render target flush after uploading binding table The PIPE_CONTROL command description says: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet." Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	136f583a24	anv/icl: Enable float blend optimization Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd7102972f	anv/icl: Use gen11 functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	9673c21d4f	anv/icl: Build anv libs for gen11 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	1f108b436b	anv/icl: Generate gen11 entry point functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	a86c0a08df	anv/icl: Don't use DISPATCH_MODE_SIMD4X2 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd5fc634a8	anv/icl: Don't use SingleVertexDispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	6e3940b3cf	anv/icl: Don't set ResetGatewayTimer Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	41a4c2c8e8	anv/icl: Add #define genX Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Anuj Phogat	413d475b44	anv/icl: Add gen11 mocs defines Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Kenneth Graunke	1d6cf433d2	i965: Implement GenerateMipmap directly, rather than using Meta. Meta is awful and we'd like to stop using it. Implementing this using BLORP allows us to stop trashing a bunch of GL state every time. This follows the structure of st_generate_mipmap(). compute_num_levels is lifted directly from there. Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3) on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher resolutions). v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5 because it's not implemented yet. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-02-16 10:48:10 -08:00
Kenneth Graunke	9bcd31ea90	mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c. I want to use compute_num_levels inside i965. Rather than duplicating it, move it from mesa/st to core Mesa, and make it non-static. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-16 10:48:10 -08:00
Dylan Baker	03ab40b1f7	meson: freedreno depends on nir This fixes a race condition in building targets that link in freedreno. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120 Fixes: `0bbecc5a85` ("meson: define driver dependencies") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Mark Janes <mark.a.janes@intel.com>	2018-02-16 10:10:18 -08:00
George Kyriazis	f1fbeb1a53	swr/rast: blend_epi32() should return Integer, not Float fix gcc8 compiler error for KNL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	7dd793d10c	swr/rast: Normalize path for debug metadata in template gen_llvm.hpp Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	f979d0bc2f	swr/rast: Consolidate archrast Draw events Consolidate archrast draw events into a single draw event with an attribute that represents the type of draw - Add handlers for new private proto versions of DrawInstancedEvent, DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and DrawIndexedInstancedSplitEvent - Convert the draw events to generic DrawInfoEvents - parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with 'uint32_t'. This draw type is actually an enum, but can be represented as an unsigned integer. - is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	45df1a6520	swr/rast: Add semantics for translating address Added support for another full translation path in fetch jitter. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	c09483cf0a	swr/rast: Convert C Sampler intrinsics Convert portions of the C sampler to the rasty SIMD lib. Also fix SRL call with a non-immediate. Don't count on the compiler automagically converting an srli call to srl if the shift count isn't an immediate. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	37ebf86add	swr/rast: Make SIMDLib templated types easier to use "typename SIMD_T::TypeName" --> "TypeName<SIMD_T>" Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	74e8bb4a22	swr/rast: Be more explicit when fetching next component Use a new function to denote that we want to get offset to next component and hide the fact that GEP is used underneath. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	da77eb55d5	swr/rast: Fix bug related to passing AR handle We were passing a garbage handle. Let's not do that. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	48d62409f8	swr/rast: Fix primitive replication issue in tessellation PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	e12db47a7d	swr/rast: Use llvm intrinsic masked gather Use llvm intrinsic masked.gather instead of manual unroll for the cases where we have vector of pointers. Improves llvm IR debug experience by reducing a ton of IR to a single intrinsic call. Also seems to reduce overall stack use considerably. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	9cc9688e49	swr/rast: Misc cleanup Together with correct detection of clipDistance NaNs when no cullDistance is set Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	036c8b6247	swr/rast: Renamed variable in vertexbufferstate Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	b25efa36e6	swr/rast: Fix GATHERPS to avoid assertions. With the pBase type change, LLVM was asserting because of wrong types. Cast appropriately. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	8a64593bde	swr/rast: More precise user clip distance interpolation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	3e560b7c85	swr/rast: Cull prims when all verts have negative clip distances Performance optimization, and fixes some clipping issues. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	cb4b604ebd	swr/rast: whitespace and comment cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	5df4d98780	swr/rast: Fix invalid number of attributes Fix invalid number of attributes passed into tessellation PA. Needs to take into account any offsets from the shader. Innocuous issue, but removes an assert firing in debug. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	2053472723	swr/rast: Add clipper stats. Clipper event is now: event ClipperEvent { uint32_t drawId; uint32_t trivialRejectCount; uint32_t trivialAcceptCount; uint32_t mustClipCount; }; Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	0420b2be89	swr/rast: Separate event types to public and private Split into two proto files and modify appropriate build rules for configure / scons / meson builds. There are private internal events (proxy) that communicate information from rasterizer to ArchRast. ArchRast can use these events to calculate a final answer and then emit other public events which will be saved to file. Users will use the public proto file and not the private one. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	e48dd2489c	swr/rast: Clean up event types and remove BE events Begin/End events not needed anymore. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	7070027d7b	swr/rast: Removed unused variable Gets rid of zillions of unused variable warnings, made worse by templates. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	e3f92bb7af	swr/rast: Separate RDTSC code from archrast Renamed rdtsc defines more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	8bce71622e	swr/rast: Cleanup of mpPrivateContext in Builder Provide access functions for mpPrivateContext in Builder. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	5697dc3e23	swr/rast: Remove some JIT debug code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	2407b8c9b4	swr/rast: Don't include private context in gather args Move mpPrivateContext to compensate Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	a4c23fc25b	swr/rast: Cleanup knob definitions Rename some of the categories and move some options around. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:42 -06:00
George Kyriazis	ec34ed73d6	swr/rast: Add missing parameter to a few gather functions We now pass pDrawContext as a default parameter Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:39:42 -06:00
Philipp Zabel	bfe4e24a42	etnaviv: add useful information to BO import errors Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-16 17:05:43 +01:00
Daniel Stone	ff5432dc50	egl/wayland: Always use in-tree wayland-egl-backend.h A recent patchset to Wayland[0] migrated Mesa's libwayland-egl backend into Wayland itself, so implementations could provide backends. Mesa still uses its own, and the two have already diverged[1]. The include from egl_dri2.h could pick up either the installed Wayland wayland-egl-backend.h (with a 'driver_private' member), or the Mesa internal wayland-egl-backend.h (with a 'private' member), failing the build in the first instance. Add an explicit directory prefix to the include, so we always get our in-tree version. [0]: https://patchwork.freedesktop.org/series/31663/ [1]: https://cgit.freedesktop.org/wayland/wayland/commit/?id=9fa60983b579 Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105103 Fixes: `198af27c67` ("wayland-egl: rename wayland-egl-{priv,backend}.h")	2018-02-16 14:04:19 +00:00
Daniel Stone	f766e1afa5	meson: Move Wayland dmabuf to wayland-drm As the comment notes: linux-dmabuf has nothing to do with wayland-drm, but we need a single place to build these files we can use from both EGL and Vulkan, which is guaranteed to be included before both EGL and Vulkan WSI. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-02-16 14:04:19 +00:00
Eric Engestrom	65dda6c9ec	egl/wayland: check for invalid format index v2: just tell the compiler to assume the format will always be found, as it comes from the table itself to begin with. (DanielS) CID: 1429516 Fixes: `d32b23f383` "egl/wayland: Add bpp to visual map" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-16 13:14:29 +00:00
Eric Engestrom	a176b053b6	glsl: fix sizeof(pointer) bug Doesn't really change anything to the test though ¯\_(ツ)_/¯ CID: 1429511 Fixes: `e8495646af` "glsl/tests: changes to test_disk_cache_create test" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-16 12:04:29 +00:00
Timothy Arceri	2f5d3df9fc	radeonsi/nir: set TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL correctly We set this for post_depth_coverage in addition to early_fragment_tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-16 15:53:13 +11:00
Dave Airlie	60c14a0db2	virgl: remap query types to hw support. The gallium query types changed, so we need to remap from the gallium ones to the virgl ones. Fixes: dEQP-GLES3.functional.transform_feedback.basic_types* "This also fixes: dEQP-GLES3.functional.transform_feedback.array.separate* dEQP-GLES3.functional.transform_feedback.array_element* dEQP-GLES3.functional.transform_feedback.interpolation.* Gallium's p_defines.h and virglrenderer's p_defines.h have diverged quite a bit, so not including PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE there makes sense for now." - Gurchetan Singh Fixes: `3f6b3d9db` (gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-16 12:42:06 +10:00
Anuj Phogat	8a05b06146	i965/icl: Add render target flush after uploading binding table From PIPE_CONTROL command description in gfxspecs: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet." V2: Move the PIPE_CONTROL to update_renderbuffer_surfaces() in brw_wm_surface_state.c (Ken). Fixes a fulsim error and a GPU hang described in below JIRA. JIRA: MD5-322 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	3f8289164f	i965/icl: Enable float blend optimization and Wa3DStateMode Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	ba3cbee6c5	intel/common/icl: Add has_sample_with_hiz flag in gen_device_info Sampling from hiz is enabled in i965 for GEN9+ but this feature has been removed from gen11. So, this new flag will be useful to turn the feature on/off for different gen h/w. It will be used later in a patch adding device info for gen11. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	9c144dc81e	i965/icl: Add assertions to check dispatch mode is SIMD8 SIMD4x2 dispatch mode has been removed in GEN11. We're not using it anyways in Mesa. Adding few asserts to make it explicit. Use GEN_GEN macro in place of devinfo->gen (Ken) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	02e91b6d62	i965/icl: Update switch statements Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	27d0034938	i965/icl: Update the assert in brw_memory_barrier() Nothing is changed here from gen10 to gen11. So, just update the assert. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	d6b26649a6	i965/icl: Define and use icl mocs settings Gen11 MOCS settings are duplicate of Gen10 MOCS settings. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	e9ad5c9a5d	i965/icl: Update the comment for maximum number of threads per PSD Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	93f601d7ed	i965/icl: Build and use gen11 functions for genxml state-upload and blorp Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:56 -08:00
Anuj Phogat	85f319155f	i965/icl: Don't set ResetGatewayTimer This field is removed in gen11+ Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	772a75be46	intel/icl: Do StateCacheInvalidation for indirect clear color StateCacheInvalidation is required on all gen7+ platforms. We don't need to update this check for every new gen h/w unless this requirement is changed. So, dropping the check for latest gen h/w. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	bff24e2173	intel/isl/icl: Build and use gen11 surface state emit functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:55 -08:00
Anuj Phogat	0427bd4954	intel/isl/icl: Add the maximum surface size limit Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	c68ede0be7	intel/genxml/icl: Update genx_bits header Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	165a68b05a	intel/genxml/icl: Generate packing headers Move build system changes in to one patch (Ken, Emil) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:55 -08:00
Anuj Phogat	7ed27d8cbf	intel/genxml/icl: Add gen11.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 16:14:55 -08:00
Kenneth Graunke	4dee8f0548	i965: Drop EXEC_OBJECT_CAPTURE defines. These only existed to avoid making people update libdrm for new uABI headers. A while ago we imported those headers into the Mesa repo, so the dependency is gone and these are no longer useful. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-15 15:35:52 -08:00
Jan Vesely	78673b614b	clover: Fix build after llvm r325155 and r325160 r325155 ("Pass a reference to a module to the bitcode writer.") and r325160 ("Pass module reference to CloneModule") change function interface from pointer to reference. v2: Fix indentation (tab instead of spaces) Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-15 18:18:53 -05:00
Bas Nieuwenhuizen	05d84ed68a	radv: Always lower indirect derefs after nir_lower_global_vars_to_local. Otherwise new local variables can cause hangs on vega. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-15 23:45:59 +01:00
Dylan Baker	2ab1ce30c4	meson: fix xvmc target linkage This needs to link the state tracker with --whole-archive to expose the right symbols. v4: - Always add libswdri and libswkmsdri to the link_with list Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:38:43 -08:00
Dylan Baker	0b73c329bc	meson: Fix xa target linkage This needs to use --whole-archive (link_whole in meson) to properly expose symbols. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `0ba909f0f1` ("meson: build gallium xa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:31 -08:00
Dylan Baker	91a59b6287	meson: Fix omx-bellagio target linkage This needs to use --whole-archive (link_whole in meson) to properly expose symbols. v4: - Always add libswdri and libswkmsdri to link_with Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:26 -08:00
Dylan Baker	2e4be28fb2	meson: fix va target linkage The state tracker needs to be linked with whole-archive (like autotools). As a result there are symbols from libswdri and libswkmsdri that are needed, so link those as well. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:16 -08:00
Dylan Baker	90d361753c	meson: fix vdpau target linkage The VDPAU state tracker needs to be linked with whole-archive (autotools does this). Because we are linking the whole archive we also need to link with libswdri and libswkmsdri if those have been enabled. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `68076b8747` ("meson: build gallium vdpau state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:09 -08:00
Dylan Baker	3403055768	meson: Actually link xvmc target with libxvmc Unlike vdpau this is required. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:04 -08:00
Dylan Baker	7708103857	meson: actually link with libomxil-bellagio This state tracker actually needs to link, unlike vdpau. Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:57 -08:00
Dylan Baker	7023b373ec	meson: link dri3 xcb libs into vlwinsys instead of into each target This makes the dependencies easier to manage, since each media target doesn't need to worry about linking to half a dozen libraries. Fixes: `b1b65397d0` ("meson: Build gallium auxiliary") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:51 -08:00
Dylan Baker	424e654cb0	meson: use va-api version reported by pkg-config Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:47 -08:00
Dylan Baker	8eb608df61	meson: add libswdri and libswkmsdri to dri link_with Fixes: `b154b44ae3` ("meson: build radeonsi gallium driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:42 -08:00
Dylan Baker	be879f9f29	meson: add libswdri and libswkmsdri to d3dadaptor link_with v5: - Fix libswdi -> libswdri typo Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:36 -08:00
Dylan Baker	d672084ba2	meson: define empty variables for libswdri and libswkmsdri This allows these variables to unconditionally included in `link_with` lists, even if they're not used. This allows deleting duplicated logic in nearly every gallium target implemented in meson today. This also removes the now useless `build_by_default` flag from swdri and swkmsdri. v4: - add this patch Fixes: `66c94b9313` ("meson: build gallium winsys for dri, null, and wrapper") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:23 -08:00
Dylan Baker	7d0e342af2	meson: add convenience variable for anv_extensions.py dependency Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:46:07 -08:00
Dylan Baker	0e617c04f1	meson: use depend_files for adding extra file dependencies cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:46:04 -08:00
Dylan Baker	b03969a5ad	meson: use depend_files to track extra file dependencies cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `f939940809` ("anv: Split anv_extensions.py into two files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:45:56 -08:00
Dylan Baker	384bff13e0	Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py" This reverts commit `10d1b0be8e`. This is unnecessary, the depend_files argument is for adding dependencies on files that are not part of the input, which is already done. cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `10d1b0be8e` Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:45:40 -08:00
Brian Paul	64a1223a80	svga: replace gotos with else clauses Simple clean-up. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:49:06 -07:00
Brian Paul	fa901768a4	svga: s/unsigned/enum pipe_shader_type/ Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-15 09:05:09 -07:00
Brian Paul	8b54299c34	svga: move duplicated code for setting fillmode/flatshade state Move the calls to svga_hwtnl_set_fillmode() and svga_hwtnl_set_flatshade() out of the two retry_draw_*() functions to the svga_draw_vbo() function. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:09 -07:00
Brian Paul	072df89a79	svga: move svga_update_state() call in draw code This fixes a few Piglit transform feedback regressions caused by commit `7a1401938b`. In that change I moved svga_update_state() into the loops, after the calls to svga_hwtnl_set_flatshade(). But svga_hwtnl_set_flatshade() actually depends on some derived shader state. This patch moves the svga_update_state() call into svga_draw_vbo() so it's not duplicated in two places. Fixes: `7a1401938b` ("svga: clean up retry_draw_range_elements(), retry_draw_arrays()") Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:08 -07:00
Brian Paul	6f0aec5671	svga: call tgsi_scan_shader() for dummy shaders If we fail to compile the normal VS or FS we fall back to a simple/ dummy shader. We need to rescan the shader to update the shader info. Otherwise, this can lead to further translation failures because the shader info doesn't match the actual shader. Found by adding some extra debug assertions in the state-update code while debugging something else. v2: also update shader generic_inputs/outputs, etc. per Charmaine Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:01 -07:00
Samuel Pitoiset	579b33c1fd	ac/nir: do not reserve user SGPRs for unused descriptor sets In theory this might lead to corruption if we bind a descriptor set which is unused, because LLVM is smart and it can re-use unused user SGPRs. In practice, this doesn't seem to fix anything. As a side effect, this will reduce the number of emitted SH_REG packets. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	309854148c	ac/shader: fix gathering of desc_set_used_mask This was quite wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	61a4fc3ecc	ac/shader: be a little smarter when scanning vertex buffers Although meta shaders don't use any vertex buffers, there is no behaviour change but I think it's better to do this. Though, this saves two user SGPRs for push constants inlining or something else. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Louis-Francis Ratté-Boulianne	a34715ad9c	dri: fromPlanar() can return NULL as a valid result It was assumed that fromPlanar() could return NULL to mean that the planar image is the same as the parent DRI image. That assumption wasn't made everywhere though. Let's fix things and make sure that all callers understand a NULL result Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-15 11:58:17 +00:00
Emil Velikov	f0654dfa65	docs: correct link to the 17.3.3 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 11:33:27 +00:00
Emil Velikov	dd4734d5c1	docs: update calendar, add news and link release notes to 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 11:33:04 +00:00
Emil Velikov	eadde35f83	docs: add sha256 checksums for 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `26c84b8af9`)	2018-02-15 11:28:19 +00:00
Emil Velikov	6f4a6e2310	docs: add release notes for 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2f9820c553`)	2018-02-15 11:28:18 +00:00
Karol Herbst	7bc15090fc	nvc0: disable MS Images for sample_count == 1 on Maxwell fixes KHR-GL45.multi_bind.dispatch_bind_textures on Maxwell Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-15 11:14:46 +01:00
Gurchetan Singh	c6694793e1	mesa: don't clamp just based on ARB_viewport_array extension The ARB_viewport_array spec says: "Dependencies OpenGL 1.0 is required. OpenGL 3.2 or the EXT_geometry_shader4 or ARB_geometry_shader4 extensions are required. This extension is written against the OpenGL 3.2 (Compatibility) Specification." As such, we should ignore it for GLES2 contexts. Fixes: dEQP-GLES2.functional.state_query.integers.viewport_getinteger dEQP-GLES2.functional.state_query.integers.viewport_getfloat on llvmpipe and virgl. v2: Use _mesa_has_* (Ilia) Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-15 01:58:50 +01:00
Dylan Baker	5317211fa0	meson: use a custom target instead of a generator for i965 oa Generators really are never the thing you want. The problem in this case is that a generator must create a file that contains any file that the generated target depends on. Since brw_oa.py doesn't generate such a file the generated sources are not regenerated even if the xml files they should depend on changes. While we could change brw_oa.py to write such a file, that's silly, it depends on itself and the xml file. So we'll just use a custom target instead, which will have the correct dependency behavior and doesn't really add that much code. Fixes: `3218056e0e` ("meson: Build i965 and dri stack") CC: Ian Romanick <idr@freedesktop.org> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-14 16:45:40 -08:00
Anuj Phogat	0cd37f9178	isl: Don't use surface format R32_FLOAT for typed atomic integer operations From Skylake PRM Surface Formats section: "The surface format for the typed atomic integer operations must be R32_UINT or R32_SINT." Fixes an error and a piglit GPU hang in simulation environment. Piglit test: gl45-imageAtomicExchange-float.shader_test Suggested-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>	2018-02-14 16:30:05 -08:00
Timothy Arceri	7be5f30bb1	radeonsi/nir: fix si_nir_load_tcs_varyings() for outputs We were incorrectly using the input info for outputs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	9740c8a8aa	ac: implement nir_intrinsic_image_samples Fixes cts test: KHR-GL45.shader_texture_image_samples_tests.image_functional_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	c6b70a0eae	st: add NIR GL_ARB_get_program_binary support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	928be4e97e	st/shader_cache: add st_{de}serialise_nir_program() helpers These will be used for NIR GL_ARB_get_program_binary support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	3ad52501dc	ac/nir_to_llvm: fix image size for arrays of arrays Fixes cts test: KHR-GL44.shader_image_size.advanced-changeSize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	6acab18828	radeonsi/nir: fix shader ballot return value bitsize Fixes cts test: KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Jason Ekstrand	8534af44e4	intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 13:17:26 -08:00
Jason Ekstrand	5c9d47d9c6	i965: Add gl_state_index casts for PATCH_VERTICES_IN This fixes the build in clang Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105088 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 13:16:47 -08:00
Scott D Phillips	3b4f432d9b	i965/miptree: Initialize mcs with a linear map When initializing mcs, map with MAP_RAW and fill in the linear map. Removes a place where gtt mapping is used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 12:38:34 -08:00
Scott D Phillips	d13ab69a78	i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1) In all current uses, the linear surface is only allocated starting at (xt1, yt1) anyway, so this improves the calling ergonomics. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 12:38:34 -08:00
Scott D Phillips	ecaad89525	i965/tiled_memcpy: linear_to_ytiled a cache line at a time TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0 Thus a cache line in the tiled surface is composed of a 2d area of 16x4 bytes of the linear surface. Add a special case where the area being copied is 4-line aligned and a multiple of 4-lines so that entire cache lines will be written at a time. On Apollolake, this increases tiling throughput to wc maps by 84.0103% +/- 0.862818% v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand) v3: Don't reset src var (Jason), Ensure y0 <= y1 <= y2 <= y3 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-14 12:38:34 -08:00
Rafael Antognolli	eb2e17e2d1	docs: Add Cannonlake support to 18.0 release notes. 17.4 is actually 18.0. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:11:05 -08:00
Rafael Antognolli	fcae3d1a9a	anv/gen10: Remove warning message. Gen10 seems pretty stable so far, remove "alpha support" message. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:11:01 -08:00
Rafael Antognolli	bf1577fe09	i965/gen10: Remove warning message. Gen10 seems pretty stable so far, so there's no reason to keep this message. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne	aad14cf15a	egl/x11: Fix leak in dri3_create_image_khr_pixmap bp_reply wasn't properly free'd Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-14 11:52:06 +00:00
Iago Toral Quiroga	cb9dbd6dec	i965/compiler: clean up nir_intrinsic_load_input for vertex shaders This code to re-set the type of the source and destination is not necessary since we never manipulate the types. Looks like a left over from a time where we had to retype to float temporarily to handle 64-bit inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Iago Toral Quiroga	4917d38321	intel/compiler: fix first_component for 64-bit types on vertex inputs Divide it by two as we do for other stages. This is because the component layout qualifier is always in 32-bit units. Fixes issues in a new CTS test (still WIP): KHR-GL45.enhanced_layouts.varying_double_components Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Samuel Pitoiset	ad4b58ea70	ac/nir: rename nir_to_llvm_context to radv_shader_context There is still more to do in that area, but it's a good start. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:16 +01:00
Samuel Pitoiset	141db61509	ac: remove nir_to_llvm_context from ac_nir_translate() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:14 +01:00
Samuel Pitoiset	a541117ff4	ac/nir: remove nir_to_llvm_context::nir link Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:12 +01:00
Samuel Pitoiset	e9f0205ca2	ac: move the outputs array to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:10 +01:00
Samuel Pitoiset	07e4268f36	ac/shader: scan force_persample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:08 +01:00
Dave Airlie	b9d2ff05a6	r600: fix regression in gl_FragColor drawing This fixes a regression in the broadcast color to all color bufs case. Fixes: `6c691081a` (r600: fixup sparse color exports.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 14:02:41 +10:00
Dave Airlie	9c9a9bee44	r600: fix array spill if temp[0] is before all arrays I found a shader with DCL TEMP[0], LOCAL DCL TEMP[1..256], ARRAY(1), LOCAL DCL TEMP[257..512], ARRAY(2), LOCAL DCL TEMP[513..768], ARRAY(3), LOCAL DCL TEMP[769], LOCAL This would remap badly, as it would add up all the spilled sizes and subtract it from the temp for 0. If the current temp is less than the array start break out. Fixes: `1d871aa6` (r600g: Implement spilling of temp arrays (v2)) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:37:59 +10:00
Dave Airlie	8f2656c75b	virgl: add ARB_sample_shading support. This enables ARB_sample_shading if the renderer supports it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Dave Airlie	9b95b70719	virgl: add ARB_draw_indirect support. This relies on the renderer code landing first. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Roland Scheidegger	f6718baabc	tgsi: Recognize RET in main for tgsi_transform Shaders coming from dx10 state trackers have a RET before the END. And the epilog needs to be placed before the RET (otherwise it will get ignored). Hence figure out if a RET is in main, in this case we'll place the epilog there rather than before the END. (At a closer look, there actually seem to be problems with control flow in general with output redirection, that would need another look. It's enough however to fix draw's aa line emulation in some internal bug - lines tend to be drawn with trivial shaders, moving either a constant color or a vertex color directly to the output). v2: add assert so buggy handling of RET in main is detected Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-14 02:06:54 +01:00
Bas Nieuwenhuizen	7461bd5b8f	ac: Use the renumbered const address space for LLVM 7. The LLVM AMDGPU backend decided to renumber the constant address space .... Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-14 01:05:03 +01:00
Dave Airlie	9ddacd9af4	gallium: drop all the guard band float caps. Nobody queries these and nobody sets them to anything useful, the docs say TODO. Drop them until a use appears. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 08:50:08 +10:00
Vadym Shovkoplias	a553c54abf	mesa: add glsl version query (v4) Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS and glGetStringi for GL_SHADING_LANGUAGE_VERSION v2: - Combine similar functionality into _mesa_get_shading_language_version() function. - Change GLSL version return mechanism. v3: - Add return of empty string for GLSL ver 1.10. - Move _mesa_get_shading_language_version() function to src/mesa/main/version.c. v4: - Add OpenGL version check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915 Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 13:24:31 -07:00
Brian Paul	b08d718703	mesa: add missing switch case for EXTRA_VERSION_40 in check_extra() The EXTRA_VERSION_40 predicate is tested as part of extra_gl40_ARB_sample_shading but there was no switch case for it. Fixes: `77b440e42d` ("mesa: Add new functions and enums required by GL_ARB_sample_shading") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-13 10:35:55 -07:00
Mark Janes	e5809788d6	mesa: fix compile failure Missing header triggered a failure in i965 CI buildtest project. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Fixes: `e149a0253c`	2018-02-13 00:22:05 -08:00
Mark Janes	d9de7aaca3	Partially revert "mesa: use GLenum16 in a few more places" This reverts part of commit `ca721b3d89`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Mark Janes	3e5758a70a	Revert "mesa: reduce the size of gl_texture_image" This reverts commit `f4ea2b2a9e`. Several members reduced in size by the offending commit are not large enough to store the data needed by the i965 driver. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Dave Airlie	db5f422169	i965: fix tessellation regressions with gl_state_index16 Looks like one conversion was missed. Fixes: `e149a0253` (mesa,glsl,nir: reduce gl_state_index size to 2 bytes) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-12 23:05:16 -08:00
Stéphane Marchesin	5e4a2b394e	virgl: Support v2 caps struct (v2) This struct allows us to report: - accurate max point size/line width. - accurate texel and texture gather offsets - vertex/geometry limits. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-13 14:23:54 +10:00
Timothy Arceri	10457712ed	ac/nir: add nir_intrinsic_{load,store}_shared support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	c787cbfa33	ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	b6cf898ec2	radeonsi: make si_declare_compute_memory() more generic and call for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	94fa090fad	st/glsl: set req_local_mem earlier for compute shaders Without this change it will never be set for backends using nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Marek Olšák	6b1e26e181	mesa: move STATE_LENGTH to shader_enums.h and use it everywhere Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	f4ea2b2a9e	mesa: reduce the size of gl_texture_image 80 -> 40 bytes. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	4794fbc86e	mesa: reduce the size of gl_program_parameter 40 -> 24 bytes, which includes the gl_state_index16 change. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	e149a0253c	mesa,glsl,nir: reduce gl_state_index size to 2 bytes Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	a7882013d3	mesa: reduce the size of gl_viewport_attrib All drivers convert these to float, so there is no reason to use double. The piglit test that expects double precision from glGet will be adjusted not to require it (there is a piglit patch). gl_context::ViewportArray: 512 -> 384 bytes Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	d7550d783a	mesa: reduce the size of gl_texture_object Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	65ed98839b	mesa: reduce the size of gl_program gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78f1decc95	mesa: reduce the size of gl_image_unit (v2) gl_context::ImageUnits: 6144 -> 4608 bytes v2: use ASSERT_BITFIELD_SIZE Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca5c5d96d8	mesa: further reduce the size of ctx->Texture Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78043a75f6	mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8 GL allows doing glTexEnv on 192 texture units, while in reality, only MaxTextureCoordUnits units are used by fixed-func shaders. There is a piglit patch that adjusts piglits/texunits to check only MaxTextureCoordUnits units. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	07c10cc59c	mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	79aca14f5f	mesa: inline init_texture_unit because this is going to be changed Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca721b3d89	mesa: use GLenum16 in a few more places Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Jason Ekstrand	4c77e21c81	anv: Move setting current_pipeline to cmd_state_init We were setting current_pipeline to UINT32_MAX and then calling anv_cmd_state_reset which memsets the entire state struct to 0 which implicitly resets current_pipeline to 3D. I have no idea how this hasn't caused everything to explode. Fixes: `cd3feea745` "anv/cmd_buffer: Rework anv_cmd_state_reset" cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-12 15:18:23 -08:00
Jason Ekstrand	f37bd726c7	anv: Don't resolve or ambiguate non-existent layers The previous code was trying to avoid non-existent layers by taking a MAX with anv_image_aux_layers. Unfortunately, it wasn't taking into account that layer_count starts at base_layer which may not be zero. Instead, we need to subtract base_layer from anv_image_aux_layers with a guard against roll-over. Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-12 15:14:57 -08:00
Daniel Stone	c2c4e5bae3	i965: Fix bugs in intel_from_planar This commit fixes two bugs in intel_from_planar. First, if the planar format was non-NULL but only had a single plane, we were falling through to the planar case. If we had a CCS modifier and plane == 1, we would return NULL instead of the CCS plane. Second, if we did end up in the planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID, we would end up segfaulting in isl_drm_modifier_has_aux. Cc: mesa-stable@lists.freedesktop.org Fixes: `8f6e54c929` Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 15:14:45 -08:00
Eric Anholt	1aed66dc1e	radv: Fix compiler warning about uninitialized 'set' The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:47 +00:00
Eric Anholt	21670f8208	glsl/tests: Fix strict aliasing warning about int64/double. Fixes: `4bf9862747` ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2018-02-12 20:48:43 +00:00
Eric Anholt	091bff8317	ac/nir: Fix compiler warning about uninitialized dw_addr. Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: `ec53e52742` ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:29 +00:00
Eric Anholt	7a83be4b28	gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax. My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't notice that ddmax is used from the same no_rho_opt as its initialization. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-12 20:48:18 +00:00
Kenneth Graunke	bd87bd178c	anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf. The kernel used to have execbuf parameters to program the INSTPM bit for whether 3DSTATE_CONSTANT_* should be relative to dynamic state base address or an absolute address. However, they never worked in the presence of hardware contexts, so I deleted them a while back. It doesn't make sense to set this flag, as it doesn't exist anymore. It also never did anything anyway - the flag is zero, so \|'ing it in did nothing. The default is relative anyway. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 07:00:41 -08:00
Eric Engestrom	111d4bf1d0	r200: remove left over dead code `0aaa27f291` removed the references to this array without removing the array itself Cc: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0aaa27f291` "mesa: Pass the translated color logic op dd_function_table::LogicOpcode" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-02-12 11:19:44 +00:00
Samuel Pitoiset	f4e85ba93f	ac/nir: remove backlink to nir_to_llvm_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:39 +01:00
Samuel Pitoiset	be5f6eb13e	ac/nir: remove nir_to_llvm_context::module Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:36 +01:00
Samuel Pitoiset	90a815ddeb	ac/nir: remove nir_to_llvm_context::builder Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:34 +01:00
Samuel Pitoiset	759acfa180	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:31 +01:00
Samuel Pitoiset	e7373a6498	ac/nir: drop nir_to_llvm_context from visit_var_atomic() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:29 +01:00
Samuel Pitoiset	485346b05a	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:27 +01:00
Samuel Pitoiset	cd6dfacda9	ac/nir: drop nir_to_llvm_context from visit_load_push_constant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:25 +01:00
Samuel Pitoiset	5c9e398c83	ac/nir: drop nir_to_llvm_context from cast_ptr() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:23 +01:00
Samuel Pitoiset	5ef5944848	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:21 +01:00
Samuel Pitoiset	da8b0b8264	ac/nir: drop nir_to_llvm_context from emit_f2f16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:19 +01:00
Samuel Pitoiset	e32f374944	ac: remove unused parameters in abi::load_tess_coord() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:17 +01:00
Samuel Pitoiset	1e69db003d	ac/nir: remove useless bitcast in load_tess_coord() nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:15 +01:00
Samuel Pitoiset	ed179fbdf3	ac: add load_resource() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:13 +01:00
Samuel Pitoiset	ecf229706f	ac: add load_sample_mask_in() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:11 +01:00
Samuel Pitoiset	0f48eeea05	ac: move view_index to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:09 +01:00
Samuel Pitoiset	0efbede949	ac: move push_constants to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:07 +01:00
Samuel Pitoiset	460d3ce726	ac: move tg_size to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:04 +01:00
Samuel Pitoiset	054c92190c	ac/nir: remove unused nir_to_llvm_context:{defs,phis} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:02 +01:00
Eric Anholt	0b97eb02b0	egl/gbm: Fix compiler warning about visual matching. The compiler doesn't know that num_visuals > 0. Fixes: `37a8d907cc` ("egl/gbm: Ensure EGLConfigs match GBM surface format") Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-12 09:16:44 +00:00
Rob Clark	831fb29252	freedreno: small fix for flushing dependent batches Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent batches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c57ed8e01c	freedreno/ir3: intra-block scheduling Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2a2099a875	freedreno/ir3: "boost" the depth of if/else condition Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ffb00f6841	freedreno/ir3: account for arrays in delayslot calc Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: * barriers * ordering writes after read * SSBO/image access ordering The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	f54d2b4f10	freedreno/ir3: more clever legalize algorithm Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	015afb6a38	freedreno/ir3: track block predecessors Useful in the following patches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	76440fcca9	freedreno/ir3: clean up dangling false-dep's Maybe there is a better way for this.. where it comes useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	aea223741f	freedreno/ir3: handle IMMED for mad 2nd src special case Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	242a8a1957	freedreno/ir3: remove ir3 phi instruction Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7b569d60c	freedreno/ir3: remove lower_if_else pass Now that it is unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	268ab05484	freedreno/ir3: add experimental GCM pass Generally seems to do worse on instruction count and register usage, according to shader-db. But shader-db also doesn't do a very good job of weighting loop bodies, so that might not be totally valid. So add an env variable to enable GCM pass for easier experimentation. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	4c15c53d91	freedreno/ir3: change opt passes There are more useful nir passes added since initial conversion to nir. But ir3 was never updated to use them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ec8bc54ad2	freedreno/ir3: use peephole select pass Aggressively lowering all if/else to selects in some extreme cases results in much higher register pressure. Using peephole select instead with a modest threshold speeds up alu2 4x! 16 seems like a good limit, low enough to help alu2 but not so low that it penalizes everything else. With a bit better scheduling of the instruction that moves a value into a predicate register, we might be able to lower this limit a bit more in the future, but since we need 6 cycles from the move to predicate register to predicated branch, that puts some sort of lower bound on how far we can lower this threshold. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7ea2b4eba	freedreno/ir3: lower phi webs to regs nir's from_ssa pass is much better at avoiding inserting extra moves than our logic is. And lowering phi webs to regs just treats anything involved in a phi web as an array of length=1. Which with previous array related fixes in RA/etc ends up working out quite well. This cuts down on extra instructions and also helps with register pressure. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	0a6ddf964f	freedreno/ir3: separate arrays from groups Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	55f14a1ac4	freedreno/ir3: make block/instruction serialno per-shader Makes it easier to compare values seen in-game (where there are many shaders) to cmdline standalone compiler. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	5a7de94392	freedreno/ir3: add spirv support to cmdline compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	942341bcd0	freedreno/ir3: don't lower fsat Instead, if possible fold (sat) flag into src, otherwise use: (sat)max.f rD, rS, rS Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	b2fc94f074	freedreno/ir3: add encoding/decoding for (sat) bit Seems to be there since a3xx, but we always lowered fsat. But we can shave some instructions, especially in shaders that use lots of clamp(foo, 0.0, 1.0) by not lowering fsat. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	1b658533e1	freedreno/ir3: extend liverange of arrays Use livein state of other blocks to extend liverange of arrays when they are still needed by successor blocks. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ac459a6f7f	freedreno/ir3: avoid extra mov's for "arrays" Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2bc3fb6992	freedreno/ir3: a couple more array fixes (Plus a couple TODOs) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	8ea1ef4191	freedreno/ir3: keep array stores Since these are not in SSA form, add to block's keeps so it doesn't appear unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c60f150d56	freedreno/ir3: propagate barrier information When eliminating movs, the instruction that is now directly using the src of the mov has the same scheduling order constraints as the original mov instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	98702c1010	freedreno/ir3: remove pointless statement Function ends after this if/else ladder, so it was pointless. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	930ca0e038	freedreno/ir3: some more debug prints Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a84e324847	freedreno/ir3: fix printing of relative branch offsets The number of bits depends on generation. But printing negative values with a5xx encoding (largest size) but compiling for a3xx or a4xx, would result in negative values printed as large positive values. I guess in practice huge negative branch offsets aren't likely (and if that is the case, the shader is probably too big to grok by reading the assembly). So just print using smallest bitfield size. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a5c28fe07b	freedreno/ir3: be more clever with if/else jumps Try to clean up things like: br !p0.x #2 br p0.x #something to eliminate the first branch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	44dd7dcd2f	freedreno/ir3: avoid some spurious sync bits Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	069c0ac625	freedreno/ir3: print # of sync bits for shaderdb When trying to optimize to reduce stalls, it is nice to see this info. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	7d45e2e39f	freedreno: add debug trace for flush Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Grazvydas Ignotas	9b9a89cd79	intel/compiler: fix 64bit value prints on 32bit Fix the following: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-10 17:59:02 +02:00
Timothy Arceri	ff0e3fa1fe	st/glsl_to_nir: remove unused options variable	2018-02-10 11:06:55 +11:00
Timothy Arceri	8f378c116e	st/radeonsi: enable disk cache for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	bc9d9f9b86	st: add nir shader disk cache support v2: include compute shader support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	97efdc0d57	st/glsl_to_tgsi: move nir detection earlier We move the nir check before the shader cache call so that we can call a nir based caching function in a following patch. Also with this change we simply check if vertex shaders support NIR rather than looping over the stages as mixing of shader types is not supported anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	b5e23887fe	radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead. This change indirectly enables NIR support for compute shaders on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	73f1d6f0c1	r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support in clover. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	51f484bb44	clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR for compute shaders, so we let clover pick the one it wants to use. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	3af4f34e61	r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs Acked-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	ce836487b8	radeonsi/nir: add depth layout to scan pass Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	6a8efbe652	radeonsi/nir: add FRAG_RESULT_COLOR to scan pass Fixes a number of draw buffers piglit tests. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	ef8082baf8	ac: convert nir_op_f2f32 src to a float Fixes the following piglit test: ./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo Where we would end up with the nir such as: vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10 vec1 32 ssa_12 = f2f32 ssa_2 And our pack_64_2x32_split nir to llvm code always produces a 64bit integer as output. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	1b1e5f8edf	ac: fix some 64bit unpack asserts Previously the asserts did not take swizzles into account. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Mark Janes	9a05c66feb	Revert "i965: prevent potentially null pointer access" This reverts commit `712332ed54`, which caused over 90k failures in Mesa i965 CI. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-09 09:46:07 -08:00
Daniel Stone	37a8d907cc	egl/gbm: Ensure EGLConfigs match GBM surface format When we create an EGL window surface on a GBM surface, ensure that the EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB interchange. For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888 gbm_surface (and vice-versa) are acceptable, but rendering with an XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be rejected. This was previously allowed through; when 10bpc formats were enabled, clients which picked a completely random EGL config and hoped/assumed they were XRGB8888 would break. If you have bisected a failure to start a GBM/KMS client to this commit, please look at its EGLConfig selection (e.g. through eglChooseConfigs), and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the attribs for config selection. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	8174e5b49e	egl/gbm: Remove duplicate format table Now that we have mask/channel information in gbm_dri's format conversion table, we can remove the copy in EGL. As this table contains more formats (notably including R8 and RG8, which can be used for BO but not surface allocation), we now compare the masks of all channels when trying to find a suitable config. Without doing this, an XRGB8888 EGLConfig would match on an R8 format. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	314714ac53	gbm/dri: Expose visuals table through gbm_dri_device Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	2ed344645d	gbm/dri: Add RGBA masks to GBM format table Eventually, we can replace the visuals list inside GBM EGL driver with this one. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4732094cff	egl/wayland: Use an array for modifiers Each Wayland EGLDisplay currently contains a struct with one vector of modifiers per format, hardcoded in the header. To allow easier support for more formats, turn this into an array of u_vectors which is opaque outside of platform_wayland.c. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	5bc49d4cbf	egl/wayland: Remove has_format enum Instead of the has_format enum, use an index into the visual array. This makes adding new formats less typing. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	d32b23f383	egl/wayland: Add bpp to visual map Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation, had their own format -> bpp lookup tables. Replace these with a lookup into the visual map. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4de98a9c07	egl/wayland: Use visual map for DRIImage<->FourCC map When trying to translate between DRIImage format enums and FourCC codes, use our visual map rather than an open-coded subset. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	68a80c11bd	egl/wayland: Use visual map for format advertisement Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	3323ce72ff	egl/wayland: Use visual map for buffer_from_image When creating a wl_buffer on an upstream Wayland display from an existing EGLImage, use the dri2_wl_visual map rather than another hardcoded list of formats. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	a9cc4edb60	egl/wayland: Use visual map for config->format lookup Having hoisted the format -> config map into common code, we now use it for config -> format lookups. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	1dc013f1ee	egl/wayland: Add format enums to visual map Extend the visual map from only containing names and bitmasks, to also carrying the three format enums we need. These are the DRIImage format tokens for internal allocation, FourCC codes for wl_drm and dmabuf protocol, and wl_shm codes for swrast drivers. We will later use these formats to eliminate a bunch of open-coded conversions. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	66912641df	egl/wayland: Use proper enum type in visual definition No semantic change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	845c2f6156	egl/wayland: Widen channel masks to bpp Widen the channel masks given in the visual table to the full width of the pixel format, i.e. as many leading zeros as required. No functional change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	19cbca38e4	egl/wayland: Hoist format <-> EGLConfig definition up Pull the mapping between Wayland formats and EGLConfigs up to the top level, so we can reuse it elsewhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	4fbd2d50b1	egl/wayland: Fix ARGB/XRGB transposition in config map When `0b2b719121` moved from an if tree to a struct to map between wl_drm formats and EGLConfigs, it transposed the mapping between XRGB and ARGB. Luckily, everyone exposes both formats, so this is harmless. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `0b2b719121` ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:06 +00:00
Marek Olšák	76085f2048	st/mesa: generate blend state according to the number of enabled color buffers Non-MRT cases always translate blend state for 1 color buffer only. MRT cases only check and translate blend state for enabled color buffers. This also avoids an assertion failure in translate_blend for: dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	c446dd7927	st/mesa: don't translate blend state when color writes are disabled Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	3d06c8afb5	st/mesa: don't translate blend state when it's disabled for a colorbuffer Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Lionel Landwerlin	712332ed54	i965: prevent potentially null pointer access Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> CID: 1418110	2018-02-09 14:02:59 +00:00
Mark Thompson	5db29d62ce	st/va: Make the vendor string more descriptive Include the Mesa version and detail about the platform. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:43 +01:00
Mark Thompson	768f1487b0	st/va: Enable vaExportSurfaceHandle() It is present from libva 2.1 (VAAPI 1.1.0 or higher). Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:36 +01:00
Tapani Pälli	41c5bf3836	disk cache: move path creation back to constructor This patch moves disk cache path and index creation back to the constructor which matches previous behavior. We still allow create to succeed without path so that cache can be used with callback functionality. Fixes: c95d3ed091 "disk cache: create cache even if path creation fails" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 11:33:25 +02:00
Samuel Pitoiset	3a2bb4db23	ac/nir: compute correct number of user SGPRs on GFX9 For merged shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:16:04 +01:00
Michel Dänzer	171076f082	st/mesa: Initialize tex_target in compile_tgsi_instruction Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the variable type was changed to enum tgsi_texture_type. Fixes a bunch of piglit failures with radeonsi, e.g.: gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed. Corresponding compiler warning: CXX state_tracker/st_glsl_to_tgsi.lo ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context, uint, ureg_program, glsl_to_tgsi_visitor, const gl_program, GLuint, const ubyte, const ubyte, const ubyte, const ubyte, const ubyte, GLuint, const ubyte, const ubyte, const ubyte)’: ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized] ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src, ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ inst->buffer_access, ~~~~~~~~~~~~~~~~~~~~ tex_target, inst->image_format); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here enum tgsi_texture_type tex_target; ^~~~~~~~~~ Fixes: `9f9ce1625f` ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:26:40 +01:00
Alejandro Piñeiro	f32b01ca43	glsl/linker: remove ubo explicit binding handling This is already handled at link_uniform_blocks, specifically at process_block_array_leaf. Additionally, this code did not handle arrays of arrays correctly. When creating the name of the block to set the binding, it only took into account the first level, so any attempt to set an explicit binding on an array-of-arrays UBO would trigger an assertion. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 08:32:42 +01:00
Mathias Fröhlich	77cb2fc0bd	mesa: Only update enabled VAO gl_vertex_array entries. Instead of updating all modified gl_vertex_array_object::_VertexArray entries just update those that are modified and enabled. Also release buffer object from the _VertexArray that belong to disabled attributes. v2: Also set Ptr and Size to zero. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:26:23 +01:00
Mathias Fröhlich	437cae411e	gallium: Mute arrays for several meta like callbacks. Set the _DrawArray pointer to NULL when calling into the Drivers Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks. This fixes an assert that gets uncovered when the following patch gets applied. v2: Mute from within the state tracker instead of generic mesa. v3: Avoid evaluating _DrawArrays from within st_validate_state. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 04:26:13 +01:00
Mathias Fröhlich	2f9eb0aad5	mesa: Fix VAO buffer object tracking. When changing the attribute binding in the VAO we also need to account for getting rid of non vbo bits from VertexAttribBufferMask. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:21:36 +01:00
Timothy Arceri	d8bca3809d	radeonsi/nir: gather some missing fs info Fixes some early-z arb_shader_image_load_store piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Timothy Arceri	c77078c942	ac: pass struct ac_llvm_context to emit_membar() Fixes segfault in piglit test: ./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Marek Olšák	12fd567c78	radeonsi: copy the NIR enablement debug bit to the shader cache flags When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 02:01:45 +01:00
Jason Ekstrand	8f20cf166e	intel/blorp: Use isl_aux_op instead of blorp_hiz_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1e941a0528	intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1810f965c8	anv: Allow fast-clearing the first slice of a multi-slice image Now that we're tracking aux properly per-slice, we can enable this for applications which actually care. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	de3be61801	anv/cmd_buffer: Rework aux tracking This commit completely reworks aux tracking. This includes a number of somewhat distinct changes: 1) Since we are no longer fast-clearing multiple slices, we only need to track one fast clear color and one fast clear type. 2) We store two bits for fast clear instead of one to let us distinguish between zero and non-zero fast clear colors. This is needed so that we can do full resolves when transitioning to PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear values in all sorts of places we wouldn't normally. 3) We now track compression state as a boolean separate from fast clear type and this is tracked on a per-slice granularity. The previous scheme had some issues when it came to individual slices of a multi-LOD image. In particular, we only tracked "needs resolve" per-LOD but you could do a vkCmdPipelineBarrier that would only resolve a portion of the image and would set "needs resolve" to false anyway. Also, any transition from an undefined layout would reset the clear color for the entire LOD regardless of whether or not there was some clear color on some other slice. As far as full/partial resolves go, the assumptions of the previous scheme held because the one case where we do need a full resolve when CCS_E is enabled is for window-system images. Since we only ever allowed X-tiled window-system images, CCS was entirely disabled on gen9+ and we never got CCS_E. With the advent of Y-tiled window-system buffers, we now need to properly support doing a full resolve of images marked CCS_E. v2 (Jason Ekstrand): - Fix a bug in the compressed flag offset calculation - Treat 3D images as multi-slice for the purposes of resolve tracking v3 (Jason Ekstrand): - Set the compressed flag whenever we fast-clear - Simplify the resolve predicate computation logic Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2cbfcb205e	anv/cmd_buffer: Move the mi_alu helper higher up Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2e69045c4d	anv/image: Simplify some verbose comments Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	f0523f70ef	anv: Use blorp_ccs_ambiguate instead of fast-clears Even though the blorp pass looks a bit on the sketchy side, the end result in the Vulkan driver is very nice. Instead of having this weird case where you do a fast clear and then maybe have to resolve, we just do the ambiguate and are done with it. The ambiguate does exactly what we want of setting all the CCS values to 0 which puts it into the pass-through state. This should also improve performance a bit in certain cases. For instance, if we did a transition from UNDEFINED to GENERAL for a surface that doesn't have CCS enabled all the time, we would end up doing a fast-clear and then a full resolve which ends up touching every byte in the main surface as well as the CCS. With the ambiguate pass, that transition only touches the CCS. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	84fd2ebfbc	anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	3ef8c4b2f5	anv/cmd_buffer: Pull the undefined layout condition into the if Now that this isn't a multi-case if and it's just the one case, it's a bit clearer if the condition is just part of the if instead of being pulled out into a boolean variable. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	857b5b5a7f	intel/blorp: Add a CCS ambiguation pass This pass performs an "ambiguate" operation on a CCS-compressed surface by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly detailed notion of how the CCS is laid out so this is fairly simple to do. On gen7, the CCS tiling is quite crazy but that isn't an issue because we can only do CCS on single-slice images so we can just blast over the entire CCS buffer if we want to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	13b621d6fd	anv: Only fast clear single-slice images The current strategy we use for managing resolves has an issue where we track clear colors and the need for resolves per-LOD but we still allow resolves of only a subset of the slices in any given LOD and doing so sets the "needs resolve" flag for that LOD to false while leaving the remaining layers unresolved. This patch is only the first step and does not, by itself, fix anything. However, it's fairly self-contained and splitting it out means any performance regressions should bisect to this nice obvious commit rather than to the giant "rework aux tracking" commit. Nanley and I did some testing and none of the applications we tested even tried to fast-clear anything other than the first slice of an image. The test was done by adding a printf right before we call blorp_fast_clear if we were ever going to touch any slice other than the first with a fast-clear. Due to the way the original code was structured, this would not have included applications which only cleared a subset of layers. The applications tested were: * All Sascha Willems demos * Aztec Ruins * Dota 2 * The Talos Principle * Mad Max * Warhammer 40,000: Dawn of War III * Serious Sam Fusion 2017: BFE While not the full list of shipping applications, it's a pretty good spread and covers most of the engines we've seen running on our driver. If this is ever shown to be a performance problem in the future, we can reconsider our strategy. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	571ed588ac	anv/cmd_buffer: Add a mark_image_written helper Currently, this helper does nothing but we call it every place where an image is written through the render pipeline. This will allow us to properly mark the aux state so that we can handle resolves correctly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	9876d6f0ef	anv/blorp: Add src/dst_level helper variables in CmdCopyImage Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	c180c2c868	anv/cmd_buffer: Add an anv_genX_call macro This is copied and pasted from the similar macro we added to ISL. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	ab7543b13d	anv/cmd_buffer: Generalize transition_color_buffer This moves it to being based on layout_to_aux_usage instead of being hard-coded based on bits of a priori knowledge of how transitions interact with layouts. This conceptually simplifies things because we're now using layout_to_aux_usage and layout_supports_fast_clear to make resolve decisions so changes to those functions will do what one expects. There is a potential bug with window system integration on gen9+ where we wouldn't do a resolve when transitioning to the PRESENT_SRC layout because we just assume that everything that handles CCS_E can handle it all the time. When handing a CCS_E image off to the window system, we may need to do a full resolve if the window system does not support the CCS_E modifier. The only reason why this hasn't been a problem yet is because we don't support modifiers in Vulkan WSI and so we always get X tiling which implies no CCS on gen9+. This patch doesn't actually fix that bug yet but it takes us the first step in that direction by making us actually pick the correct resolve op. In order to handle all of the cases, we need more detailed aux tracking. v2 (Jason Ekstrand): - Make a few more things const - Use the anv_fast_clear_support enum v3 (Jason Ekstrand): - Move an assert and add a better comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	151771b390	anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	bea7373c92	anv/image: Support color aspects in layout_to_aux_usage Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	b09464db42	anv/image: Add a helper for determining when fast clears are supported v2 (Jason Ekstrand): - Return an enum instead of a boolean v3 (Jason Ekstrand): - Return ANV_FAST_CLEAR_NONE instead of false (Topi) - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE - Add documentation for the enum values v4 (Jason Ekstrand): - Remove a dead comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1f7eee6bc1	anv/image: Update a comment This got lost in all of the aspect vs. plane rebasing of YCBCR. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	5c38ab8f07	anv/blorp: Rework HiZ ops to look like MCS and CCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1d473e26f2	anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what we want for blits/copies. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	42f1668a54	anv/blorp: Rework image clear/resolve helpers This replaces image_fast_clear and ccs_resolve with two new helpers that simply perform an isl_aux_op whatever that may be on CCS or MCS. This is a bit cleaner as it separates performing the aux operation from which blorp helper we have to call to do it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	482c24783e	intel/isl: Codify AUX operations in an enum Right now, we have different entrypoints and enums in blorp for these different operations. This provides us a central enum which we can begin to transition to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Gert Wollny	c36172e387	r600/sb: Check whether optimizations would result in reladdr conflict v2: * Check whether the node src and dst registers are NULL before using them. * Fix a typo in the commit message. Two cases are handled with this patch: 1. If copy propagation tries to eliminate a move from a relative array access then it could optimize MOV R1, ARRAY[RELADDR_1] MOV R2, ARRAY[RELADDR_2] OP2 R3, R1 R2 into OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2] which is forbidden, because there is only one address register available. 2. When MULADD(x,a,MUL(x,c)) is handled MUL TMP, R1, ARRAY[RELADDR_1] MULADD R3, R1, ARRAY[RELADDR_2], TMP by folding this into ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1] MUL R3, R1, TMP which is also forbidden. Test for these cases and reject the optimization if a forbidden combination of relative access would be created. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:00:38 +10:00
Glenn Kennard	1d871aa626	r600g: Implement spilling of temp arrays (v2) Pessimistically spills arrays if GPR limit is exceeded. v2: fix r600 support [airlied] Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:26 +10:00
Dave Airlie	22fc5eff80	r600/sb: handle scratch mem reads on r600 On r600 we use the scratch mem with read/read_ind, in that case sb should track the rw_gpr as a dst instead of a src. This stops the whole shader being optimised out. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:21 +10:00
Glenn Kennard	cd34deb585	r600g/sb: Add dependency tracking for scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:19 +10:00
Glenn Kennard	a100d906b2	r600g/sb: Support scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:16 +10:00
Glenn Kennard	6b4303f358	r600g: Implement scratch buffer state management (v2) v2: add Glenn's fixes Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:12 +10:00
Glenn Kennard	9d31596d7a	r600g: Add pending output function Spills have to happen after the VLIW bundle currently processed, so defer emitting the spill op. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:08 +10:00
Glenn Kennard	9c48a139b0	r600g: Support emitting scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:52:48 +10:00
Dave Airlie	2a891ed190	r600: fix texture gather swizzling. This fixes: KHR-GL45.texture_gather.swizzle on cayman and redwood. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:32:20 +10:00
Timothy Arceri	12a2350e6d	ac: add 64bit support to ac_find_lsb() v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	a9f6b392c7	ac: move get_elem_bits() to ac_llvm_build.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	19f9839f0b	ac: add 64bit bitCount support v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Samuel Pitoiset	bb750d265c	ac/nir: clean up handle_fs_outputs_post() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:33 +01:00
Samuel Pitoiset	528bc14fa5	ac/nir: add radv_load_output() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:30 +01:00
Samuel Pitoiset	834d9845ca	ac/shader: scan info about output PS declarations NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:27 +01:00
Samuel Pitoiset	a8e04e91de	ac/nir: add radv_export_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:26 +01:00
Samuel Pitoiset	e3cfd6b805	ac/nir: remove set but unused export_mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:24 +01:00
Samuel Pitoiset	724136d590	ac/nir: remove dead code in handle_vs_outputs_post() The memcpy can't be reached because the condition is always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:22 +01:00
Samuel Pitoiset	c63d8d0284	ac/nir: remove useless check in si_llvm_init_export_args() values can't be NULL because we use ac_build_export_null() now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:20 +01:00
Samuel Pitoiset	26ab5a4269	ac/nir: use ac_build_export_null() The number of enabled channels should be 0 when exporting null. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:11:44 +01:00
Samuel Pitoiset	bd9f7b7635	ac: add ac_build_export_null() helper Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 22:11:42 +01:00
Scott D Phillips	1f4d2433e7	meson: Add build option for tools Add a build option to control building some of the misc tools we have. Also set the executables to install, presumably you want that if you're asking for the build. v2: set 'install:' to the with_tools value, not true (Jordan) handle 'all' in the comma list (Dylan) Add freedreno's tools (Dylan) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-08 11:24:42 -08:00
Anuj Phogat	464d057c86	intel: Add Coffee Lake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-02-08 10:26:34 -08:00
Brian Paul	11e92889aa	gallium/util: silence clang warning in blitter code Silence "warning: comparison of constant 4294967295 with expression of type 'ubyte'". Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:31 -07:00
Brian Paul	4b0a45da25	tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output() So the function matches the prototype. Found with clang. v2: fix copy&paste error Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:19 -07:00
Brian Paul	d95c2d86cc	tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the value zero so there's no change in behavior. It seems funny to declare these fs input registers with constant interpolation. But it looks like ureg_DECL_input_layout() is not called anywhere and ureg_DECL_input() is only called from util_make_geometry_passthrough_shader(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	26948ba761	gallium/util: s/uint/enum tgsi_semantic/ in simple shader code Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	0f40f4ffda	tgsi: s/unsigned/enum pipe_shader_type/ in ureg code And add a default switch case to silence a compiler warning. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	c0dc337ecd	gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c And put static qualifier on const arrays. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	e55de6e20c	st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b9ff185e41	vbo: add a comment on vbo_draw_transform_feedback() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	93b3d38176	gallium/util: trivial whitespace/formatting fixes in u_blit.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5396f8546a	vbo: improve comments on vbo_draw_func() And rename a parameter name. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b03ade55b9	cso: add a couple sanity check assertions in cso_draw_vbo() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5cf342704d	st/mesa: rename some vars related to indirect draw count 'indirect_params' was a bit vague. Use the names that we use in gallium's pipe_draw_indirect_info. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Marek Olšák	d9e6e0bbe3	st/mesa: remove out_num_textures from update_textures Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Marek Olšák	08496c5d52	st/mesa: don't store non-fragment sampler states and views in st_context those are unused. st_context: 10120 -> 3704 bytes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Lionel Landwerlin	e843667733	i965: perf: cleanup detection of kernel support for loadable configs The initial revision of the patch adding loadable configs was testing the feature's availability by adding a new config successfully and then removing it. A second version tested the availability just by exercising the removal. But some unused code remained. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:52:14 +00:00
Lionel Landwerlin	bd6c0cab60	i965: perf: use drmIoctl() instead of ioctl() ioctl() might be interrupted, use drmIoctl() instead as it'll retry automatically. Fixes: `27ee83eaf7` "i965: perf: add support for userspace configurations" Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-08 10:51:40 +00:00
Lionel Landwerlin	0f952b778f	i965: perf: add debug messages for loaded configs This helps figuring out potential problems when metrics don't show up on frameretrace for example. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:51:01 +00:00
Dave Airlie	3f7a7bd897	r600: implement tg4 integer workaround. (v2) This ports the texture gather integer workaround from radeonsi. This fixes: KHR-GL45.texture_gather.plain-gather-uint/int* v2: add rect support, fix 2d array shadow Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:40 +10:00
Glenn Kennard	77b1b33724	r600: clean up initial shader register setup This is taken from Glenn Kennard's scratch series, but separated out as a cleanup by me. Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:35 +10:00
Roland Scheidegger	b936f4d1ca	r600: partly fix sampleMaskIn value The hw gives us coverage for pixel, not for individual fragment shader invocations, in case execution isn't per pixel (eg, unlike cm, actually cannot do "real" minSampleShading, it's either per-pixel or per-fragment, but it doesn't really make a difference here). Also, with msaa disabled, the hw still gives us a mask corresponding to the number of samples, where GL requires this to be 1. Fix this up by masking the sampleMaskIn bits with the bit corresponding to the sampleID, if we know this shader is always executed at per-sample granularity. (In case of a per-sample frequency shader and msaa disabled, the sampleID will always be 0, so this works just fine there.) Fixing this for the minSampleShading case will need a shader key (radeonsi uses the prolog part for) (for eg, could get away with a single bit, cm would need more bits depending on sample/invocation ratio, or read the bits from a uniform), unless we'd want to always use a sample mask uniform (which is probably not a good idea, as it would make the ordinary common msaa case slower for no good reason). This fixes some parts of piglit arb_sample_shading-samplemask (with fixed test), in particular those which use a sampleID, still failing others as expected. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	07d724326a	r600: clean up fragment shader input scan code For some reason, we were iterating through the code twice (first just for instructions needing barycentrics, then for instructions and input dcls). Move things around slightly so this is no longer necessary. There also was an unneeded enabling of the fixed_pt_position_gpr - this is only needed if the per-sample interpolation comes from an input, not from an instruction (just move the assert where it belongs) (since the sample id to sample from comes from a tgsi src in this case, and isn't sampleID). Otherwise there should be no functional change. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	6fd3c39590	mesa: (trivial) remove unused ignore_sample_qualifier_parameter This parameter for _mesa_get_min_incations_per_fragment() was once used by the intel driver, but it's long gone. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@vmware.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	becc7faae2	r600/cm: (trivial) code cleanup for emitting msaa state No functional change (compile tested only). Reviewed-by: Dave Airlie <airlied@redhate.com>	2018-02-08 04:07:52 +01:00
Brian Paul	b99cb13002	tgsi: use tgsi_semantic enum type in ureg code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	174f3a4ab7	st/mesa: use tgsi_semantic enum type Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	0f7be4fc16	tgsi: use TGSI enum types in ureg code v2: fix enum tgsi_interpolate_mode/loc typo. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:42:39 -07:00
Brian Paul	9f9ce1625f	st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	6321b1bd40	gallium/util: replace uint with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	15874338ff	gallium/util: replace unsigned with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Dave Airlie	5dd385f378	r600: fix rendering regression on r6/7 gpus Fixes: `2d5b5d267e` (r600: work out target mask at framebuffer bind.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 09:37:09 +10:00
Grazvydas Ignotas	f91aa68ac6	radeonsi: avoid int-to-pointer-cast warnings on 32bit I hope the actual dropping of MSB is ok, but that's what's already happened before this change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:13:58 +02:00
Grazvydas Ignotas	13ada91740	gallium/hud: update some query functions It seems these were missed when struct pipe_context * argument was added to hud_graph::query_new_value. Fixes: `3132afdf4c` "gallium/hud: pass pipe_context explicitly to most functions" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:12:07 +02:00
Roland Scheidegger	09f49b9e50	Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary" This reverts commit `6f82b8d8d0`. This broke scons build, and reportedly clover with autotools/meson too.	2018-02-07 23:47:39 +01:00
Marek Olšák	6f82b8d8d0	gallium: build ddebug, noop, rbug, trace as part of auxiliary Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU. (gallium build time is reduced by 15% when building only radeonsi) Non-recursive makefiles are great!	2018-02-07 22:08:34 +01:00
Roland Scheidegger	def09f8db0	u_blit: (trivial) fix bogus argument order for set_fragment_shader Amazingly this still worked sometimes, albeit I'm not even sure why... This fixes `d7bec6f7a6`.	2018-02-07 22:03:18 +01:00
Andres Rodriguez	83990dd529	mesa: fix incorrect type when allocating arrays The array members have type 'struct gl_buffer_object *' Found by coverity. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-07 14:50:21 -05:00
Roland Scheidegger	d7bec6f7a6	u_blit,u_simple_shaders: add shader to convert from xrbias format We need this to handle some oddball dx10 format (DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this format is very limited, hence we don't want to add it as a gallium format (we could not express the properties of this format as ordinary format properties neither, so like all special formats it would need specific code for handling it in any case). While here, also nuke the array for different shaders for different writemasks, as it was not actually used (always full masks are passed in for generating shaders). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-07 17:09:37 +01:00
Roland Scheidegger	afd1e9be17	u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask The writemask handling was busted, since writing defaults to output meant they got overwritten by the tex sampling anyway. Albeit the affected components were undefined, so maybe with some luck it still would have worked with some drivers - if not could as well kill it... (This would have affected u_blitter but not u_blit since the latter always used xyzw mask.) Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen	5d754872b5	autotools: Only build libmesa-st-tests-common.a for tests. We don't need the library if we don't build tests, and building it adds a dependency on gtest which adds a dependency on cxxabi.h. Fixes: `6569b33b6e` "mesa/st/tests: unify MockCodeLine* classes" Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-02-07 14:04:04 +01:00
Tapani Pälli	9d322fde97	i965: add __DRI2_BLOB support and set cache functions v2: adjust to change that moved cache from ctx to screen Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	ae00ef2702	disk cache: add callback functionality v2: add disk_cache_has_key, disk_cache_put_key support using blob cache (Nicolai, Jordan) v3: rename set_cb as put_cb to match existing naming (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6a651b6b77	disk cache: initialize cache path and index only when used This patch makes disk_cache initialize path and index lazily so that we can utilize disk_cache without a path using callback functionality introduced by next patch. v2: unmap mmap and destroy queue only if index_mmap exists Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	e8495646af	glsl/tests: changes to test_disk_cache_create test Next patch will allow disk_cache instance to be created without path set for it, modify some test cases that assume disk_cache creation to fail with invalid path. Creation should succeed but simple put/get test fail. v2: leave tests as is but check that both cache struct exists and try simple put/get that should fail with invalid path set (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	83c81b6cce	glsl/tests: move utility functions in cache_test Patch moves functions higher so that we can utilize them from test_disk_cache_create which is modified by next patch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6f5b57093b	egl: add support for EGL_ANDROID_blob_cache v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) adapt to earlier ctx->screen changes v3: remove useless checking, add _eglSetFuncName (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	cf4569da6b	dri: add interface for EGL_ANDROID_blob_cache extension v2: move from __DRIcontext to __DRIscreen (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Samuel Pitoiset	757d36ee70	ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Ported from RadeonSI. Only one F1 2017 shader is affected, code size decreased from 532 to 488 on both Polaris10 and Vega10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:13 +01:00
Samuel Pitoiset	2f54d7382d	ac/nir: avoid loading unused VS input components Polaris10: Totals from affected shaders: SGPRS: 122840 -> 120984 (-1.51 %) VGPRS: 78812 -> 78440 (-0.47 %) Spilled SGPRs: 177 -> 129 (-27.12 %) Code Size: 2950028 -> 2941276 (-0.30 %) bytes Max Waves: 17899 -> 17976 (0.43 %) Vega10: Totals from affected shaders: SGPRS: 117144 -> 115776 (-1.17 %) VGPRS: 77580 -> 77532 (-0.06 %) Spilled SGPRs: 0 -> 152 (0.00 %) Code Size: 3352656 -> 3347860 (-0.14 %) bytes Max Waves: 19756 -> 19866 (0.56 %) This increases SGPRs spilling a bit with Talos, but I have some other ideas that might reduce it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:09 +01:00
Samuel Pitoiset	1c57a6da5e	ac/shader: scan vertex inputs usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:07 +01:00
Iago Toral Quiroga	f474b19875	i965: allocate a SGVS element when VertexID or InstanceID are read Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS to put these beyond the last vertex element it seems that we still need to allocate the SGVS element, otherwise we have observed cases where we end up reading garbage. Specifically, the CTS test mentioned below was flaky with a fail rate of ~1% on some gen9+ platforms caused by reading garbage for the gl_InstanceID value. The flakiness goes away as soon as we start allocating the SGVS element. v2: - Do this for gen8+, not just gen9+, and pull the boolean outside the #if block (Jason) Fixes flaky test: KHR-GL45.vertex_attrib_64bit.limits_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-07 11:11:16 +01:00
Dylan Baker	c74719cf4a	glapi: fix check_table test for non-shared glapi with meson v2: - Add glapitable_h generated source to requirements Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-02-06 15:00:17 -08:00
Dylan Baker	002fbde71e	glapi: Don't search through subdirs from glapitable.h Because meson won't put it in that folder. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	aac3d01178	state_tracker: Don't build st-renumerate-test without shared glapi Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	0316aa432d	glapi: remove APPLE extensions from test Fixes: `7009955281` ("mesa: Remove GL_APPLE_vertex_array_object stubs") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	a4f1fc5dd1	glapi/check_table: Remove 'extern "C"' block Using 'extern "C"' around includes is always incorrect, as the header may contain C++ symbols (as it does in this case), which means it cannot use C linkage. In this case the header has a template in it, which obviously cannot be linked with C linkage rules. Fixes: `a29ad2b421` ("mesa/tests: Add tests for the generated dispatch table") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	105178db8f	meson: fix test source name for static glapi fixes: `43a6e84927` ("meson: build mesa test.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	9be7487f30	glapi: don't walk backwards for includes Instead just set the proper -I flags and include it from a more standard path. In this case we'll add -Isrc/mesa (which is common), and #include main/foo.h. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Brian Paul	e7a4536e64	mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray Since the type is gl_vertex_array. Update comment to explain that these arrays are only used by the VBO module. Also rename some local variables in _mesa_update_vao_derived_arrays(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:36:47 -07:00
Brian Paul	d9ab39ea65	mesa: minor whitespace fixes, line wrapping in texcompress.c Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00
Brian Paul	b38196b452	mesa: simplify _mesa_get_compressed_formats() Instead of testing for formats==NULL everywhere, just point formats at a dummy array which will be discarded. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00
Vlad Golovkin	d919ff0f27	util: remove redundant check for the __clang__ macro Clang defines __GNUC__ macro, so one doesn't need to check __clang__ macro in this particular case. v2: added comment as per Brian Paul's suggestion Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 15:23:26 -07:00
Brian Paul	77bc74e674	st/mesa: use st_access_flags_to_transfer_flags() helper in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Brian Paul	1852a2e1a2	st/mesa: refactor st_bufferobj_map_range() Use a new helper function, st_access_flags_to_transfer_flags(), to convert the GL_MAP_x flags to PIPE_TRANSFER_x flags. We'll be able to use this function in a couple other places. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Brian Paul	8a32dd2ec9	st/mesa: refactor bufferobj_data() Split out some of the code into three new helper functions: buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(), buffer_usage() to make the code more manageable. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Samuel Pitoiset	3488a3f033	radv: run nir_opt_shrink_load LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:44 +01:00
Samuel Pitoiset	e68562b94b	nir: add nir_opt_shrink_load pass This is a very simple pass that just shrinks load_push_constant intrinsics when some components are unused. For now, it can just shrink vec4 to vec3, vec3 to vec2 and so on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:39 +01:00
Timothy Arceri	e2ea9e1191	radeonsi/nir: add nir support for compiling compute shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	9c52902c76	ac/radeonsi: add num_work_groups to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f12e2f9c12	ac: implement nir_intrinsic_shader_clock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	b7b89bbddb	ac/radeonsi: create ac_build_shader_clock() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	d116af383f	ac/radeonsi: add load_local_group_size() to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f6932d1ef3	radeonsi: add get_block_size() helper This will be reused by the nir backend in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	e3ebffdbb0	ac: don't call emit_outputs() for compute Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	c8066cdfa7	ac/radeonsi: add local_invocation_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	fa5239c153	ac/radeonsi: add workgroup_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	64c10c9737	radeonsi/nir: gather some compute info in si_nir_scan_shader() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	1142b1d3e1	radeonsi/nir: always set input_usage_mask as using all components This fixes a regression for now, in the future we should gather the used components properly. V2: just set for VS and correctly handle doubles Fixes: `be973ed21f` "radeonsi: load the right number of components for VS inputs and TBOs" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:38:52 +11:00
Timothy Arceri	ffeebcfa7e	i965: remove unused brw_nir_lower_cs_shared() This has been unused since `8761a04d0d`. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-02-07 08:38:01 +11:00
Bas Nieuwenhuizen	a3e42e7a69	vulkan/wsi: Fix OOM behavior with prime images. Fixes: `d50937f137` "vulkan/wsi: Implement prime in a completely generic way" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-06 21:52:39 +01:00
Bas Nieuwenhuizen	c7d640fbbf	ac/nir: fix GS load input type. Fixes: `df1d5174fc` "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-06 21:52:38 +01:00
Mathias Fröhlich	e8a9473d32	mesa: Factor out _mesa_disable_vertex_array_attrib. And use it in the enable code path. Move _mesa_update_attribute_map_mode into its only remaining file. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Mathias Fröhlich	236657842b	vbo: Move vbo_rebase into its only caller module tnl. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Mathias Fröhlich	2313c33e95	mesa: Use atomics for buffer objects reference counts. The mutex is currently used for reference counting and updating the minmax index cache. The change uses atomics directly for reference counting and the mutex for the minmax cache. This is safe since the reference count is not modified beside in _mesa_reference_buffer_object where atomics aim to be used. While using the minmax cache, the calling code holds a reference to the buffer object. Thus unreferencing or even referencing the buffer object does not need to be serialized with accessing the minmax cache. The change reduces the time _mesa_reference_buffer_object_ takes by about a factor of two when looking at perf results for some of my favorite use cases. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Dave Airlie	6c691081a1	r600: fixup sparse color exports. If we have gaps in the shader mask we have to have 0x1 in them according to a comment in radeonsi, and this is required to fix the test at least on cayman. We also need to record the highest one written to write to the ps exports reg. This fixes: KHR-GL45.enhanced_layouts.fragment_data_location_api Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:59 +10:00
Dave Airlie	2d5b5d267e	r600: work out target mask at framebuffer bind. If we only get 1,2,3,6 framebuffers we want a sparse target mask. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:55 +10:00
Dave Airlie	5b14e06d8b	r600: work out shader export mask at shader build time (v1.1) Since enhanced layouts allows setting specific MRT outputs, we can get sparse outputs, so we have to calculate the shader mask earlier. v1.1: update checks for state update (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:27 +10:00
Dave Airlie	f292eceae1	r600: fix xfb stream check. This fixes: KHR-GL45.enhanced_layouts.xfb_vertex_streams Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	680cb9898a	r600/compute: add render cond support. Set render cond and emit atom. Fixes: KHR-GL45.compute_shader.conditional-dispatching Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	5fd7b282b3	r600: fix not-very indirect compute We need to get the grid sizes earlier to fill in to the const buffer. Fixes: KHR-GL45.compute_shader.built-in-variables and KHR-GL45.compute_shader.dispatch-indirect Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	00a112641b	r600: overhaul buffer resource query. This cleans up and fixes the previous fix even more. Buffers from textures start at max const, buffers from buffers/images come in from the 168 offset. This fixes a bunch of: KHR-GL45.shader_storage_buffer_object* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	736b150768	r600/eg: fix buffer sizing. For buffers we want the size in bytes, For images we want it in elements. This fixes: KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	c9c4f0b722	r600/images: set offset for compute shaders with number of declared samplers for frag shaders we get a value in the key, I expect I need to make compute work better Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	ab5cee4c24	r600/compute: only mark buffer/image state dirty for fragment shaders The compute emission path always emits this currently, and emitting it on the fragment path breaks the blitter. This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	4e3b43f180	r600/atomic: fix ATOMCAS instruction. This has 4 srcs. This fixes: KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	8bdad9fa1f	r600/sb/cayman: fix indirect ubo access on cayman With sb enabled on cayman, this was overwriting the proper cf index value with random ones if the dst gpr was 2 or 3, only save the value for a MOVA instruction. Fixes: KHR-GL45.gpu_shader5.uniform_blocks_array_indexing (on cayman with sb) Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	012100b809	r600/eg: use texture target to pick array size not view target (v2) This fixes a few CTS cases in : KHR-GL45.texture_view.view_sampling some multisample cases are still broken, but not sure this is the same problem. v2: fix more cases Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	e7e81f362d	radv: don't support tc-compat on multisample d32s8 at all. RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: `f4c534ef6` (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 19:56:00 +00:00
Michal Navratil	4081e08896	winsys/amdgpu: allow non page-aligned size bo creation from pointer Fix INVALID_OPERATION caused by BufferData with target EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is not page aligned. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-06 18:51:12 +01:00
Jon Turney	9440599c8e	meson: ensure xmlpool/options.h is generated for libgallium In file included from ../src/gallium/targets/dri/target.c:1: In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8: ../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found See also `26bde1e3`. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:56:12 +00:00
Andres Gomez	1ec88755c2	vbo: provide 64bits support to print_draw_arrays Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:29 +02:00
Andres Gomez	0057ae4038	vbo: take into account the size when printing VAO elements When using print_draw_arrays for debugging, we were printing an "n" amount of vertex but that meant not to print all the size in the "n" vertex, depending on the stride used. Now we print the whole size in the "n" vertex. Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:23 +02:00
Andres Gomez	c9325b4fa9	vbo: print first element of the VAO when the binding stride is 0 Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:12 +02:00
Iago Toral Quiroga	a5053ba27e	anv/device: initialize the list of enabled extensions properly The loop goes through the list of enabled extensions marking them as enabled in the list, but this relies on every other extension being initialized to false by default. This bug would make us, for example, advertise certain device extension entry points as available even when the corresponding extensions had not been enabled. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `abc62282b5` "anv: Add a per-device table of enabled extensions" Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-02-06 07:51:00 +01:00
Iago Toral Quiroga	ef439a4fdc	spirv: split constant initializers on in/out structs The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-06 07:50:18 +01:00
Iago Toral Quiroga	1d20001d97	i965/nir: do int64 lowering before optimization Otherwise loop unrolling will fail to see the actual cost of the unrolling operations when the loop body contains 64-bit integer instructions, and especially when the divmod64 lowering applies, since its lowering is quite expensive. Without this change, some in-development CTS tests for int64 get stuck forever trying to register allocate a shader with over 50K SSA values. The large number of SSA values is the result of NIR first unrolling multiple seemingly simple loops that involve int64 instructions, only to then lower these instructions to produce a massive pile of code (due to the divmod64 lowering in the unrolled instructions). With this change, loop unrolling will see the loops with the int64 code already lowered and will realize that it is too expensive to unroll. v2: Run nir_algebraic first so we can hopefully get rid of some of the int64 instructions before we even attempt to lower them. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-06 07:49:27 +01:00
Ilia Mirkin	02a6d901ee	mesa: add OES_EGL_image_external_essl3 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-06 07:28:11 +02:00
Vinson Lee	fe32f796f2	r600/fp64: Fix build. CC r600_shader.lo r600_shader.c: In function ‘egcm_int_to_double’: r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’? if (ctx.bc->chip_class == CAYMAN) ^ -> Fixes: `35b4301577` ("r600/fp64: fix integer->double conversion") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 15:32:20 -08:00
Dave Airlie	35b4301577	r600/fp64: fix integer->double conversion Doing a straight uint/int->fp32->fp64 conversion causes some precision issues, Roland suggested splitting the integer into two portions and doing two separate int->fp32->fp64 conversions then adding the results. This passes the tests in CTS and piglit. [airlied: fix cypress conversion opcodes] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 08:21:48 +10:00
Samuel Pitoiset	0170ae1e23	ac/nir: remove emission of nir_op_fdiv RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 23:09:34 +01:00
Jon Turney	b5af199f92	travis: add macOS meson build v2: Simplify set of options now we have better defaults Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:42:01 +00:00
Jon Turney	80bc41b2ec	meson: osx ld doesn't support --build-id Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 19:40:43 +00:00
Jon Turney	ea8730024f	meson: build src/glx/apple Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:40:43 +00:00
Dylan Baker	569628dd24	meson: set apple glx defines Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:40:43 +00:00
Jon Turney	4772909447	meson: better defaults for osx, windows and cygwin set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers' and 'platforms' options for osx, windows and cygwin, adding cygwin where appropriate. v2: error() for unknown OS Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 19:34:37 +00:00
Matt Turner	e2b31e9acf	i965: Move mistakenly placed line Ken called this out in review, but it seems I forgot to make the change. I noticed that the control flow annotations in the fragment shader disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not correct, and moving this line to the correct place fixes it.	2018-02-05 09:50:56 -08:00
Juan A. Suarez Romero	4195eed961	glsl/linker: check same name is not used in block and outside According to the OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take into account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching remain fixed, and the dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression, which was introduced by the previous commit, is also fixed. v2: use helper variables (Matteo Bruni) Fixes: `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <mark.a.janes@intel.com> CC: "18.0" <mesa-stable@lists.freedesktop.org> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-02-05 18:10:43 +01:00
Juan A. Suarez Romero	3d14e72057	mesa: enable ASTC format for CompressedTexSubImage3D If extensions GL_KHR_texture_compression_astc_hdr or GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC formats are supported in CompressedTexImage3D. Fixes KHR-GLES2.texture_3d. with this format. CC: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-02-05 17:00:19 +01:00
Stephan Gerhold	02e2009b92	util/build-id: Fix address comparison for binaries with LOAD vaddr > 0 build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD segment has a virtual address other than 0x0. For most shared libraries, the first LOAD segment has vaddr=0x0: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000 LOAD 0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW 0x1000 However, compiling the Intel Vulkan driver as 32-bit binary on Android produces the following ELF header with vaddr=0x8000 instead: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000034 0x00008034 0x00008034 0x00100 0x00100 R 0x4 LOAD 0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000 LOAD 0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW 0x1000 build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr() and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a different memory address, e.g.: dli_fbase=0xd8395000 (offset 0x8000) dlpi_addr=0xd838d000 At least on glibc and bionic (Android) dli_fbase refers to the address where the shared object is mapped into the process space, whereas dlpi_addr is just the base address for the vaddrs declared in the ELF header. To compare them correctly, we need to calculate the start of the mapping by adding the vaddr of the first LOAD segment to the base address. Note: musl users will need the following patch. https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac Cc: Chad Versace <chadversary@chromium.org> Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642 Fixes: `5c98d38` "util: Query build-id by symbol address, not library name" Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-05 14:26:33 +00:00
Boyuan Zhang	d645b0850a	radeonsi: enable vcn encode for HEVC main Enable vcn encode for HEVC main profile on Raven. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	5534a2791f	st/va: implement HEVC encode functions Implement HEVC encode functions based on VAAPI HEVC encode interface. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	9ac50a2e0c	st/va: add HEVC encode functions Add a separate file for HEVC encode functions. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	66087d8a2d	st/va: enable dual instances encode only for H264 Logic related to dual instances encode should only be done for H264, not other codecs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	a9c0861c6c	st/va: add entrypoint check for HEVC Add entrypoint check for HEVC to differentiate decode and encode jobs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	ecc3944344	st/va: add HEVC picture desc Add HEVC picture desc, and add codec check when creating and destroying context. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	9393b53c29	st/va: move H264 enc functions into separate file Move all H264 encode related functions into separate file. Similar to VAAPI decode side, there will be separate file for each codec on encode side as well. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	b391d34916	radeon/vcn: add header implementations for HEVC Implement encoding of sps, pps, vps, aud, and slice headers for HEVC based on HEVC specs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	fdc952b320	radeon/vcn: add ib implementations for HEVC Implement required ibs for vcn HEVC encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	5ab73edddb	radeon/vcn: support picture parameters for HEVC Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that it can be used for different codecs. Add functions to handle picture parameters that will be used for HEVC encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	db67d04df3	radeon/vcn: add vcn encode interface for HEVC Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	f410936439	vl: add parameters for HEVC encode Add HEVC encode interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Eric Anholt	aa2f609f70	broadcom/vc5: Ignore samplers for finding uniform offsets. Fixes: KHR-GLES3.shaders.struct.uniform.sampler_array_fragment KHR-GLES3.shaders.struct.uniform.sampler_array_vertex KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex	2018-02-05 13:56:02 +00:00
Eric Anholt	63a8a0f3c0	broadcom/vc5: Fix non-mipfiltered sampling. We need to clamp the LOD to 0 if mip filtering is disabled. This is part of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.	2018-02-05 13:53:38 +00:00
Eric Anholt	e29988c908	broadcom/vc5: Fix "hardwrae" typo in a field name in XML.	2018-02-05 13:53:38 +00:00
Samuel Pitoiset	a1d568c830	ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips Fixes: `df1d5174fc` ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 11:05:52 +01:00
Eric Anholt	8bb000f460	broadcom/vc5: Try to merge more than 2 QPU instructions together. Obviously it would be good to have an ADD and a MUL and a signal together, but we can even potentially have multiple signals merged, as well. total instructions in shared programs: 100423 -> 97874 (-2.54%) instructions in affected programs: 78812 -> 76263 (-3.23%)	2018-02-05 09:29:37 +00:00
Eric Anholt	dc78643ace	broadcom/vc5: Remove no-op MOVs after register allocation. We emit some MOVs to track lifetimes of payload registers, but we don't need there to be actual MOV instructions for them. total instructions in shared programs: 101045 -> 100423 (-0.62%) instructions in affected programs: 37083 -> 36461 (-1.68%)	2018-02-05 09:29:37 +00:00
Eric Anholt	f3978a7380	broadcom/vc5: Add missing shader-db instruction counting. I must have misplaced it in the instruction packing rework.	2018-02-05 09:29:37 +00:00
Dave Airlie	7801425028	r600: fix resq for buffer images. If this is an image buffer, we need to calculate the correct resource id. Fixes: KHR-GL45.shader_image_size.* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-05 05:15:41 +10:00
Dave Airlie	6c1432f0be	r600/eg: fix cube map array buffer images. This fixes a crash in: KHR-GL45.texture_cube_map_array.texture_size_compute_sh. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-05 05:14:56 +10:00
Marek Olšák	af3685d149	mesa: change ctx->Color.ColorMask into a 32-bit bitmask 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. This is easier to work with. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-04 01:50:10 +01:00
Jordan Justen	83e60ce927	i965: Create new program cache bo when clearing the program cache When the disk shader cache CI testing was enabled, we started noticing occasional failures on deqp test runs. (Mainly SNB, rarely HSW) Before this change, when we cleared the (in memory) program cache we reused the same bo. Since the disk shader cache quickly restores programs, it appears that this would lead to overwrites of the older program binaries in the in memory program cache that apparently were still executing in some cases. If these programs were still executing, this could cause a GPU hang. This issue is probably not disk shader cache specific, but may have been hidden due to the compiler taking time to recompile programs after the cache was cleared. v2: * Don't add `copy` param to brw_cache_new_bo (Ken) * Call from brw_program_cache_check_size (Ken) Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-03 12:16:58 -08:00
Jason Ekstrand	589e9db23f	aubinator: Multiply count by 4 to compute buffer sizes The count field is in terms of dwords and not bytes. In `7d4007d58a`, I fixed one instance of this but missed another.	2018-02-02 22:30:56 -08:00
Eric Anholt	2e746bc63d	broadcom/vc5: Enable UIF XOR on textures. This should increase performance by reducing SDRAM bank conflicts when crossing between UIF columns (particularly on power-of-two height textures). The uif_xor_disable setup is dropped, since we need to allow XOR on lower miplevels even when level 0 is XOR. The level 0 force UIF and level 0 XOR flags should handle setting XOR properly on imported buffers.	2018-02-02 16:50:02 -08:00
Eric Anholt	6a862b0de7	broadcom/vc5: Fix alignment of miplevel 1 with UIF. The alignment here means that we can't get back the padded height from the size/stride any more, so it's now a field in the slice as well. Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.	2018-02-02 16:27:49 -08:00
Eric Anholt	5c57e0a549	broadcom/vc5: Switch our RGBA4 support to the new gallium format. Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for GL_RGBA4, GL_RGB4, GL_RGBA2, etc.	2018-02-02 16:27:49 -08:00
Eric Anholt	2a97f1d3ef	gallium: Add a new A4B4G4R4 pipe format for Broadcom. The VC5 HW puts A in the low bits and R in the high bits. We can't just swizzle in the shaders because the blending HW can't pick what channel A is in, so make a new format to match it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-02 16:27:49 -08:00
Eric Anholt	1429cd74c2	mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases. swapBytes operates on bytes, not 4-bit channels, so you can't just take non-swapBytes cases and flip the REV flag. Avoids piglit texture-packed-formats regressions when enabling the ABGR4444 format. Fixes: `c5a5c9a7db` ("mesa/formats: add new mesa formats and their pack/unpack functions.") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-02 16:27:49 -08:00
George Kyriazis	bbef9474fa	meson/swr: Updated copyright dates cc: mesa-stable@lists.freedesktop.org cc: dylan@pnwbakers.com Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-02 17:43:07 -06:00
George Kyriazis	16bf813830	meson/swr: re-shuffle generated files Move generated files from codegen/meson.build to other directories, in order to satisfy generated include file dependencies Add correct file lists for architecture-specific libraries. cc: mesa-stable@lists.freedesktop.org cc: dylan@pnwbakers.com Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-02 17:43:00 -06:00
Marek Olšák	3bf1e036e8	amd: remove support for LLVM 3.9 Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 23:47:40 +01:00
Dylan Baker	c75a4e5b46	meson: Check for actual LLVM required versions Currently we always check for 3.9.0, which is pretty safe since everything except radv work with >= 3.9 and 3.9 is pretty old at this point. However, radv actually requires 4.0, and there is a patch for radeonsi to do the same. Fixes: `673dda8330` ("meson: build "radv" vulkan driver for radeon hardware") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 13:22:58 -08:00
Dylan Baker	d7235ef83b	meson: Don't confuse the install and search paths for dri drivers Currently there is not a separate option for setting the search path of DRI drivers in meson, like there is in scons and autotools. This is an oversight and needs to be fixed. This adds an extra option `dri-search-path`, which will default to the value of `dri-drivers-path`, like autotools does. v2: - Split input list before joining. v3: - use : instead of ; as the delimiter. The autotools help string incorrectly says ; but the code uses : v4: - Take list in pre : delimited form (Ilia) - Ensure that the dri-search-path is absolute when using dri_drivers_path Fixes: `db9788420d` ("meson: Add support for configuring dri drivers directory.") Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v2) Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)	2018-02-02 11:01:42 -08:00
Marek Olšák	847d0a393d	radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-02 16:46:22 +01:00
Jon Turney	b3a1d9588e	travis: add osx autotools build Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	4701379d96	travis: pip -> pip2 On travis, for OSX, python2 from homebrew is pre-installed. per [1]: python points to the macOS system Python (with no manual PATH modification) python2 points to Homebrew’s Python 2.7.x (if installed) python3 points to Homebrew’s Python 3.x (if installed) pip doesn't exist pip2 points to Homebrew’s Python 2.7.x’s pip (if installed) pip3 points to Homebrew’s Python 3.x’s pip (if installed) We will end up using 'python2' for building mesa. Just use 'pip2' instead of 'pip', as that seems to work for all platforms on travis. [1] https://docs.brew.sh/Homebrew-and-Python.html Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	7d1ec6d6a9	travis: conditionalize building of prerequisites on if OS=linux Use a '|' YAML literal block to avoid the convoluted syntax needed to put the entire conditional on a single line. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	63041ba613	glx/test: fix building for osx An additional stub for applegl_create_context() is needed Cannot test indirect API as it's not built on osx, currently Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Andres Gomez	4761a8fea6	i965: check if upload is 0 explicitly, when downsizing a format downsize_format_if_needed takes an integer as number of uploads parameter. Hence, let's do an integer comparison instead of a boolean check, since that is confusing. Since we are at it, fix a couple of wrongly tabbed indents. Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-02-02 16:32:30 +02:00
Marek Olšák	51d36f5e02	mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change This only affects drivers that set DriverFlags.NewBlend. v2: - fix typo advanded -> advanced - return "enum gl_advanced_blend_mode" from _mesa_get_advanced_blend_sh_constant - don't call FLUSH_VERTICES twice Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-02 15:06:47 +01:00
Samuel Pitoiset	df1d5174fc	ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load The old one generates useless instructions in there, found while comparing geometry shaders between RadeonSI and RADV. This improves all Vulkan demos that use geometry shaders, +4% for deferredshadows, +9% for viewportarray, +7% for geometryshader on Polaris10. This seems to also improve DOW3 a little bit (+1%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 12:32:21 +01:00
Dave Airlie	f9c121c420	r600/eg: add crap indirect compute support. I think the cp packets can be made work, but I think it might need a kernel change, so for now just do the worst thing. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-02 16:50:18 +10:00
Jason Ekstrand	2f7205be47	i965: Call prepare_external after implicit window-system MSAA resolves This fixes some rendering corruption in a couple of Android apps that use window-system MSAA. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-01 21:45:25 -08:00
Roland Scheidegger	c2f0e08857	r600: don't do stack workarounds for hemlock By the looks of it it seems hemlock is treated separately to cypress, but certainly it won't need the stack workarounds cedar/redwood (and seemingly every other eg chip except cypress/juniper) need. (Discovered by accident.) Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-02-02 01:46:43 +01:00
Dave Airlie	8fa5aade43	r600: initial attempt at gl_HelperInvocation (v3) This passes the CTS and piglit tests. This also disables sb for helper invocations until it doesn't mess up the VPM flags. Thanks to Ilia and Glenn for advice, and Roland for working out the working evergreen path. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-02 09:46:05 +10:00
Bas Nieuwenhuizen	2ffe395cba	radv: Don't expose VK_KHX_multiview on android. deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-01 23:32:48 +01:00
Mathias Fröhlich	5b3d58520f	vbo: Simplify input array distribution for dlist type draws. Using the newly introduced VAO array maps, we can simplify vbo_bind_vertex_list. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:08 +01:00
Mathias Fröhlich	fb10a7b7b0	vbo: Simplify input array distribution for imm type draws. Using the newly introduced VAO array maps, we can simplify vbo_exec_bind_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:08 +01:00
Mathias Fröhlich	44b1454b96	vbo: Simplify input array distribution for array type draws. Using the newly introduced VAO state variable, we can simplify recalculate_input_bindings. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:07 +01:00
Mathias Fröhlich	3d4fb879dd	vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps. Instead of each context having its own map instance for this purpose, use a global static const map. v2: s,unsigned char,GLubyte,g s,_VP_MODE_MAX,VP_MODE_MAX,g Change comment style. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:07 +01:00
Mathias Fröhlich	b4fd63015a	mesa: Track position/generic0 aliasing in the VAO. Since the first material attribute no longer aliases with the generic0 attribute, only aliasing between generic0 and position is left and entirely dependent on the enabled state of the VAO. So introduce a gl_attribute_map_mode in the VAO that is used to track how the position and the generic 0 attribute alias. Provide a static const array that can be used to map from vertex program input indices to VERT_ATTRIB_* indices. The outer dimension of the array is meant to be indexed directly by the new VAO member variable. Also provide methods on the VAO to convert bitmasks of VERT_BIT's from the VAO numbering to the vertex processing inputs numbering. v2: s,unsigned char,GLubyte,g s,_ATTRIBUTE_MAP_MODE_MAX,ATTRIBUTE_MAP_MODE_MAX,g Change comment style, add comments. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	186f03cfb0	mesa: Put materials at the end of the generic block. The materials are now moved to the end of the generic attributes block to the range 4-15. Before, the way the position and generic 0 attribute is handled was dependent on the presence and kind of the currently attached vertex program. With this change the way the position attribute and the generic 0 attribute is treated only depends on the enabled flag of those two arrays. This will later help to untangle the update dependencies between enabled arrays and shader inputs. v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	38b41fd718	mesa: Use defines for the aliased material array attributes. Instead of just assuming that the material attributes overlap with the generic attributes 0-12, give them symbolic defines so that we can more easily move them to another range. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	f37e29ac22	vbo: Correctly handle attribute offsets in dlist draw. When executing a display list draw, for the offset list to be correct, the offset computation needs to accumulate all attribute size values in order. Specifically, if we are shuffling around the position and generic0 attributes, we may violate the order or if we do not walk the generic vbo attributes we may skip some of the attributes. Even if this is an unlikely usecase we can fix this use case by precomputing the offsets on the full attribute list and store the full offset list in the display list node. v2: Formatting fix v3: Rebase Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:05 +01:00
Brian Paul	7a044ef68b	gallivm/llvmpipe: add const qualifiers on sampler variables Once a lp_build_sampler_soa or lp_build_sampler_aos object is created, it should never be modified. Found by inspection. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-01 14:19:58 -07:00
Brian Paul	1bdbeae17c	vbo: change an argument in vbo_draw_indirect_prims() In vbo_draw_indirect_prims() pass the 'indirect_data' argument to vbo->draw_prims(). All the callers are passing ctx->DrawIndirectBuffer so this should be no functional change. Add a (temporary) assertion to be sure. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	1b7ad3ae97	vbo: add comments on the VBO draw function typedefs And rename indirect_params -> indirect_draw_count_buffer and indirect_params_offset -> indirect_draw_count_offset to be more specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	c7bf05c833	vbo: s/drawcount/drawcount_offset This parameter (from the glMultiDrawArraysIndirectCountARB function) is poorly named. It's an offset into the buffer which contains the number of primitives to draw. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	b0a2f38db9	vbo: use vbo local var for draw call in vbo_save_playback_vertex_list() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	84c3641864	svga: remove unneeded #includes in svga_pipe_draw.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	fa98730bf3	svga: whitespace/formatting fixes in svga_pipe_draw.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	7a1401938b	svga: clean up retry_draw_range_elements(), retry_draw_arrays() Get rid of a bunch of goto spaghetti. Remove unneeded do_retry parameter. No Piglit changes. Also tested w/ Google Earth and other apps. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	c744289552	svga: remove unused min/max_index params to draw_vgpu10() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Eric Anholt	06858c7348	broadcom/vc5: Fix image_h setup for both loads and stores. The image_h for the tiling algorithm needs to be the padded-to-a-uifblock height of the level, not the unpadded height or the height of level 0. Fixes some cases of KHR-GLES3.texture_repeat_mode.* and depthstencil-render-miplevels.	2018-02-01 11:02:29 -08:00
Eric Anholt	5329f35ea1	broadcom/vc5: Add appropriate height padding for bank conflicts. I thought I didn't need this because I was doing level-0-always-UIF and that the pad there would propagate down, but it turns out that for level 1 the padding ends up being chosen by the HW. This brings us closer to being able to turn on UIF XOR for increased performance, as well.	2018-02-01 11:02:29 -08:00
Eric Anholt	dea902c933	broadcom/vc5: Simplify separate stencil surface setup. If we just make another gallium surface for the separate stencil, it's a lot easier to keep track of which set of fields we're using in RCL setup. This also incidentally fixes a little bug in setting up the surface's padded height for separate stencil when the UIF-ness changes at different levels of Z versus stencil.	2018-02-01 11:02:29 -08:00
Eric Anholt	7239b3edbe	broadcom/vc5: Rename the UIFCFG register in the UAPI. This matches the naming of the other hub regs we get, and I don't know for sure if UIFCFG will be the same register between the hub and the cores on all versions.	2018-02-01 11:02:29 -08:00
Eric Anholt	353b42ccc7	broadcom/vc5: Fix a segfault on mix of booleans. We don't have a src1 to look up if the compare instruction is "i2b".	2018-02-01 11:02:29 -08:00
Eric Anholt	eb765394c2	broadcom/vc5: Skip over missing color buffers for a couple of checks. Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2	2018-02-01 11:02:29 -08:00
Eric Anholt	aec066c7aa	broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL.	2018-02-01 11:02:29 -08:00
Baldur Karlsson	030821a873	mesa: fix query of GL_TEXTURE_COMPRESSION_HINT_ARB Fixes: `f96a69f916` ("mesa: replace GLenum with GLenum16 in common structures (v4)") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104908 Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 11:58:02 -07:00
Lucas Stach	0c71a19fe4	renderonly: fix dumb BO allocation for non 32bpp formats Take into account the resource format, instead of applying a hardcoded 32bpp. This not only over-allocates 16bpp formats, but also results in a wrong stride being filled into the handle. Fixes: `848b49b288` ("gallium: add renderonly library") CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-01 19:36:17 +01:00
Kenneth Graunke	85ec7abc3f	intel/decoder: Fix control / evaluation label mixup. Trivial. DS is TES, HS is TCS.	2018-02-01 09:44:15 -08:00
Kenneth Graunke	c3cd2aac27	i965: Bump official kernel requirement to Linux v3.9. In commit `3f353342a6` (present in 17.3.0) we started unconditionally using I915_EXEC_NO_RELOC, which was introduced in Linux v3.9. ChromeOS kernel 3.8 has backported this, so it should work too. Running on older kernels would likely result in every single batch being rejected by the kernel, which is pretty catastrophic. Yet, it appears that nobody noticed. So, let's just bump the official requirement and move forward ever so slowly. Fixes: `3f353342a6` ("i965: Use I915_EXEC_NO_RELOC") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 07:58:58 -08:00
Marc Dietrich	4c5f0b4fd4	meson: don't install windows headers on non-windows platforms Only dive into the windows subdir if windows platform is selected. Signed-off-by: Marc Dietrich <marvin24@gmx.de> Fixes: `5ef75cb02b` "meson: build src/glx/windows" Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-01 15:33:02 +00:00
Marek Olšák	71c6f64e54	radeonsi: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	b0a6053a99	ac/nir: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	bac9fa9f17	ac: add glc parameter to ac_build_buffer_load_format Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	be973ed21f	radeonsi: load the right number of components for VS inputs and TBOs The supported counts are 1, 2, 4. (3=4) The following snippet loads float, vec2, vec3, and vec4: Before: buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904 buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507 After: buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04 buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805 buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	472361dd7e	radeonsi: remove unused si_shader_context members Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Jon Turney	d3540b405b	glx/apple: locate dispatch table functions to wrap by name Avoid reaching into the dispatch table internals (and thus having to deal with the complexities of remap etc.) by identifying functions to wrap by name. See: https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq. https://bugs.freedesktop.org/show_bug.cgi?id=90311 Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:14:08 +00:00
Jon Turney	b37b7b42dc	glx/apple: include util/debug.h for env_var_as_boolean prototype mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration] mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration] Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:14:02 +00:00
Jon Turney	f8ed9f24d5	osx: ld doesn't support --build-id Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:13:56 +00:00
Jon Turney	7ad7a07c88	configure: Default to gbm=no on osx Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:13:00 +00:00
Andres Rodriguez	bbd00844a2	mesa: remove usage of alloca in externalobjects.c v4 Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. v2: handle malloc errors v3: fix patch v4: initialize texObjs/bufObjs Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com>	2018-02-01 09:48:04 -05:00
Samuel Pitoiset	2ef5ce1198	radv: do not insert shaders in cache when it's disabled When the application doesn't provide its own pipeline cache, the driver uses an in-memory cache but it shouldn't insert any entries when the cache is explicitly disabled by the user. Found while running my experimental pipeline-db tool with a ton of shaders, the memory footprint was just huge, and sometimes the process was even killed... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:40:11 +01:00
Samuel Pitoiset	4922e7f25c	radv: use separate bindings for graphics and compute descriptors The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:09 +01:00
Samuel Pitoiset	cf224014dd	radv: store the bind point when creating descriptors with templates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:07 +01:00
Dave Airlie	7ea15a36fb	r600/eg: make sure we allow vpm bit on other CF ops. the vpm bit wasn't being applied to the push/pop instructions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 13:41:32 +10:00
Timothy Arceri	4d982ae2c7	gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM This has been unused since `100796c15c`. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-02-01 13:56:34 +11:00
Dave Airlie	0491d5425f	r600/sb: just add some missing debug bits Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 12:06:40 +10:00
Dave Airlie	df155a73f4	r600: fix buffer resinfo opcode translation. The vtx operations never got translated, so things worked by 0 being equal to 0, translate them so we can use the proper buffer resinfo code. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 11:59:55 +10:00
Timothy Arceri	679e4e7a46	st/glsl_to_nir: add more nir opts to st_nir_opts() All of the current gallium nir driver use these optimisations but they do so in their backends. Having these called in the backend only can cause a number of problems: - Shader compile times are greater because the opts need to do significant passes over all shader variants. - The shader cache is partially defeated due to the significant optimisation passes over variants. - We might miss out on nir linking optimisation opportunities. Adding these passes to st_nir_opts() alleviates these problems. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-01 09:42:57 +11:00
Andres Gomez	5a7aba2e0a	i965: perform 2 uploads with dual slot 64PASSTHRU formats on gen<8 The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2. In `61a8a55f55` ("i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs"), for gen8+ we needed to determine if the attrib was dual slot to emit 128 or 256-bit, independently of the VAO size. Similarly, for gen < 8 we also need to determine whether the attrib is dual slot to force the emission of 256-bits through 2 uploads. Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this second upload to fill these unspecified components with zeros, as we also do for gen8+. Fixes the following test on Haswell: KHR-GL46.vertex_attrib_binding.basic-inputL-case1 v2: Added more inline comments to explain why we are using ISL_FORMAT_R32_FLOAT and its consequences, as requested by Alejandro and Antía. Fixes: `75968a668e` ("i965/gen7: expose OpenGL 4.2 on Haswell when supported") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006 Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Antia Puentes <apuentes@igalia.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-31 22:50:06 +02:00
Kenneth Graunke	ab1f2e6bc4	i965: Make texture validation code use texture objects, not units. This requires moving the _MaxLevel handling up to the callers. Another user of intel_finalize_mipmap_tree will be added later that depends on _MaxLevel not being modified. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-31 11:33:52 -08:00
Kenneth Graunke	0a2e878c69	i965: Pass tObj into intel_update_max_level instead of intel_obj. We want both anyway, but this will simplify things a tiny bit in an upcoming patch. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-31 11:33:52 -08:00
Kenneth Graunke	876f1537e9	i965: Delete more misleading comments. brw_bo_wait_rendering used to take a brw_context pointer for perf_debug messages about stalls. Chris eliminated that in `833108ac14`. This message about passing NULL to avoid those warnings is no longer relevant, and just adds confusion. So, drop it.	2018-01-31 11:33:52 -08:00
Andres Rodriguez	8996610acb	docs/features: mark EXT_semaphore(_fd) as DONE v2 Support for these extensions is available in radeonsi. v2: also updated relnotes Signed-off-by: Andres Rodriguez <andresx7@gmail.com>	2018-01-31 12:31:40 -05:00
Brian Paul	d32c22a13f	st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	3b3d8275d8	st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	1882ec4ff7	svga: use opcode local var to simplify some code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	338c35c427	svga: s/unsigned/VGPU10_OPCODE_TYPE/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Samuel Pitoiset	a097a6f519	radv: do not dump meta shader stats That's quite useless and that pollutes the output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-31 14:10:26 +01:00
Samuel Pitoiset	26cc3e74b9	ac/nir: fix emission of ffract for 64-bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-31 14:10:24 +01:00
Eric Engestrom	2f0db33527	meson: dedup gallium-xa logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	fa5d616bf9	meson: dedup gallium-va logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	86168ed31c	meson: dedup gallium-omx logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	724916c8a8	meson: dedup gallium-xvmc logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	992af0a4b8	meson: dedup gallium-vdpau logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Antia Puentes	0da434fb47	Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format" This reverts commit `513c2263cb`. _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStorage, for which the OpenGL 4.6 specification says: "An INVALID_ENUM error is generated if internalformat is not one of the color-renderable, depth-renderable, or stencil-renderable formats defined in section 9.4." RGB9_E5 format is not renderable, as stated in the same specification (Bug 9338). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794 Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-01-31 12:06:00 +01:00
Michel Dänzer	1cf1bf32ef	winsys/radeon: Compute is_displayable in surf_drm_to_winsys It was always 0, breaking (at least) DRI3 with Xwayland. Bugzilla: https://bugs.freedesktop.org/104306 Fixes: `5f2073be32` ("ac/surface: add ac_surface::is_displayable") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:53:58 +01:00
Matthew Nicholls	ef272b161e	radv: remove predication on cache flushes This can lead to a situation where cache flushes could get conditionally disabled while still clearing the flush_bits, and thus flushes due to application pipeline barriers may never get executed. Fixes: `a6c2001ace` (radv: add support for cmd predication.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 13:37:18 +10:00
Brian Paul	1ea9efd2f8	mesa: fix broken glGet*(GL_POLYGON_MODE) query This reverts part of the patch which introduced the GLenum16 change. Fixes a conform regression found by Roland. Fixes: `f96a69f916` ("mesa: replace GLenum with GLenum16 in common structures (v4)") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-30 20:32:37 -07:00
Dave Airlie	49c61d8b84	virgl: also remove dimension on indirect. This fixes some dEQP tests that generated bad shaders. Fixes: `b6f6ead19` (virgl: drop const dimensions on first block.) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 12:24:11 +10:00
Marek Olšák	fdf01d0244	radeonsi: remove DBG_PRECOMPILE it's useless and shader-db stats only report the main shader part. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	148b48646b	radeonsi: print shader-db stats for main parts, not final binaries This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	c02c9ee550	radeonsi: move max_simd_waves computation into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	a7311cd7ee	mesa: fix glGet MAX_VERTEX_ATTRIB queries Broken by `f96a69f916` Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-31 03:21:20 +01:00
Jason Ekstrand	97938dac36	anv/cmd_buffer: Re-emit the pipeline at every subpass If we ever hit this edge-case, it can theoretically cause problems for CNL because we could end up changing render targets without re-emitting 3DSTATE_MULTISAMPLE which is part of the pipeline. Just get rid of the edge case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-30 17:16:33 -08:00
Ian Romanick	ee63933a73	nir: Distribute binary operations with constants into bcsel This was specifically designed to simplify 1+mix(0, a-1, condition) to mix(1, a, condition) by pushing the 1+ inside. Skylake, Broadwell, and Haswell had similar results. Skylake shown. total instructions in shared programs: 14521753 -> 14521716 (<.01%) instructions in affected programs: 10619 -> 10582 (-0.35%) helped: 51 HURT: 14 helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1 helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95% HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1 HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32% 95% mean confidence interval for instructions value: -1.31 0.17 95% mean confidence interval for instructions %-change: -0.80% -0.27% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533000205 -> 533003533 (<.01%) cycles in affected programs: 110610 -> 113938 (3.01%) helped: 43 HURT: 28 helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16 helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67% HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14 HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62% 95% mean confidence interval for cycles value: -43.81 137.56 95% mean confidence interval for cycles %-change: -1.47% 3.60% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10018840 -> 10018713 (<.01%) instructions in affected programs: 9431 -> 9304 (-1.35%) helped: 51 HURT: 3 helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1 helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81% HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1 HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22% 95% mean confidence interval for instructions value: -5.36 0.66 95% mean confidence interval for instructions %-change: -1.66% -0.46% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 87571944 -> 87572785 (<.01%) cycles in affected programs: 117234 -> 118075 (0.72%) helped: 42 HURT: 23 helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30 helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74% HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10 HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61% 95% mean confidence interval for cycles value: -61.05 86.93 95% mean confidence interval for cycles %-change: -3.47% -0.33% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10542933 -> 10542844 (<.01%) instructions in affected programs: 11487 -> 11398 (-0.77%) helped: 52 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72% HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1 HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22% 95% mean confidence interval for instructions value: -3.17 -0.07 95% mean confidence interval for instructions %-change: -1.13% -0.52% Instructions are helped. total cycles in shared programs: 146098397 -> 146097094 (<.01%) cycles in affected programs: 128140 -> 126837 (-1.02%) helped: 47 HURT: 8 helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18 helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95% HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9 HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34% 95% mean confidence interval for cycles value: -37.49 -9.90 95% mean confidence interval for cycles %-change: -1.22% -0.71% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886711 -> 7886509 (<.01%) instructions in affected programs: 10425 -> 10223 (-1.94%) helped: 50 HURT: 2 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89% 95% mean confidence interval for instructions value: -8.05 0.28 95% mean confidence interval for instructions %-change: -1.83% -0.26% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178115324 -> 178114612 (<.01%) cycles in affected programs: 765726 -> 765014 (-0.09%) helped: 39 HURT: 1 helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8 helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -32.07 -3.53 95% mean confidence interval for cycles %-change: -0.86% 0.10% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857762 -> 4857661 (<.01%) instructions in affected programs: 5523 -> 5422 (-1.83%) helped: 25 HURT: 1 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86% 95% mean confidence interval for instructions value: -9.99 2.22 95% mean confidence interval for instructions %-change: -2.01% 0.08% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 122179674 -> 122179194 (<.01%) cycles in affected programs: 530162 -> 529682 (-0.09%) helped: 22 HURT: 1 helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7 helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -46.56 4.82 95% mean confidence interval for cycles %-change: -1.20% 0.36% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:15 -08:00
Ian Romanick	03fb13f646	nir: Rearrange logic op-compounded integer compares Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14521769 -> 14521753 (<.01%) instructions in affected programs: 8782 -> 8766 (-0.18%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.23% -0.16% Instructions are helped. total cycles in shared programs: 533000376 -> 533000205 (<.01%) cycles in affected programs: 447035 -> 446864 (-0.04%) helped: 9 HURT: 9 helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40 helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09% HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10 HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12% 95% mean confidence interval for cycles value: -25.07 6.07 95% mean confidence interval for cycles %-change: -0.08% 0.27% Inconclusive result (value mean confidence interval includes 0). No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	053be9f020	nir: Rearrange and-compounded float compares If both comparisons are used as sources for instructions other than the iand, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. It is interesting to me that on Iron Lake only 81 shaders have instruction counts changed, but 726 shaders have cycle counts changed. shader-db results: Skylake total instructions in shared programs: 14525728 -> 14521017 (-0.03%) instructions in affected programs: 1164726 -> 1160015 (-0.40%) helped: 1692 HURT: 5 helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2 helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1 HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86% 95% mean confidence interval for instructions value: -3.52 -2.03 95% mean confidence interval for instructions %-change: -0.86% -0.74% Instructions are helped. total cycles in shared programs: 533115449 -> 532991404 (-0.02%) cycles in affected programs: 119401803 -> 119277758 (-0.10%) helped: 1145 HURT: 467 helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18 helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42% HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15 HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39% 95% mean confidence interval for cycles value: -122.16 -31.74 95% mean confidence interval for cycles %-change: -0.94% -0.57% Cycles are helped. 
total spills in shared programs: 9597 -> 9534 (-0.66%) spills in affected programs: 403 -> 340 (-15.63%) helped: 1 HURT: 1 total fills in shared programs: 13904 -> 13790 (-0.82%) fills in affected programs: 1627 -> 1513 (-7.01%) helped: 2 HURT: 1 LOST: 0 GAINED: 2 Broadwell total instructions in shared programs: 14816966 -> 14812590 (-0.03%) instructions in affected programs: 1499885 -> 1495509 (-0.29%) helped: 1672 HURT: 15 helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8 HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53% 95% mean confidence interval for instructions value: -3.14 -2.05 95% mean confidence interval for instructions %-change: -0.85% -0.73% Instructions are helped. total cycles in shared programs: 559353622 -> 559345595 (<.01%) cycles in affected programs: 139893703 -> 139885676 (<.01%) helped: 921 HURT: 697 helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18 helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87% HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38 HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14% 95% mean confidence interval for cycles value: -59.64 49.72 95% mean confidence interval for cycles %-change: -1.02% -0.66% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78902 -> 78861 (-0.05%) spills in affected programs: 2418 -> 2377 (-1.70%) helped: 1 HURT: 11 total fills in shared programs: 83782 -> 83678 (-0.12%) fills in affected programs: 3515 -> 3411 (-2.96%) helped: 2 HURT: 11 LOST: 0 GAINED: 5 Haswell and Ivy Bridge had similar results. Haswell shown. 
total instructions in shared programs: 9033898 -> 9032010 (-0.02%) instructions in affected programs: 308064 -> 306176 (-0.61%) helped: 921 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1 helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for instructions value: -2.21 -1.87 95% mean confidence interval for instructions %-change: -0.88% -0.68% Instructions are helped. total cycles in shared programs: 84628949 -> 84620520 (<.01%) cycles in affected programs: 2164913 -> 2156484 (-0.39%) helped: 518 HURT: 359 helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20 helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01% HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8 HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40% 95% mean confidence interval for cycles value: -15.17 -4.05 95% mean confidence interval for cycles %-change: -0.77% -0.32% Cycles are helped. LOST: 0 GAINED: 4 Sandy Bridge total instructions in shared programs: 10544860 -> 10542933 (-0.02%) instructions in affected programs: 360019 -> 358092 (-0.54%) helped: 931 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33% 95% mean confidence interval for instructions value: -2.23 -1.89 95% mean confidence interval for instructions %-change: -0.76% -0.58% Instructions are helped. 
total cycles in shared programs: 146106820 -> 146098397 (<.01%) cycles in affected programs: 3435047 -> 3426624 (-0.25%) helped: 572 HURT: 329 helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15 helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33% HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6 HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19% 95% mean confidence interval for cycles value: -16.85 -1.85 95% mean confidence interval for cycles %-change: -0.39% -0.01% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7886925 -> 7886711 (<.01%) instructions in affected programs: 25763 -> 25549 (-0.83%) helped: 75 HURT: 6 helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1 helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53% HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86% 95% mean confidence interval for instructions value: -3.69 -1.60 95% mean confidence interval for instructions %-change: -2.54% -0.57% Instructions are helped. total cycles in shared programs: 178116888 -> 178115324 (<.01%) cycles in affected programs: 5858790 -> 5857226 (-0.03%) helped: 484 HURT: 242 helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03% 95% mean confidence interval for cycles value: -2.76 -1.55 95% mean confidence interval for cycles %-change: -0.12% 0.01% Inconclusive result (%-change mean confidence interval includes 0). 
GM45 total instructions in shared programs: 4857870 -> 4857762 (<.01%) instructions in affected programs: 13994 -> 13886 (-0.77%) helped: 39 HURT: 5 helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48% HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86% 95% mean confidence interval for instructions value: -3.86 -1.05 95% mean confidence interval for instructions %-change: -2.61% 0.04% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 122180744 -> 122179674 (<.01%) cycles in affected programs: 3686646 -> 3685576 (-0.03%) helped: 273 HURT: 141 helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02% 95% mean confidence interval for cycles value: -3.42 -1.75 95% mean confidence interval for cycles %-change: -0.15% 0.03% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	821e7a4d32	nir: Separate a weird compare with zero into two compares with zero min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
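The rewrite named in 821e7a4d32 rests on the observation that the smaller of two values is non-negative exactly when both values are. A minimal Python sketch that checks this for a handful of finite inputs; the function names are hypothetical and are not Mesa code, and NaN behaviour is deliberately left out of scope:

```python
import itertools

def compare_before(a, b, c, d):
    # Original form: one min followed by a single compare with zero.
    return min(a + b, c + d) >= 0

def compare_after(a, b, c, d):
    # Rewritten form: two independent compares with zero joined by a logical and.
    return (a + b >= 0) and (c + d >= 0)

# Finite sample values only; NaN is intentionally not covered by this sketch.
samples = [-2.5, -1.0, 0.0, 0.5, 3.0]
mismatches = [
    args for args in itertools.product(samples, repeat=4)
    if compare_before(*args) != compare_after(*args)
]
print(len(mismatches))
```

The two forms agree on every sampled combination, which is what makes the split safe for ordinary finite operands.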
Ian Romanick	68420d8322	nir: Simplify min and max of b2f v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 -> 14525913 (<.01%) instructions in affected programs: 4613 -> 4505 (-2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 -> 533118403 (<.01%) cycles in affected programs: 34334 -> 34027 (-0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
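The commit message for 68420d8322 does not spell out the exact rewrite. One plausible reading, stated here as an assumption, is that because b2f only ever yields 0.0 or 1.0, a min of two such values behaves like a logical and of the boolean sources, and a max behaves like a logical or. A small Python sketch of that identity (the `b2f` helper below is a stand-in written for illustration, not the NIR opcode itself):

```python
import itertools

def b2f(flag):
    # Stand-in for a boolean-to-float conversion: True -> 1.0, False -> 0.0.
    return 1.0 if flag else 0.0

def min_matches_and(a, b):
    # min of two converted booleans versus converting the logical and.
    return min(b2f(a), b2f(b)) == b2f(a and b)

def max_matches_or(a, b):
    # max of two converted booleans versus converting the logical or.
    return max(b2f(a), b2f(b)) == b2f(a or b)

# Exhaustive check over the four boolean input pairs.
all_hold = all(
    min_matches_and(a, b) and max_matches_or(a, b)
    for a, b in itertools.product([False, True], repeat=2)
)
print(all_hold)
```

Under that reading, two conversions plus a min or max collapse into one boolean operation plus one conversion.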
Ian Romanick	d8d18516b0	nir: Undo possible damage caused by rearranging or-compounded float compares shader-db results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 -> 14525836 (<.01%) instructions in affected programs: 1964 -> 1902 (-3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: -9.46 0.60 95% mean confidence interval for instructions %-change: -3.97% -0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 -> 533115756 (<.01%) cycles in affected programs: 96061 -> 91925 (-4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: -379.43 -211.43 95% mean confidence interval for cycles %-change: -4.84% -3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 -> 9033898 (<.01%) instructions in affected programs: 535 -> 485 (-9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 -> 84628949 (<.01%) cycles in affected programs: 63197 -> 60744 (-3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: -253.40 -73.67 95% mean confidence interval for cycles %-change: -4.24% -2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	3941cba0f7	nir: Be more conservative about rearranging or-compounded compares If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. shader-db results: Skylake total instructions in shared programs: 14526147 -> 14525898 (<.01%) instructions in affected programs: 70239 -> 69990 (-0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.86 -2.02 95% mean confidence interval for instructions %-change: -0.46% -0.31% Instructions are helped. total cycles in shared programs: 533120531 -> 533119892 (<.01%) cycles in affected programs: 994875 -> 994236 (-0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: -19.44 6.91 95% mean confidence interval for cycles %-change: -0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 -> 14815787 (<.01%) instructions in affected programs: 64658 -> 64440 (-0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.62 -1.87 95% mean confidence interval for instructions %-change: -0.45% -0.30% Instructions are helped. 
total cycles in shared programs: 559340386 -> 559339907 (<.01%) cycles in affected programs: 1090491 -> 1090012 (-0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: -15.94 5.75 95% mean confidence interval for cycles %-change: -0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 -> 9033948 (<.01%) instructions in affected programs: 24096 -> 23938 (-0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: -4.71 -3.60 95% mean confidence interval for instructions %-change: -0.84% -0.58% Instructions are helped. total cycles in shared programs: 84631628 -> 84631402 (<.01%) cycles in affected programs: 148674 -> 148448 (-0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: -19.42 3.28 95% mean confidence interval for cycles %-change: -0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 -> 10015293 (<.01%) instructions in affected programs: 27701 -> 27538 (-0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: -4.87 -3.71 95% mean confidence interval for instructions %-change: -0.82% -0.51% Instructions are helped. 
total cycles in shared programs: 87524771 -> 87524569 (<.01%) cycles in affected programs: 112324 -> 112122 (-0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -29.14 6.69 95% mean confidence interval for cycles %-change: -0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 -> 10545465 (<.01%) instructions in affected programs: 37198 -> 37008 (-0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: -5.14 -3.91 95% mean confidence interval for instructions %-change: -0.68% -0.47% Instructions are helped. total cycles in shared programs: 146113059 -> 146112427 (<.01%) cycles in affected programs: 423514 -> 422882 (-0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: -26.03 -4.07 95% mean confidence interval for cycles %-change: -0.43% -0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 -> 7886925 (<.01%) instructions in affected programs: 1340 -> 1306 (-2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: -20.44 3.44 95% mean confidence interval for instructions %-change: -5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 178116996 -> 178116888 (<.01%) cycles in affected programs: 6262 -> 6154 (-1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -93.27 39.27 95% mean confidence interval for cycles %-change: -5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 -> 4857870 (<.01%) instructions in affected programs: 674 -> 657 (-2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 -> 122180744 (<.01%) cycles in affected programs: 3764 -> 3692 (-1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
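The reasoning in 3941cba0f7 about when the or-compounded rearrangement is detrimental can be illustrated with a simple operation count. The sketch below assumes the rearrangement turns two compares joined by an ior into one fmin or fmax feeding one compare, and that every operation costs the same; both are simplifications made for illustration, and the function names are hypothetical:

```python
def ops_before():
    # Two compares plus the ior that combines them.
    return 2 + 1

def ops_after(compares_with_other_uses):
    # One fmin/fmax plus one compare, plus every original compare that
    # must be kept alive because another instruction still reads it.
    if not 0 <= compares_with_other_uses <= 2:
        raise ValueError("there are only two original compares")
    return 1 + 1 + compares_with_other_uses

for kept in range(3):
    print(kept, ops_before(), ops_after(kept))
```

In this model the rearrangement saves an operation when neither compare has other users, breaks even when one does, and costs an extra operation when both do, which matches the case the commit message calls detrimental.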
Ian Romanick	cfc0d34802	nir: See through an fneg to apply existing optimizations Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. 
total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. 
LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. 
total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Timothy Arceri	283e25102b	st/glsl_to_nir: disable io lowering and array splitting of fs inputs We need this to be able to support the interpolateAt builtins in a sane way. It also leads to the generation of more optimal code. The lowering and splitting is made conditional on lower_all_io_to_temps because vc4 and freedreno both expect these passes to be enabled and neither supports glsl 400, so they don't need to deal with the interpolateAt builtins. We leave the other stages alone for now to avoid regressions. Ideally we could remove the stage checks and just set the nir options correctly for each stage. However, all gallium drivers currently just return the same nir compiler options for all stages, and it's probably more trouble than it's worth to change this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	9a2e085680	nir: add lower_all_io_to_temps flag This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	3218756262	nir/st_glsl_to_nir: add param to disable splitting of inputs We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnecessary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	93e213f91f	st/glsl_to_nir: copy nir compiler options to context Various nir passes may expect this to be here as does the nir serialisation pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	dd6d6c63a7	radeonsi/nir: add input support for arrays that have not been copied to temps and split We need this to be able to support the interpolateAt builtins in a sane way. It also leads to the generation of more optimal code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	d185190222	ac/radeonsi: add lookup_interp_param and load_sample_position to the abi This will enable the interpolateAt builtins to work on the radeonsi nir backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	97058168a4	radeonsi/nir: add prim_mask to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	3ff012f142	radeonsi/nir: adjust load_sample_position() to be shared between backends With this interface change it can be shared between the tgsi and nir backends. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	3a47b138e3	radeonsi/nir: add si_nir_lookup_interp_param() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	b8808848ce	ac/nir_to_llvm: move some interp defines to the header These will be used in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	fea6da9aaa	radeonsi/nir: move the interpolation qualifier scanning We need to collect this when scanning over the instructions rather than when scanning over the inputs; otherwise we might get conflicting values for inputs that are used by the interpolateAt* builtins. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	580f1aa247	radeonsi/nir: add interpolate at intrinsics to scan_instruction() V2: use the uses__opcode_interp_ flags Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Bas Nieuwenhuizen	882eff4d20	radv: Merge raster state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen	69364f1c34	radv: Move gs state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen	e4e060d135	radv: Split out cliprect rule generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen	acbaef3005	radv: Merge VGT_GS_MODE computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen	4ae6a8b0cd	radv: Split out processing the vertex input state. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:41 +01:00
Bas Nieuwenhuizen	9062b1c241	radv: Move tessellation state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen	4aa1cb4e90	radv: Move blend state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen	0f72f0eacb	radv: Split out generating VGT_SHADER_STAGES_EN. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen	694c34314b	radv: Split out the ia_multi_vgt_param precomputation. Also move everything into a struct and return the struct from the helper function, so it is clear in the caller which part of the pipeline gets modified. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen	0bea0851aa	radv: Split out db_shader_control computation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen	5dce47ae6d	radv: Compute shader_z_format when emitting it. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen	df2e7ab0db	radv: Merge depth stencil state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen	d5a0af84ec	radv: Merge ps_input_cntl computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen	e2bf18030d	radv: Merge vtx_reuse_depth computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen	c80747b32c	radv: Merge vs state computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen	c4191cf944	radv: Merge binning state generation with pm4 emission. We don't need the pipeline state struct anymore. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen	6f1a3f081e	radv: Constify some pipeline helpers. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen	f0c9ef410a	radv: Add PM4 pregeneration for compute pipelines. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:34 +01:00
Bas Nieuwenhuizen	beeab44190	radv: Record a PM4 sequence for graphics pipeline switches. This gives about 2% performance improvement on dota2 for me. This is mostly a mechanical copy and replacement, but at bind time we still do: 1) Some stuff that is only based on num_samples changes. 2) Some command buffer state setting. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen	7c366bc152	radv: Determine unneeded dynamic states. Which avoids setting or emitting them. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:17 +01:00
Andres Rodriguez	0a89784bcc	mesa: check for invalid index on UUID glGet queries This fixes the piglit test: spec/ext_semaphore/api-errors/usigned-byte-i-v-bad-value Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	566ed727a4	mesa: fix glGet for ext_external_objects parameters This allows the client to actually query the enums specified in the ext_external_objects spec. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	0ebd3cc863	mesa: fix error codes for importing memory/semaphore FDs This fixes the following piglit tests: spec/ext_semaphore_fd/api-errors/import-semaphore-fd-bad-enum spec/ext_memory_object_fd/api-errors/import-memory-fd-bad-enum Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	50b06cbc10	radeonsi: fix fence_server_sync() holding up extra work v2 When calling si_fence_server_sync(), the wait operation is associated with the next kernel submission. Therefore, any unflushed work submitted previous to fence_server_sync() will also be affected by the wait. To avoid adding the dependency to the unflushed work, we flush before emitting the fence dependency. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	e0f16ee666	radeonsi: implement semaphore_server_signal v2 Syncobj based waits or signals only happen at submission boundaries. In order to guarantee that the requested signal event will occur when the state tracker requested it, we must issue a flush. v2: s/fence/semaphore for pipe objects Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	5b07b06d6b	radeonsi: add support for importing PIPE_FD_TYPE_SYNCOBJ semaphores Hook up importing semaphores of type PIPE_FD_TYPE_SYNCOBJ Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	cc9762d74d	winsys/amdgpu: add support for syncobj signaling v3 Add the ability to signal a syncobj when a cs completes execution. v2: corresponding changes for gallium fence->semaphore rename v3: s/semaphore/fence for pipe objects Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	29b9bd0539	mesa/st: add support for semaphore object signal/wait v4 Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject v2: - corresponding changes for gallium fence->semaphore rename - flushing moved to mesa/main v3: s/semaphore/fence for pipe objects v4: add bitmap flushing Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	89b52891fd	mesa: add support for semaphore object signal/wait v3 Memory synchronization is left for a future patch. v2: flush vertices/bitmaps moved to mesa/main v3: removed spaces before/after braces Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	260f7fcc46	mesa: add semaphore parameter stub v2 EXT_semaphore and EXT_semaphore_fd define no pnames. Therefore there isn't much to do besides determining the correct error code. v2: removed useless return Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	382067f065	mesa/st: add support for semaphore object create/import/delete v3 Add basic semaphore object operations. v2: s/semaphore/fence for pipe objects v3: added missing license headers Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	67d5d08682	mesa: add support for semaphore object creation/import/delete v3 Used by EXT_semaphore and EXT_semaphore_fd v2: Removed unnecessary dummy callback initialization v3: Fixed attempting to free the DummySemaphoreObject Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	8e635f7d65	mesa/st: introduce EXT_semaphore and EXT_semaphore_fd v2 Guarded by PIPE_CAP_SEMAPHORE_SIGNAL v2: corresponding changes for PIPE_CAP_SEMAPHORE_SIGNAL rename Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	fde1afc495	u_threaded_context: add support for fence_server_signal v2 v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	d34c2cf3e6	gallium: add fence_server_signal() v2 Calling this function will emit a fence signal operation into the GPU's command stream. v2: documentation typos Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	458f89be78	gallium: introduce PIPE_FD_TYPE_SYNCOBJ Denotes that a fd is backed by a syncobj. For example, radv shared semaphores. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	2ab405d254	gallium: introduce PIPE_CAP_FENCE_SIGNAL v2 Protects semaphore signaling functionality required by GL_EXT_semaphore. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	585daa2378	gallium: add type parameter to create_fence_fd An fd can potentially have different types of objects backing it. Specifying the type helps us make sure we treat the FD correctly. This is in preparation to allow importing syncobj fence FDs in addition to native sync FDs. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Dave Airlie	16dd0eb517	ac/llvm: bump the number of results to 8. This function can get access for a 64-bit dvec4, which means we have to load 8 components. This fixes: R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 05:37:16 +10:00
Dave Airlie	8d633f067b	r600/sb: insert the else clause when we might depart from a loop If there is a break inside the else clause and this means we are breaking from a loop, the loop finalise will want to insert the LOOP_BREAK/CONTINUE instruction, however if we don't emit the else there is nowhere for these to end up, so they will end up in the wrong place. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442 Tested-By: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 04:47:29 +10:00
Brian Paul	1a9aa69ae8	mesa: remove invalid assertion in _mesa_enable_vertex_array_attrib() The meta module passes some 0-based attrib values. Should fix Piglit regressions reported by Mark Janes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104863 Fixes: `4ab7e03e1f` ("mesa: add an assertion in _mesa_enable_vertex_array_attrib()") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-30 11:02:43 -07:00
Brian Paul	efa0993eaf	mesa: use gl_vert_attrib enum type in more places Slightly better readability. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 11:02:43 -07:00
Brian Paul	f892e332a8	mesa: rename some 'client' array functions A long time ago gl_vertex_array was gl_client_array. Update some function names to be consistent. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	d2d9d090e5	mesa: s/src/attribs/ in _mesa_update_client_array() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	e863541e43	mesa: check/assert array index in _mesa_bind_vertex_buffer() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	fcee2cc711	mesa: trivial comment typo fix in arrayobj.c Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	4ab7e03e1f	mesa: add an assertion in _mesa_enable_vertex_array_attrib() Some of the enable/disable vertex array functions take a zero-based generic index, while others take a VERT_ATTRIB_GENERIC0-based value. Add an assertion to clarify that in one place. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	7f12791cc6	mesa: rename some vars in client_state() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	06621e8a0d	mesa: Care for differences in fog mode only if fog is consumed. In creating fixed function vertex shader hash keys do only care for producing the varying output if fog is enabled and the varying is consumed in the fragment stage. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	6395a0ecf2	mesa: Reduce ffvertex_prog state_key to 36 bytes. Using lower alignment restrictions for the state key fields finally yields to a smaller hashing state key. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	b4216b588e	mesa: Remove unused ffvertex_prog texunit_really_enabled. Remove set but not read field from the state key used for hashing fixed function vertex shaders. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	1169791c18	mesa: Remove unused bit in ffvertex_prog state_key. Remove set but not read field from the state key used for hashing fixed function vertex shaders. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	6726d16098	mesa: texgen_enabled is only 1 bit. For the state key for hashing fixed function vertex shaders, the texgen_enabled field requires only a single bit. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	d6b0ad51ec	mesa: Encode fog modes in a 2 bit field. For the state key for hashing fixed function vertex shaders, encode the different fog modes, including if fog is generally enabled or not, into a 2 bit field. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	63e845d3cc	mesa: Move seperate_specular into the lighting section. For the state key for hashing fixed function vertex shaders, the information is only evaluated if lighting is generally switched on. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Mathias Fröhlich	11e665d434	mesa: Get the point size array state from varying_vp_inputs. For the state key for hashing fixed function vertex shaders, The varying_vp_inputs bitmask already contains the point size array enabled information. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Mathias Fröhlich	bc5c54cadf	mesa: Remove unused gl_fog_attrib::_Scale. The patch removes a variable that is only written to. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Iago Toral Quiroga	99b57daf4a	anv/pipeline: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_vert dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_frag Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-30 08:10:29 +01:00
Tapani Pälli	6316c2ecbd	i965: move disk cache from brw_context to intel_screen Now every context refers to same disk_cache instance in screen. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-30 08:42:51 +02:00
Elie Tournier	6f8518e068	mesa: Correctly print glTexImage dimensions texture_format_error_check_gles() displays error like "glTexImage%dD". This patch just replace the %d by the correct dimension. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-30 07:48:56 +02:00
Brian Paul	d5f42f96e1	mesa: shrink size of gl_array_attributes (v2) Inspired by Marek's earlier patch, but even smaller. Sort fields from largest to smallest. Use bitfields for more fields (sometimes with an extra bit for MSVC). Reduce Stride field to GLshort. Note that some fields cannot be bitfields because they're accessed via pointers (such as for glEnableClientState(GL_VERTEX_ARRAY) to set the Enabled field). Reduces size from 48 to 24 bytes. Also reduces size of gl_vertex_array_object from 3632 to 2864 bytes. And add some assertions in init_array(). v2: use s/GLuint/unsigned/, improve commit comments. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:16:50 -07:00
Brian Paul	79cafa0df3	mesa: shrink gl_vertex_array Inspired by Marek's earlier patch, but goes a little further. Sort fields from largest to smallest. Use bitfields. Reduced from 48 bytes to 32. Also reduces size of gl_vertex_array_object from 4144 to 3632 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:15:52 -07:00
Marek Olšák	f96a69f916	mesa: replace GLenum with GLenum16 in common structures (v4) v2: - fix glGet* - also use GLenum16 for DrawBuffers v3: - rebase to top of tree (BrianP) and incorporate Ian's suggestions v4: - fix a GLenum16 bug in VBO/save code, add some STATIC_ASSERT()s gl_context = 152432 -> 136840 bytes vbo_context = 22096 -> 20608 bytes Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-29 21:15:52 -07:00
Brian Paul	94843e6056	mesa: fix incorrect size/error test in _mesa_GetUnsignedBytevEXT() get_value_size() returns -1 for an error. The similar check in _mesa_GetUnsignedBytei_vEXT() is correct. Found by chance. There are apparently no Piglit tests which exercise glGetUnsignedBytei_vEXT() or glGetUnsignedBytevEXT(). Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:15:52 -07:00
Neha Bhende	e4ca1d6456	svga: Check rasterization state object before checking poly_stipple_enable Sometimes rasterization state object could be empty. This is causing segfault on hw8,9,10 for some traces. This patch fixes enemy_territory_quake_wars_high, enemy_territory_quake_wars_low, etqw-demo, lightsmark2008, quake1 glretrace crashes on hw 8,9,10. Tested with mtt-glretrace and mtt-piglit. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Neha Bhende	d4a5e14fae	svga: Adjust alpha for S3TC_DXT1_EXT RGB formats According to spec, S3TC_DXT1_EXT RGB formats are supposed to be opaque. Corresponding svga formats are not handling it so explicitly setting it to 1.0. This fixes piglit test spec@ext_texture_compression_s3tc@s3tc-targeted Note: This test is testcase for freedesktop bug 100925 Tested with mtt-piglit and mtt-glretrace on 8,9,10,11 and 15 Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Gert Wollny	6a7d1ca2c4	mesa/st/glsl_to_tgsi: Mark first write as unconditional when appropriate In the register lifetime estimation if the first write is unconditional or conditional but not within a loop then this is an unconditional dominant write in the sense of register life time estimation. Add a test case and record the write accordingly. Fixes: `807e2539e5` ("mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104803 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Roland Scheidegger	3c7aa242f5	mesa: skip validation of legality of size/type queries for format queries The size/type query is always legal (if we made it that far). Removing this causes a difference for GL_TEXTURE_BUFFER - the reason is that these parameters are valid only with GetTexLevelParameter() if gl 3.1 is supported, but not if only ARB_texture_buffer_object is supported. However, while the spec says that these queries return "the same information as querying GetTexLevelParameter" I believe we're not expected to return just zeros here. By definition, these pnames are always valid (unlike for the GetTexLevelParameter() function which would return an error without GL 3.1). The spec is a bit inconsistent there and open to interpretation - while mentioning the "same information as querying GetTexLevelParameter" is returned, it also mentions that 0 is returned for size/type if the target/format is not supported - implying correct results to be returned if it is supported, regardless that GetTexLevelParameter would return an error. (Also, the bit about this returning the same as GetTexLevelParameter also includes querying stencil type, which isn't even possible with GetTexLevelParameter.) This breaks some piglit arb_internalformat_query2 tests (which I believe to be wrong). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Roland Scheidegger	21fe02d1d3	mesa: restrict formats being supported by target type for formatquery The code just considered all formats as being supported if they were either a valid fbo or texture format. This was quite awkward since then the query would return "supported" for e.g. GL_RGB9E5 or compressed formats and target RENDERBUFFER (albeit the driver could still refuse it in theory). However, when then querying for instance the internalformat sizes, it would just return 0 (due to the checks being more strict there). It was also a problem for texture buffer targets, which have a more restricted list of formats which are allowed (and again, it would return supported but then querying sizes would return 0). So only take validation of formats into account which make sense for a given target. Can also toss out some special checks for rgb9e5 later, since we'd never get there if it wasn't supported in the first place. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Roland Scheidegger	272e7e1bd5	mesa: (trivial) add TODO comment for default results for internal queries	2018-01-30 01:28:47 +01:00
Roland Scheidegger	09dc4f9012	mesa: remove misleading gles checks for formatquery Testing for gles there is just confusing - this is about target being supported, if it was valid at all was already determined earlier (in _legal_parameters). It didn't make sense at all in any case, since it would only have said false there for gles for 2d but not 2d arrays etc. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Rafael Antognolli	e7ecc5e160	i965: Emit PIPE_CONTROL with ISP bit on older platforms. Emit it on all platforms since gen7. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-29 14:52:07 -08:00
Rafael Antognolli	fa21ddf7b1	anv/cmd_buffer: Emit PIPE_CONTROL with ISP bit on older platforms. Emit it on all platforms since gen7. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-29 14:52:07 -08:00
Timothy Arceri	2b4afaef1c	st/glsl_to_nir: remove dead io after conversion to nir This fixes an assert in nir_lower_var_copies() for some bioshock shaders where an unused clipdistance array has no size. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:14:36 +11:00
Timothy Arceri	327c1a7fb3	radeonsi/nir: add support vs double inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	44067d6f0d	radeonsi: pass input_idx to declare_nir_input_vs() This makes it consistent with declare_nir_input_fs() and will allow us to support doubles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	cf75ee3ab1	radeonsi: add bitcast_inputs() helper Will be used in a following patch to help support doubles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	96cfd4bd7e	radeonsi/nir: fix num_inputs for doubles in vs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	09cd484d61	nir: partially revert `c2acf97fcc` `c2acf97fcc` changed the use of double_inputs_read to be inconsistent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from `c2acf97fcc` now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	5b8de4bdff	nir: add vs_inputs_dual_locations compiler option Allows nir drivers to either use a single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	f63e05ae9e	compiler: tidy up double_inputs_read uses First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because `c2acf97fcc` changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Dave Airlie	f6cc15dccd	radv/gfx9: fix block compression texture views. (v2) This ports a fix from amdvlk, to fix the sizing for mip levels when block compressed images are viewed using uncompressed views. My original fix didn't power the clamping, but it looks like the clamping is required to stop the sizing going too large. Fixes: dEQP-VK.image.texel_view_compatible.graphic.extendedbc Doesn't crash DOW3 anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-30 07:39:13 +10:00
Bas Nieuwenhuizen	0347a83bbf	radv: Signal fence correctly after sparse binding. It did not signal syncobjs in the fence, and also signalled too early if there was work on the queue already, as we have to wait till that work is done. Fixes: `d27aaae4d2` "radv: Add external fence support." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-29 17:22:58 +01:00
Brian Paul	0d044f7d61	mesa/vbo: replace vbo_draw_method() with _mesa_set_drawing_arrays() The arrays specified by ctx->Array._DrawArrays are used for all vertex drawing via vbo_context::draw_prims(). Different arrays are used for immediate mode, vertex arrays, display lists, etc. Changing from one to another requires updating derived/driver array state. Before, we indirectly specified the arrays with the gl_draw_method values. Now we just directly specify the arrays instead. This is simpler and will allow a subsequent display list optimization. In the future, it might make sense to get rid of ctx->Array._DrawArrays entirely and just pass the arrays as another parameter to vbo_context::draw_prims(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	d9894ede02	vbo: s/[0]/[VERT_ATTRIB_POS]/ in recalculate_input_bindings() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	48a6ab472a	vbo: add new VBO_ATTRIBS_ masks to vbo_attrib.h These will be used in a later patch. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	41cd3ee5a2	vbo: s/VBO_ATTRIB_INDEX/VBO_ATTRIB_COLOR_INDEX/ To match the VERT_ATTRIB_COLOR_INDEX name. Give a name to the previously anonymous enum of VBO_ATTRIB_x values. Update the comment on the enum. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	425da3bbfc	vbo: minor clean-ups in vbo_exec.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	d631ea3a23	vbo: s/_API_NOOP_H/VBO_NOOP_H/ in vbo_noop.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	094a80db4c	vbo: whitespace/formatting fixes in vbo_exec.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	b080fc6199	vbo: move, rename vp_mode enums, get_program_mode() function Instead of NONE/ARB use FF/SHADER. Move the enum declaration to vbo_private.h where it's used. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	35e0ff5bd5	vbo: s/cl/array/ in vbo_context.c I think 'cl' used to mean client array. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Tapani Pälli	d0343bef66	nir: mark unused space in packed_tex_data This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:22 +02:00
Tapani Pälli	b99c88037b	i965: fix disk_cache leak when destroying context ==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205 ==2780== at 0x4C31A1E: calloc (vg_replace_malloc.c:711) ==2780== by 0x13F6467E: util_queue_init (u_queue.c:309) ==2780== by 0x13F5C9F6: disk_cache_create (disk_cache.c:369) ==2780== by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428) ==2780== by 0x13F01E78: brwCreateContext (brw_context.c:1068) Fixes: `1a61a8b9a7` ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-29 08:11:14 +02:00
Tapani Pälli	28db950b51	i965: fix prog_data leak in brw_disk_cache ==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208 ==25481== at 0x4C2FB6B: malloc (vg_replace_malloc.c:299) ==25481== by 0x1404E2CC: ralloc_size (ralloc.c:121) ==25481== by 0x14119F82: read_and_upload (brw_disk_cache.c:176) ==25481== by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271) ==25481== by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597) Fixes: `516d50db31` ("i965: add initial implementation of on disk shader cache") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:03 +02:00
Timothy Arceri	9afc38c799	ac: fix indentation Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Timothy Arceri	03086f86ae	ac: remove unused nir2llvmtype() The last use of this was removed in the previous patch. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Timothy Arceri	fa29a9625e	ac: fix gs load inputs type This fixes the scenario where the input is a struct. With this the Unreal engines Elemental demo now works on radeonsi. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Kai Wasserbäch	0aba967328	ac/nir: call glsl_get_sampler_dim() only once where possible Changes since v1: * Rebased on top of `e68150de26` and `82adf53308`. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-29 10:47:31 +11:00
Dave Airlie	2af66ba7e7	docs/features: add r600 ARB_query_buffer_object support Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:34 +10:00
Dave Airlie	1c9ea24a19	r600: add ARB_query_buffer_object support This uses a different shader than radeonsi, as we can't address non-256 aligned ssbos, which the radeonsi code does. This passes some extra offsets into the shader. It also contains a set of u64 instruction implementation that may or may not be complete (at least the u64div is definitely not something that works outside this use-case). If r600 grows 64-bit integers, it will use the GLSL lowering for divmod. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:28 +10:00
Dave Airlie	a7ec366e50	r600/shader: refactor mul hi/lo instruction emission This just makes it a bit simpler for cayman vs eg Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:17 +10:00
Dave Airlie	e0e23ea69c	r600/eg: construct proper rat mask for image/buffers. If the images/buffer bindings had a gap, this produced the wrong values, this should fix that to generate the correct rat mask for mixes of images/buffers/cbs. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:41:58 +10:00
Jon Turney	4a0bab1d7f	meson: libdrm shouldn't appear in Requires.private: if it wasn't found Otherwise, using pkg-config to retrieve flags will fail, e.g. $ pkg-config gl --cflags Package libdrm was not found in the pkg-config search path. Perhaps you should add the directory containing `libdrm.pc' to the PKG_CONFIG_PATH environment variable Package 'libdrm', required by 'gl', not found Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2018-01-27 18:13:18 +00:00
Eric Anholt	e5a81ac704	broadcom/vc5: Don't forget to get the BO offset when opening a dmabuf. Fixes black display in DRI due to storing to 0x00000000.	2018-01-27 19:40:14 +11:00
Eric Anholt	314e9ee6c4	broadcom/vc5: Enable the driver on V3D 4.2. The changes in 4.2 haven't impacted any of our CL or state struct entries that I can see, so I haven't enabled custom compile for doing 4.2 instead of 4.1.	2018-01-27 19:39:56 +11:00
Eric Anholt	71c7e9bea1	broadcom/vc5: Enable CLIF dumping of V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	91f899cbc1	broadcom/vc5: Update the compiler for V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	f2e41daac5	broadcom/vc5: Update QPU instruction pack/unpack for v4.2. After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because it's used for other barriers in compute.	2018-01-27 19:03:55 +11:00
Eric Anholt	96d3e8f134	broadcom/vc5: Add XML for V3D 4.2.	2018-01-27 18:57:58 +11:00
Eric Anholt	b026063b16	broadcom/vc5: Fix a race between XML codegen build and CLIF build.	2018-01-27 18:57:58 +11:00
Eric Anholt	de60ea4432	Android: Attempt to fix broadcom build after vc5 changes.	2018-01-27 18:03:58 +11:00
Marek Olšák	b633999a4e	ac: rename and move si_const_array into common code Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	e17eb8800f	ac: move address space definitions to common code Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	0d62370bbb	ac: don't use byval LLVM qualifier in shaders shader-db doesn't show any regression and 32-bit pointers with byval are declared as VGPRs for some reason. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	0e40c6a7b7	gallium/radeon: set number of pb_cache buckets = number of heaps Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	175549e0e9	pb_cache: let drivers choose the number of buckets Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	ecfd521502	pb_cache: call os_time_get outside of the loop Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	e553cb5a68	gallium/radeon: simplify radeon_flags_from_heap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Timothy Arceri	041b18cf23	st/shader_cache: restore num_tgsi_tokens when loading from cache Without this we will fail to correctly serialise programs when using glGetProgramBinary() if the program was retrieved from the disk cache rather than freshly compiled. Fixes: `c69b0dd681` "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program" Reviewed-by: Gert Wollny <gw.fossdev@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762	2018-01-27 10:06:16 +11:00
Marek Olšák	17423c993d	winsys/amdgpu: fix assertion failure with UVD and VCE rings Cc: 18.0 <mesa-stable@lists.freedesktop.org>	2018-01-26 23:12:11 +01:00
Brian Paul	ac0e9e343c	mesa: remove MESA_FUNCTION Just use __func__ in the two macros where it was used. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	bacf72a18d	mesa: change gl_link_status enums to uppercase To follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	aff5d9c256	mesa: change gl_compile_status enums to uppercase To follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	d9832f1fc4	mesa: minor comment reformatting, whitespace fixes in mtypes.h Trivial.	2018-01-26 13:52:42 -07:00
Rafael Antognolli	131e871385	i965/gen10: Use CS Stall instead of WriteImmediate. Fixes: `ca19ee33d7` Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 12:02:34 -08:00
Rafael Antognolli	20578f81a6	anv/gen10: Emit CS stall and mark push constants dirty. I got reviews and fixed the patches locally, but ended up merging the ones that I sent originally to the list. This patch fixes those mistakes. Fixes: `78c125af39` Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 11:59:17 -08:00
Rafael Antognolli	bcfd78e448	i965/gen10: Re-enable push constants. The GPU hang caused by push constants is apparently fixed, so let's enable them again. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:44 -08:00
Rafael Antognolli	78c125af39	anv/gen10: Ignore push constant packets during context restore. Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a context restore. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:40 -08:00
Rafael Antognolli	ca19ee33d7	i965/gen10: Ignore push constant packets during context restore. These packets were causing GPU hangs when the context was restored, possibly because they were pointing to BO's that were already unreferenced. So we tell the hardware to ignore such packets after the batch buffer ends, since we know those BO's are not around anymore. This change fixes GPU hangs on CNL. The (partial) solution to this problem so far was to entirely disable push constants on this platform. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:35 -08:00
Brian Paul	acaec6cdd9	mesa: silence MinGW 'may be used uninitialized' warning in get.c The warning happens on line 2114 for the memcpy(data, p, size) call. I'm not sure why that generates the warning but not the earlier use of p in the code. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 10:44:05 -07:00
Eleni Maria Stea	8096b558a7	mesa: Fix function pointers initialization in state tracker We assigned the function that gets the device uuid to the GetDriverUuid function pointer and the function that gets the driver uuid to the GetDeviceUuid function pointer inside the state tracker. Exchanged the pointers. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-26 08:17:55 -07:00
Iago Toral Quiroga	d3ce493b34	anv/pipeline: remove the pipeline layout field from anv_pipeline It no longer has any users. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:47 +01:00
Iago Toral Quiroga	75a4802060	anv/cmd_buffer: add the pipeline layout to the pipeline state We need to access the pipeline layout to compute correct dynamic offsets for dynamic UBO/SSBO descriptors when we emit draw commands. Instead of taking it from the pipeline object, store the layout in the command buffer pipeline state. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:47 +01:00
Iago Toral Quiroga	e1a49f974b	anv/pipeline: don't take the layout from the pipeline to compile shaders The Vulkan spec states that VkPipelineLayout objects must not be destroyed while any command buffer that uses them is in the recording state, but it permits them to be destroyed otherwise. This means that applications are allowed to free pipeline layouts after command recording is finished even if there are pipeline objects that still exist and were created with these layouts. There are two solutions to this, one is to use reference counting on pipeline layout objects. The other is to avoid holding references to pipeline layouts where they are not really needed. This patch takes a step towards the second option by making the pipeline shader compile code take pipeline layout from the VkGraphicsPipelineCreateInfo provided rather than the pipeline object. A follow-up patch will remove any remaining uses of the layout field so we can remove it from the pipeline object and avoid the need for reference counting. v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason) Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:46 +01:00
Iago Toral Quiroga	14f6275c92	anv/descriptor_set: add reference counting for descriptor set layouts The spec states that descriptor set layouts can be destroyed almost at any time: "VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout, and those descriptor sets must not be updated with vkUpdateDescriptorSets after the descriptor set layout has been destroyed. Otherwise, descriptor set layouts can be destroyed any time they are not in use by an API command." v2: allocate off the device allocator with DEVICE scope (Jason) Fixes the following work-in-progress CTS tests: dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:46 +01:00
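The reference counting described in the entry above can be sketched as follows. This is only an illustration in Python, not the driver's C code: the class and function names (`DescriptorSetLayout`, `layout_ref`, `layout_unref`, `DescriptorSet`) are hypothetical, and the sketch assumes that each descriptor set allocated from a layout holds one reference while the API handle holds the initial one.

```python
class DescriptorSetLayout:
    """Hypothetical stand-in for a driver-side layout object."""

    def __init__(self):
        self.ref_cnt = 1        # reference owned by the API handle
        self.destroyed = False  # set once the last reference is dropped


def layout_ref(layout):
    """Take an additional reference on a live layout."""
    assert layout.ref_cnt >= 1
    layout.ref_cnt += 1
    return layout


def layout_unref(layout):
    """Drop one reference; the layout goes away with the last one."""
    assert layout.ref_cnt >= 1
    layout.ref_cnt -= 1
    if layout.ref_cnt == 0:
        layout.destroyed = True  # a real driver would free memory here


class DescriptorSet:
    """Hypothetical descriptor set that keeps its layout alive."""

    def __init__(self, layout):
        self.layout = layout_ref(layout)

    def free(self):
        layout_unref(self.layout)
        self.layout = None
```

With this scheme, destroying the layout handle while a set still exists only drops the handle's reference; the object is released when the last set is freed.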
Samuel Pitoiset	e28233a527	ac/nir: set amdgpu.uniform and invariant.load for SSBOs For descriptors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	49b0a140a7	ac/nir: set amdgpu.uniform and invariant.load for UBOs UBOs are constants buffers. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Fixes: `41c36c45` ("amd/common: use ac_build_buffer_load() for emitting UBO loads") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	b453f38a47	ac/nir: set the noalias attribute on input pointers This attribute is similar to the definition of restrict in C99 and it might help LLVM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	310d17fcf1	ac: only load used channels when sampling buffer views This reduces the number of dwords that are loaded with buffer_load_format_xyzw. For example, when the only used channel is 1, the driver will emit buffer_load_format_x instead. Shader stats for DOW3 (with some local hacky scripts for SPIRV): 143 shaders in 143 tests Totals: SGPRS: 5344 -> 5352 (0.15 %) VGPRS: 3476 -> 3452 (-0.69 %) Spilled SGPRs: 30 -> 29 (-3.33 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 269860 -> 269808 (-0.02 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1267 -> 1272 (0.39 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
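A rough sketch of how a used-channel mask could be turned into a narrower load, as the entry above describes. The helper names are hypothetical, and the sketch assumes that a load always starts at the x channel and covers a contiguous run up to the highest used channel; the actual commit may choose the channel count differently.

```python
def num_channels_to_load(used_mask):
    """Return how many leading channels cover every bit set in used_mask.

    used_mask is a 4-bit mask: bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w.
    """
    if used_mask < 0 or used_mask > 0xF:
        raise ValueError("used_mask must fit in four bits")
    # bit_length() is the index of the highest set bit plus one;
    # load at least one channel even if nothing is marked as used.
    return max(used_mask.bit_length(), 1)


def load_suffix(used_mask):
    """Return the channel suffix of the load, e.g. 'x' or 'xyz'."""
    return "xyzw"[:num_channels_to_load(used_mask)]
```

Under these assumptions a mask of `0b0001` selects the `x` variant, while `0b0101` still needs `xyz` because the load is contiguous.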
Samuel Pitoiset	51e14bc3c0	ac: pass the number of channels to ac_build_buffer_load_format() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	d7c93b558a	ac: add ac_build_buffer_load_common() helper For both versions of llvm.amdgcn.buffer.load.{format}.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	6d07e443ba	radv: fix RADV_DEBUG=syncshaders on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	5391de1262	radv: fix a GPU hang with RADV_DEBUG=syncshaders The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after a dispatch call (and vice versa for graphics). Something has changed in the kernel driver because it used to work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	b358e0e67f	ac/shader: scan if fragment shaders write memory It's better to do that in ac_shader_info. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	b9e2f78d6e	ac/nir: only canonicalize 32-bit float min/max outputs on pre-GFX9 According to LLVM, only pre-GFX9 targets do not flush denorms for fmin/fmax. All dEQP-VK.glsl.builtin.precision.* still pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Jason Ekstrand	c8949e2498	anv/pipeline: Don't look at blend state unless we have an attachment Without this, we may end up dereferencing blend before we check for binding->index != UINT32_MAX. However, Vulkan allows the blend state to be NULL so long as you don't have any color attachments. This fixes a segfault when running The Talos Principle. Fixes: `12f4e00b69` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-26 01:44:45 -08:00
Maxin B. John	8116b9170b	anv_icd.py: improve reproducible builds Sort the output to ensure build reproducibility Signed-off-by: Maxin B. John <maxin.john@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Fixes: `0ab04ba979` ("anv: Use python to generate ICD json files") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 01:37:45 -08:00
Ian Romanick	c7deeb71a8	nouveau: Remove no-op nvgl_logicop_func function The values that this function returned were always the values passed in. The only thing that happened was either an assertion or undefined results when an unknown value was passed in. This doesn't seem that useful. Most of nouveau_gldefs.h could be removed in this manner. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-01-26 11:21:46 +08:00
Ian Romanick	f5b9c2a6e3	i915: Silence unused parameter warnings ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_alloc_window_storage’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:290:48: warning: unused parameter ‘ctx’ [-Wunused-parameter] intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer rb, ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_nop_alloc_storage’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:303:74: warning: unused parameter ‘rb’ [-Wunused-parameter] intel_nop_alloc_storage(struct gl_context ctx, struct gl_renderbuffer rb, ^~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:32: warning: unused parameter ‘internalFormat’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:55: warning: unused parameter ‘width’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:69: warning: unused parameter ‘height’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_bind_framebuffer’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:47: warning: unused parameter ‘fb’ [-Wunused-parameter] struct gl_framebuffer fb, struct gl_framebuffer fbread) ^~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:74: warning: unused parameter ‘fbread’ [-Wunused-parameter] struct gl_framebuffer fb, struct gl_framebuffer fbread) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_renderbuffer_update_wrapper’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:422:57: warning: unused parameter ‘intel’ [-Wunused-parameter] intel_renderbuffer_update_wrapper(struct intel_context intel, ^~~~~ 
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_blit_framebuffer_with_blitter’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:644:61: warning: unused parameter ‘filter’ [-Wunused-parameter] GLbitfield mask, GLenum filter) ^~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	39f875a6b7	i915: Make intelEmitCopyBlit static And rename to emit_copy_blit. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	9eed6bea6b	i965: Make intelEmitCopyBlit static And rename to emit_copy_blit. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	4e9e964de6	i915: Use enum color_logic_ops for blits v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	21be331401	i965: Use enum color_logic_ops for blits v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	0aaa27f291	mesa: Pass the translated color logic op dd_function_table::LogicOpcode And delete the resulting dead code. This has only been compile-tested. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	cf0b26ec12	st/mesa: Use the translated color logic op from the context And delete the resulting dead code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	0c69db895f	i965: Use the translated color logic op from the context And delete the resulting dead code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	9c1f010f34	mesa: Also track a remapped version of the color logic op With the exception of NVIDIA hardware, these are the values that all hardware and Gallium want. The remapping is currently implemented in at least 6 places. This starts the process of consolidating to a single place. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Added some comments about the selection of bit patterns for gl_logicop_mode and the GLenums. Suggested by Nicolai. Folded the GLenum_to_color_logicop macro into its only users. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> [v1] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
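To illustrate the kind of remapping the entry above refers to, here is a minimal sketch that derives a 4-bit value for each GL logic op from its truth table. It assumes the remapped encoding is the truth table indexed by `(src << 1) | dst`, which is how Gallium's `PIPE_LOGICOP_*` values are laid out; whether `gl_logicop_mode` uses exactly this encoding is an assumption here, and the helper names are hypothetical. The GLenum values are the standard OpenGL constants.

```python
# Standard GLenum values paired with the boolean function each op computes
# on a single source bit s and destination bit d.
GL_LOGIC_OPS = {
    0x1500: ("GL_CLEAR",         lambda s, d: 0),
    0x1501: ("GL_AND",           lambda s, d: s & d),
    0x1502: ("GL_AND_REVERSE",   lambda s, d: s & ~d),
    0x1503: ("GL_COPY",          lambda s, d: s),
    0x1504: ("GL_AND_INVERTED",  lambda s, d: ~s & d),
    0x1505: ("GL_NOOP",          lambda s, d: d),
    0x1506: ("GL_XOR",           lambda s, d: s ^ d),
    0x1507: ("GL_OR",            lambda s, d: s | d),
    0x1508: ("GL_NOR",           lambda s, d: ~(s | d)),
    0x1509: ("GL_EQUIV",         lambda s, d: ~(s ^ d)),
    0x150A: ("GL_INVERT",        lambda s, d: ~d),
    0x150B: ("GL_OR_REVERSE",    lambda s, d: s | ~d),
    0x150C: ("GL_COPY_INVERTED", lambda s, d: ~s),
    0x150D: ("GL_OR_INVERTED",   lambda s, d: ~s | d),
    0x150E: ("GL_NAND",          lambda s, d: ~(s & d)),
    0x150F: ("GL_SET",           lambda s, d: 1),
}


def truth_table_bits(fn):
    """Pack fn(s, d) for all four inputs into bits indexed by (s << 1) | d."""
    bits = 0
    for s in (0, 1):
        for d in (0, 1):
            if fn(s, d) & 1:  # keep only the low bit of the result
                bits |= 1 << ((s << 1) | d)
    return bits


def remap_logic_op(gl_enum):
    """Translate a GL logic op enum into its 4-bit truth-table value."""
    _name, fn = GL_LOGIC_OPS[gl_enum]
    return truth_table_bits(fn)
```

Computing the table once, in one place, is the point of the consolidation: every consumer that wants the truth-table encoding can read the stored value instead of repeating its own switch statement.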
Bas Nieuwenhuizen	5a3404d443	radeonsi: Export signalled sync file instead of -1. -1 is considered an error for EGL_ANDROID_native_fence_sync, so we need to actually create a sync file. Fixes: `f536f45250` "radeonsi: implement sync_file import/export" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-26 01:26:53 +01:00
Jason Ekstrand	db682b8f0e	i965/fs: Reset the register file to VGRF in lower_integer_multiplication `18fde36ced` changed the way temporary registers were allocated in lower_integer_multiplication so that we allocate regs_written(inst) space and keep the stride of the original destination register. This was to ensure that any MUL which originally followed the CHV/BXT integer multiply regioning restrictions would continue to follow those restrictions even after lowering. This works fine except that I forgot to reset the register file to VGRF so, even though they were assigned a number from alloc.allocate(), they had the wrong register file. This caused some GLES 3.0 CTS tests to start failing on Sandy Bridge due to attempted reads from the MRF: ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64 This commit remedies this problem by, instead of copying inst->dst and overwriting nr, just make a new register and set the region to match inst->dst. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626 Fixes: `18fde36ced` Cc: "17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-01-25 13:58:55 -08:00
Jason Ekstrand	af9d4ce480	vulkan: Update the XML and headers to 1.0.68 Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Chad Versace <chadversary@chromium.org>	2018-01-25 13:30:05 -08:00
Dave Airlie	f4c534ef68	radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1) This seems to be broken, at least the cts tests fail. This fixes: dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4 dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8 2 samples seems to pass fine, amdvlk doesn't appear to enable TC for possibly some other reasons here. This is most likely a hack. v1.1: add a bit of explanation text. (Samuel) Fixes: `ad3d98da9` (radv: enable tc compatible htile for d32s8 also.) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-26 06:55:09 +10:00
Chuck Atkins	6ac5e851f1	configure.ac: add missing llvm dependencies to .pc files v2: Only add as dependencies for gallium-osmesa and gallium-xlib CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 14:54:08 -05:00
George Kyriazis	5d8f270d10	swr/rast: Optimize DumpToFile output size Modify DumpToFile to only dump the function, not the entire module. Reduces file sizes and speeds up the dumping. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	dfe4dd48ec	swr/rast: Updated copyright dates on knob-related files. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	36dbbf11a0	swr/rast: Move memory-related JIT functions Move them to their own file (builder_mem.{h\|cpp}). Add builder_mem.cpp to the build system. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	94922dbe4b	swr/rast: Add extra (optional) parameter in GATHERPS Now also takes in an additional parameter (draw context) for future expansion. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	0b46c7b3b0	swr/rast: Better ExecCmd (i.e. system()) implementation Hides console window creation during JIT linker execution in apps that don't have a console. Remove hooking of CreateProcessInternalA - the MSFT implementation just turns around and calls CreateProcessInternalW, which we do hook. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	2d16b61bff	swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0. Fix it so it works there, too. Please note that the default setting is USE_SIMD16_FRONTEND=1. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
Brian Paul	123798eb44	mesa: whitespace fixes in attrib.c Trivial.	2018-01-25 12:17:26 -07:00
Brian Paul	0e7aaaf5a5	mesa: whitespace fixes in varray.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	ba01589c0c	mesa: include mtypes.h in varray.h We actually use some of the types from mtypes.h so include it directly instead of relying on indirectly including it via bufferobj.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	e4504be6fc	mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments The structure type was renamed some time ago, but some comments were not updated. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	6c724fb7c1	mesa: simplify _mesa_delete_list() a bit, add some assertions All but two cases of the switch did the same n += InstSize[n[0].opcode] instruction. Just move it after the switch. Add some sanity check assertions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	c860171c63	st/mesa: expand glDrawPixels cache to handle multiple images The newest version of WSI Fusion makes several glDrawPixels calls per frame. By caching more than one image, we get better performance when panning/zooming the map. v2: move pixel unpack param checking out of cache search loop, per Roland v3: also move unpack->BufferObj check out of loop, per Roland.	2018-01-25 12:17:26 -07:00
Brian Paul	5092610f29	st/mesa: add some debug code in st_choose_format() To aid in debugging gallium surface format selection issues. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	94610758a3	svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult And fix whitespace. To sync up with in-house code. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 11:56:33 -07:00
Emil Velikov	6aeef54644	configure.ac: correct driglx-direct help text The default was toggled a while back, but the text wasn't updated. Fixes: `bd526ec9e1` ("configure: Always default to --enable-driglx-direct") Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-01-25 17:44:35 +00:00
Emil Velikov	7b744a494d	swrast: remove non-applicable GLX_SWAP_COPY_OML comment Noticed while skimming for GLX_ instances in the dri codebase. Comment is completely off and was in such a state since day 1. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:57 +00:00
Emil Velikov	3e3956d6ae	mapi: remove duplicate GL typedefs Remove the instances already available in gl.h or glext.h. Sadly GLclampx is only available in GLES(1) so we need to keep that one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:50 +00:00
Emil Velikov	647f40298a	mapi: remove non applicable HAVE_DIX_CONFIG_H hunk Seeming artefact from when the xserver build was diving directly into mesa's tree. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:48 +00:00
Emil Velikov	48e7bc6833	mapi: autotools: remove unused MAPI_FILES file list The sole user was OpenVG, which was removed a couple of years ago. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:46 +00:00
Emil Velikov	785d9a4ed8	automake: st/mesa/tests: add st_tests_common.h to the tarball Fixes: `6569b33b6e` ("mesa/st/tests: unify MockCodeLine* classes") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	0beaf7ad3e	automake: mesa: include vbo_private.h in the tarball Fixes: `a7cfec3be0` ("vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	ac4437b20b	automake: small cleanup after the meson.build inclusion Namely extend the EXTRA_DIST list, instead of re-assigning it and bring back a file dropped by mistake. Fixes: `436ed65d38` ("autotools: include meson build files in tarball") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	50265cd9ee	automake: anv: ship anv_extensions_gen.py in the tarball Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	265d36c890	automake: vc5: remove non-applicable v3dx_simulator.h Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:28 +00:00
Roland Scheidegger	4fe662c58f	gallivm: fix crash with seamless cube filtering with different min/mag filter We are not allowed to modify the incoming coords values, or things may crash (as we may be inside a llvm conditional and the values may be used in another branch). I recently broke this when fixing an issue with NaNs and seamless cube map filtering, and it causes crashes when doing cubemap filtering if the min and mag filters are different. Add const to the pointers passed in to prevent this mishap in the future. Fixes: `a485ad0bcd` ("gallivm: fix an issue with NaNs with seamless cube filtering") Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-01-25 18:03:38 +01:00
Eric Engestrom	57223fb07a	egl: keep extension list sorted, per comment at the top Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-01-25 16:38:11 +00:00
George Kyriazis	0e879aad2f	swr/rast: support llvm 3.9 type declarations LLVM 3.9 was not taken into account in initial check-in. Fixes: `01ab218bbc` ("swr/rast: Initial work for debugging support.") cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749 Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 08:22:52 -06:00
Samuel Pitoiset	e1331c9d61	ac/nir: add break statements in needs_view_index_sgpr() Previous code is correct but as the first case statement uses a break, keep it consistent. CID: `1428579` Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-25 13:59:52 +01:00
Eric Engestrom	0663ae0aa1	loader: let compiler figure out the length of the string Basically, turn comment into code Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 11:40:25 +00:00
Eric Engestrom	57b0ccd178	meson: simplify dri3 logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-25 10:10:04 +00:00
Juan A. Suarez Romero	513c2263cb	mesa: add missing RGB9_E5 format in _mesa_base_fbo_format This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-25 09:54:31 +01:00
Jason Ekstrand	df13588d21	i965: Stop disabling aux during texture preparation Previously, we were handling self-dependencies by marking the render buffer and then passing disable_aux=true to prepare_texture so that it would do a resolve. This works but ends up doing too much resolving in some cases. Specifically, if we're doing something such as mipmap generation, this would cause us to resolve all levels of the texture if even one of them is overlapping. Instead, this commit makes us wait until we process the framebuffer to do these resolves and we only resolve the slices needed for rendering. Doing this resolve puts them into the pass-through state so, even if we do texture using CCS_E, the CCS data will effectively be ignored and the real surface contents read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	20f70ae385	i965/draw: Set NEW_AUX_STATE when draw aux changes Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383 Fixes: `ea0d2e98ec` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	e52a9f18d6	i965: Replace draw_aux_buffer_disabled with draw_aux_usage Instead of keeping an array of booleans, we now hang onto an array of isl_aux_usage enums. This means that the thing we are passing from brw_draw.c to surface state setup is the thing that surface state setup actually needs instead of an input to compute what it needs. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	468ea3cc45	i965/surface_state: Drop brw_aux_surface_disabled The only purpose of this function is to disable aux on texture surfaces when the corresponding renderbuffer has aux disabled. However, the act of disabling aux on the renderbuffer will cause it to be resolved and intel_miptree_texture_aux_usage will already check the resolved status of a texture and return ISL_AUX_USAGE_NONE for it. Even if we used CCS for it, that wouldn't really be a problem because the CCS will be in the pass-through state and so it would effectively be ignored. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	d38ec24f53	i965/miptree: Add an aux_disabled parameter to render_aux_usage Only one of the callers of intel_miptree_render_aux_usage actually took brw->draw_aux_buffer_disabled into account. This was causing us to ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render. This isn't a problem because the draw_aux_buffer_disabled entry was set during texture preparation and we already did the resolve at that time. However, this also meant that the aux_usage we were passing to brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our automatic cache flushing around aux_usage changes wasn't happening. This was causing GPU hangs in Oxenfree. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383 Fixes: `ea0d2e98ec` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	dfe0217905	i965/miptree: Take an aux_usage in prepare/finish_render Both callers of intel_miptree_prepare/finish_render have to call intel_miptree_render_aux_usage anyway for other reasons. They may as well pass the result in instead of us calling it again. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	7d4007d58a	aubinator: Multiply count by 4 to compute buffer sizes The count field is in terms of dwords and not bytes.	2018-01-24 19:05:36 -08:00
Timothy Arceri	e776791432	st/glsl_to_nir: remove reallocation of sampler/image location As far as I can tell this always just reassigns the same value. Also as we don't currently store UniformHash in the shader cache removing this will help with adding a shader cache to gallium nir drivers. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-01-25 13:27:22 +11:00
Jordan Justen	62b68d05e7	docs: add 18.1.0-devel release notes template Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-24 17:10:58 -08:00
Jordan Justen	65c18b02fc	mesa: bump version to 18.1.0-devel Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-24 17:10:58 -08:00
Greg V	8fae5eddd9	meson: handle LLVM 'x.x.xgit-revision' versions When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version becomes something like 5.0.0git-f8ab206b2176. New meson versions already handle this, but we support older versions too. Fixes: `673dda8330` ("meson: build "radv" vulkan driver for radeon hardware") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-24 15:25:54 -08:00
Greg V	53f9131205	meson: fix getting cflags from pkg-config get_pkgconfig_variable('cflags') always returns an empty list, it's a function for getting custom variables. Meson does not yet support asking for cflags, so explicitly invoke pkg-config for now. Fixes: `68076b8747` ("meson: build gallium vdpau state tracker") Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker") Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-24 15:25:54 -08:00
Greg V	c38c60a63c	meson: fix BSD build CC: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-24 15:25:54 -08:00
Greg V	7c8cfe2d59	meson: fix missing dependencies Fixes: `66f97f6640` ("meson: build radeonsi") Reviewed-by: Emil Velikov <emil.velikov@colalbora.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-24 15:25:54 -08:00
Grazvydas Ignotas	0cc7370733	anv: correct a duplicate check in an assert Looks like checking both sources was intended, instead of the first one twice. Found with Coccinelle, coccinellery/xand/xand.cocci semantic patch. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-25 01:10:45 +02:00
Marc Dietrich	a2a1b0e75e	meson: fix HAVE_LLVM version define in meson build LLVM patch level is not included in HAVE_LLVM. Fixes: `e6418ab156` ("meson: build "radv" vulkan driver for radeon hardware") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Marc Dietrich <marvin24@gmx.de>	2018-01-24 14:04:20 -08:00
Dylan Baker	5781c3d1db	meson: correctly set SYSCONFDIR for loading drirc Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-24 13:10:32 -08:00
Dave Airlie	d2414e64e4	radv: add multisample Z optimisation from amdvlk This was just found while reading for other stuff, src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-25 06:48:11 +10:00
Dave Airlie	298554541d	radv: move spi_baryc_cntl to pipeline We need to enable the pos float location 2 mode anytime we have persample not just when forced by the frag shader. This fixes: dEQP-VK.pipeline.multisample.min_sample_shading* Fixes: `58c97a079` (radv: enable location at sample when persample is forced.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-25 06:47:28 +10:00
Marek Olšák	125c0529f3	gallium/u_tests: add texture_barrier and FBFETCH tests Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-24 21:08:45 +01:00
Marek Olšák	022c5b22fe	radeonsi: don't ignore pitch for imported textures Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-24 21:08:45 +01:00
Scott D Phillips	0b8d38bd48	meson: Fix define for USE_SSE41 Before we were adding -DHAVE_SSE41 which isn't what the code is looking for, so some uses of the sse4.1 code were always being skipped. v2: Don't add any compile check for the quite old -msse4.1 option (Dylan) Fixes: `84486f6462` ("meson: Enable SSE4.1 optimizations") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-24 11:32:34 -08:00
Gert Wollny	8172b9ff48	mesa/st/glsl_to_tgsi: remove now unneeded assert. With the implementation of the tracking of the registers used in reladdr asserting that a driver calling merge_register() uses the address register is no longer needed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:34:05 -07:00
Gert Wollny	f2040fbe48	mesa/st/tests: Add tests for lifetime tracking with indirect addressing Add a code line type that accepts one layer of indirect addressing and add tests to check that temporary register access used for indirect addressing is accounted for in the lifetime estimation. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:34:00 -07:00
Gert Wollny	51c0cee267	mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers So far indirect addressing was not tracked to estimate the temporary life time, and it was not needed, because code to load the address registers was always emitted eliminating the reladdr* handles in the past glsl_to_tgsi stages. Now, with Marek's patch allowing any 1D register to be used for addressing on some hardware this changed, and the tracking becomes necessary. Because the registers have no direct indication on whether the reladdr* was already loaded into an address register, the temporaries in reladdr* are always tracked as reads. This may result in a slight over-estimation of the lifetime in the cases when the load to the address register was emitted. v2: no changes v3: Use debug_log variable instead of directly writing to std::err in debugging output. v6: fix indentation and typos Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	517e34c62f	mesa/st/tests: Add tests for improved tracking of temporaries Additional tests are added that check the tracking of access to temporaries in if-else branches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	807e2539e5	mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging Improve the life-time evaluation of temporary registers by also tracking writes in both if and else branches and in up to 32 nested scopes. As a result the estimated required register life-times can be further reduced enabling more registers to be merged. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	8dda01ef5a	mesa/st/tests: cleanup whitespace usage and correct some comments Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	6569b33b6e	mesa/st/tests: unify MockCodeLine* classes * Merge the classes MockCodeLine and MockCodelineWithSwizzle into one, and refactor tests accordingly. * Change memory allocations to use ralloc* interface. v2: * move the test classes into a convenience library * rename the Mock* classes to Fake* since they are not really Mocks * Base assertion of correct number of src and dst registers in tests on what the operatand actually expects * Fix number of destinations in one test v6: * fix local includes using "..." instead of <...> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	ad1990629e	mesa/st/tests: Fix zero-byte allocation leaks Don't allocate a zero-sized array, when no texture offsets are given. v5: correct spaces and empty lines Reviewed-by: Brian Paul <brianp@vmware.com>(v4) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	ee48e3acb8	mesa/st/glsl_to_tgsi: Add some operators for glsl_to_tgsi related classes Add the equal operator and the "<<" stream write operator for the st_*_reg classes and the "<<" operator to the instruction class, and make use of these operators in the debugging output. v5: Fix empty lines Reviewed-by: Brian Paul <brianp@vmware.com> (v4) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	6a3421078a	mesa/program: Add missing file types to printout Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Brian Paul	365a48abdd	vbo: fix incorrect min/max_index values in display list draw call This fixes another regression from commit `8e4efdc895` ("vbo: optimize some display list drawing"). The problem was the min_index, max_index values passed to the vbo drawing function were not computed to compensate for the biased prim::start values. https://bugs.freedesktop.org/show_bug.cgi?id=104746 https://bugs.freedesktop.org/show_bug.cgi?id=104742 https://bugs.freedesktop.org/show_bug.cgi?id=104690 Tested-by: Clayton Craft <clayton.a.craft@intel.com> Fixes: `8e4efdc895` ("vbo: optimize some display list drawing") Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-01-24 10:12:49 -07:00
Brian Paul	2123bd2805	vbo: whitespace/formatting fixes in vbo_split_inplace.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	6b0109cf39	vbo: whitespace/formatting fixes in vbo.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	b9280031a8	vbo/i965: move vbo_all_varyings_in_vbos() to brw_draw.c It's only used in brw_draw_prims(). s/GLboolean/bool/, etc. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a83f7e119c	vbo: remove unused vbo_any_varyings_in_vbos() function Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	718f4251c5	vbo: remove unneeded #includes Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	f4376a0c2b	vbo: remove vbo_context.h and change includes to use vbo.h instead Now vbo.h is the public interface to the VBO module. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	aafb56a148	vbo: move remaining items from vbo_context.h to vbo.h Non-VBO source files sometimes included vbo.h while others included vbo_context.h. We're moving all public types, functions to the former. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a7cfec3be0	vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header Things which should not be used outside the VBO module. More public/private clean-ups coming. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	d40fa42292	mesa: use new _vbo_install_exec_vtxfmt() function Instead of reaching into the vbo_context object in vtxfmt.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	04a17ec327	nouveau: remove vbo_context() call _vbo_DestroyContext() can be safely called even if there's no VBO module. Removes a dependency on the vbo_context() function. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	7b0ae96711	i965: use vbo_set_[indirect]_draw_func() Instead of poking into the vbo_context object. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	3bbf8d9042	vbo: move vbo_sizeof_ib_type() into vbo_exec_array.c It's only used in this one file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a152cb7492	mesa: move vbo_count_tessellated_primitives() to api_validate.c It's only used in this file and has nothing VBO-specific about it. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	5d3e10fd27	mesa: update comment on gl_display_list Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	cffa82327d	mesa: whitespace clean-ups in mtypes.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	b3a1aa94d9	mesa: remove unused MAT_INDEX_AMBIENT/DIFFUSE/SPECULAR constants Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	67dc551ba9	vbo: move DLIST_DANGLING_REFS from mtypes.h to vbo_save_api.c It's only used in this file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	cb7ef0df00	vbo: replace assert(0) with unreachable() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	8b3cb7c651	vbo: fix, add comment in vbo_save.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	67ebde19d4	vbo: whitespace, formatting fixes in vbo_split.[ch] Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Topi Pohjolainen	ec4bb693a0	i965: Don't try to disable render aux buffers for compute Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-24 10:54:08 +02:00
Jason Ekstrand	4064fe59e7	anv/cmd_buffer: Move gen7 index buffer state to graphics state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:46 -08:00
Jason Ekstrand	38ec78049f	anv/cmd_buffer: Move num_workgroups to compute state While we're here, make it an anv_address. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:44 -08:00
Jason Ekstrand	95ff232294	anv/cmd_buffer: Move dynamic state to graphics state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:43 -08:00
Jason Ekstrand	24caee8975	anv/cmd_buffer: Use a temporary variable for dynamic state We were already doing this for some packets to keep the lines shorter. We may as well just do it for all of them. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:40 -08:00
Jason Ekstrand	8bd5ec5b86	anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state Vertex buffers are entirely a graphics pipeline thing. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:39 -08:00
Jason Ekstrand	e85aaec148	anv/cmd_buffer: Move dirty bits into anv_cmd_*_state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:36 -08:00
Jason Ekstrand	97f96610c8	anv: Separate compute and graphics descriptor sets The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Up until now, we've been ignoring the pipeline bind point and had just one bind point for everything. This commit separates things out into separate bind points. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897 Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:33 -08:00
Jason Ekstrand	31b2144c83	anv/cmd_buffer: Use anv_descriptor_for_binding for samplers Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:31 -08:00
Jason Ekstrand	b9e1ca16f8	anv/cmd_buffer: Add a helper for binding descriptor sets This lets us unify some code between push descriptors and regular descriptors. It doesn't do much for us yet but it will. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:30 -08:00
Jason Ekstrand	90cceaa9dd	anv/cmd_buffer: Refactor ensure_push_descriptor_set It's now a function which returns the push descriptor set. Since we set the error on the command buffer, returning the error is a little redundant. Returning the descriptor set (or NULL on error) is more convenient. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:28 -08:00
Jason Ekstrand	d5592e2fda	anv: Remove semicolons from vk_error[f] definitions With the semicolons, they can't be used in a function argument without throwing syntax errors. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:27 -08:00
Jason Ekstrand	9af5379228	anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute Initially, these just contain the pipeline in a base struct. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:25 -08:00
Jason Ekstrand	ddc2d28548	anv/cmd_buffer: Use some pre-existing pipeline temporaries There are several places where we'd already saved the pipeline off to a temporary variable but, due to an artifact of history, weren't actually using that temporary everywhere. No functional change. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:24 -08:00
Jason Ekstrand	cd3feea745	anv/cmd_buffer: Rework anv_cmd_state_reset This splits anv_cmd_state_reset into separate init and finish functions. This lets us share init code with cmd_buffer_create. This potentially fixes subtle bugs where we may have missed some bit of state that needs to get initialized on command buffer creation. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:22 -08:00
Jason Ekstrand	d6c9a89d13	anv/cmd_buffer: Get rid of the meta query workaround Meta has been gone for a long time. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:20 -08:00
Jason Ekstrand	bc0a21e348	anv/cmd_state: Drop the scratch_size field This is a legacy left-over from the mechanism we used to use to handle scratch. The new (and better) mechanism doesn't use this. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:19 -08:00
Jason Ekstrand	4b69ba3817	anv/pipeline: Don't assert on more than 32 samplers This prevents an assert when running one unreleased Vulkan game. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:08 -08:00
Dave Airlie	766589d89a	radv: fix sample_mask_in loading. (v3.1) This is ported from radeonsi and fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_* v2: don't call this path for radeonsi, it does it in the epilog. use the radeonsi code path. v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel) v3.1: set ps_iter_samples default to 1 (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `bdcbe7c76` (radv: add sample mask input support) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 14:25:11 +10:00
Dave Airlie	c727ea9370	radv: don't use hw resolves for r16g16 norm formats. radeonsi has a workaround for this, but it uses a R16A16 format, which vulkan doesn't have, we could probably come up with a work around but for now just avoid hw resolves. Fixes: dEQP-VK.renderpass.suballocation.multisample.r16g16_norm Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `2a04f5481d` (radv/meta: select resolve paths) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 09:01:12 +10:00
Dave Airlie	4df414bbd2	radv: don't use hw resolve for integer image formats From reading AMDVLK it currently never uses hw resolve paths. This patch takes from radeonsi which doesn't use hw resolve for integer formats, and does the same for radv. This fixes: dEQP-VK.renderpass.suballocation.multisample*uint tests. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `2a04f5481d` (radv/meta: select resolve paths) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 08:53:18 +10:00
Dave Airlie	316d762186	radv: add fs_key meta format support to resolve passes. Some of the hw resolve passes need the SPI color format setup correctly. This fixes lots of 16-bit and 32-bit format tests in dEQP-VK.renderpass.suballocation.multisample* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 08:50:51 +10:00
Grazvydas Ignotas	224fd17e1e	winsys/svga: check correct member after create .mob_fenced was already checked, probably a copy-paste bug. Found by Coccinelle. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-23 11:04:07 -07:00
Grazvydas Ignotas	08085df313	svga: fix context alloc error handling 'cleanup' path is dereferencing 'svga' a lot, 'done' is a better choice. Found by Coccinelle. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-23 11:04:07 -07:00
Christoph Haag	4b4d929c27	meson: remove lib prefix from libd3dadapter9.so Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-23 09:30:30 -08:00
Emil Velikov	3b6d232a5c	docs: update calendar 18.0.0-rc1 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-23 17:02:17 +00:00
Eric Engestrom	eee8dd7c33	radeon: remove left over dead code Fixes: `4e0d99a635` "r100: Use shared debug code" Cc: Pauli Nieminen <suokkos@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-23 15:39:57 +00:00
Eric Engestrom	10f5e0dce2	docs: ask for backport nominations to cc: the author Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-23 15:39:57 +00:00
Marc Dietrich	911ca587f8	meson: fix some defines misspelled errors in meson.build Defines - HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL - HAVE_FUNC_ATTRIBUTE_VISIBILITY were misspelled. Signed-off-by: Marc Dietrich <marvin24@gmx.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-23 15:39:57 +00:00
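
Commit 8fae5eddd9 in the list above describes LLVM reporting versions such as 5.0.0git-f8ab206b2176 when it is built inside a git checkout. The following is a minimal Python sketch of that idea only, not the code from the commit: the function name parse_llvm_version is hypothetical, and the sketch simply keeps the leading numeric components and ignores any trailing "git-<revision>" suffix.

```python
import re


def parse_llvm_version(raw):
    """Return (major, minor, patch) from an LLVM version string.

    Accepts plain versions such as '5.0.0' as well as versions carrying a
    trailing suffix such as '5.0.0git-f8ab206b2176'. Hypothetical helper
    written for illustration; it is not part of Mesa or Meson.
    """
    # Match only the leading 'x.y.z' digits; anything after them is ignored.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", raw)
    if match is None:
        raise ValueError("unrecognized LLVM version string: %r" % raw)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    print(parse_llvm_version("5.0.0git-f8ab206b2176"))
```

Comparing the resulting tuple (rather than the raw string) keeps version checks working regardless of whether the suffix is present.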

2216 changed files with 191951 additions and 97134 deletions


.mailmap


@@ -148,6 +148,8 @@ Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
 Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
 Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
 Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
 Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>

									

.travis.yml
									
												
@@ -17,12 +17,14 @@ env:
     - DRI2PROTO_VERSION=dri2proto-2.8
     - LIBPCIACCESS_VERSION=libpciaccess-0.13.4
     - LIBDRM_VERSION=libdrm-2.4.74
-    - XCBPROTO_VERSION=xcb-proto-1.11
-    - LIBXCB_VERSION=libxcb-1.11
+    - XCBPROTO_VERSION=xcb-proto-1.13
+    - RANDRPROTO_VERSION=randrproto-1.3.0
+    - LIBXRANDR_VERSION=libXrandr-1.3.0
+    - LIBXCB_VERSION=libxcb-1.13
     - LIBXSHMFENCE_VERSION=libxshmfence-1.2
     - LIBVDPAU_VERSION=libvdpau-1.1
-    - LIBVA_VERSION=libva-1.6.2
-    - LIBWAYLAND_VERSION=wayland-1.11.1
+    - LIBVA_VERSION=libva-1.7.0
+    - LIBWAYLAND_VERSION=wayland-1.15.0
     - WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
     - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
     - LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"

@@ -33,18 +35,18 @@ matrix:
     - env:
         - LABEL="meson Vulkan"
         - BUILD=meson
-        - MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
-        - LLVM_VERSION=4.0
+        - MESON_OPTIONS="-Ddri-drivers=[] -Dgallium-drivers=[]"
+        - LLVM_VERSION=5.0
         - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
       addons:
         apt:
           sources:
-            - llvm-toolchain-trusty-3.9
+            - llvm-toolchain-trusty-5.0
           packages:
             # LLVM packaging is broken and misses these dependencies
             - libedit-dev
             # From sources above
-            - llvm-3.9-dev
+            - llvm-5.0-dev
             # Common
             - xz-utils
             - libexpat1-dev

@@ -53,7 +55,7 @@ matrix:
     - env:
         - LABEL="meson loaders/classic DRI"
         - BUILD=meson
-        - MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
+        - MESON_OPTIONS="-Dvulkan-drivers=[] -Dgallium-drivers=[]"
       addons:
         apt:
           packages:

@@ -92,12 +94,10 @@ matrix:
         - BUILD=make
         - MAKEFLAGS="-j4"
         - MAKE_CHECK_COMMAND="true"
-        - LLVM_VERSION=3.9
+        - LLVM_VERSION=5.0
         - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
         - OVERRIDE_CC="gcc-4.8"
         - OVERRIDE_CXX="g++-4.8"
-        # New binutils linker is required for llvm-3.9
-        - OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
         - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
         - DRI_DRIVERS=""
         - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

@@ -107,13 +107,41 @@ matrix:
       addons:
         apt:
           sources:
-            - llvm-toolchain-trusty-3.9
+            - llvm-toolchain-trusty-5.0
           packages:
-            - binutils-2.26
             # LLVM packaging is broken and misses these dependencies
             - libedit-dev
             # From sources above
-            - llvm-3.9-dev
+            - llvm-5.0-dev
+            # Common
+            - xz-utils
+            - x11proto-xf86vidmode-dev
+            - libexpat1-dev
+            - libx11-xcb-dev
+            - libelf-dev
+            - libunwind8-dev
+    - env:
+        - LABEL="make Gallium Drivers RadeonSI"
+        - BUILD=make
+        - MAKEFLAGS="-j4"
+        - MAKE_CHECK_COMMAND="true"
+        - LLVM_VERSION=5.0
+        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
+        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
+        - DRI_DRIVERS=""
+        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
+        - GALLIUM_DRIVERS="radeonsi"
+        - VULKAN_DRIVERS=""
+        - LIBUNWIND_FLAGS="--enable-libunwind"
+      addons:
+        apt:
+          sources:
+            - llvm-toolchain-trusty-5.0
+          packages:
+            # LLVM packaging is broken and misses these dependencies
+            - libedit-dev
+            # From sources above
+            - llvm-5.0-dev
             # Common
             - xz-utils
             - x11proto-xf86vidmode-dev

				@@ -133,7 +161,7 @@ matrix:

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"

				        - GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,freedreno,svga,swrast,v3d,vc4,virgl,etnaviv,imx"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				@@ -168,7 +196,7 @@ matrix:

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600,radeonsi"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				@@ -205,7 +233,7 @@ matrix:

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600,radeonsi"

				        - GALLIUM_DRIVERS="r600"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				@@ -264,6 +292,39 @@ matrix:

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				    - env:

				        # NOTE: Analogous to SWR above, building Clover is quite slow.

				        - LABEL="make Gallium ST Clover LLVM-6.0"

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="true"

				        - LLVM_VERSION=6.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				        - GALLIUM_DRIVERS="r600,radeonsi"

				        - VULKAN_DRIVERS=""

				        - LIBUNWIND_FLAGS="--enable-libunwind"

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-6.0

				            # llvm-6 depends on gcc-4.9 which is not in main repo

				            - ubuntu-toolchain-r-test

				          packages:

				            - libclc-dev

				            # From sources above

				            - llvm-6.0-dev

				            - clang-6.0

				            - libclang-6.0-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				            - libexpat1-dev

				            - libx11-xcb-dev

				            - libelf-dev

				            - libunwind8-dev

				    - env:

				        - LABEL="make Gallium ST Other"

				        - BUILD=make

				@@ -305,10 +366,8 @@ matrix:

				        - BUILD=make

				        - MAKEFLAGS="-j4"

				        - MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"

				        - LLVM_VERSION=3.9

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        # New binutils linker is required for llvm-3.9

				        - OVERRIDE_PATH=/usr/lib/binutils-2.26/bin

				        - DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"

				        - DRI_DRIVERS=""

				        - GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"

				@@ -318,13 +377,12 @@ matrix:

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-3.9

				            - llvm-toolchain-trusty-5.0

				          packages:

				            - binutils-2.26

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - llvm-3.9-dev

				            - llvm-5.0-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				@@ -342,7 +400,6 @@ matrix:

				      addons:

				        apt:

				          packages:

				            - scons

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				@@ -361,7 +418,6 @@ matrix:

				      addons:

				        apt:

				          packages:

				            - scons

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            - llvm-3.3-dev

				@@ -376,7 +432,7 @@ matrix:

				        - BUILD=scons

				        - SCONSFLAGS="-j4"

				        - SCONS_TARGET="swr=1"

				        - LLVM_VERSION=3.9

				        - LLVM_VERSION=5.0

				        - LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				        # Keep it symmetrical to the make build. There's no actual SWR, yet.

				        - SCONS_CHECK_COMMAND="true"

				@@ -385,13 +441,12 @@ matrix:

				      addons:

				        apt:

				          sources:

				            - llvm-toolchain-trusty-3.9

				            - llvm-toolchain-trusty-5.0

				          packages:

				            - scons

				            # LLVM packaging is broken and misses these dependencies

				            - libedit-dev

				            # From sources above

				            - llvm-3.9-dev

				            - llvm-5.0-dev

				            # Common

				            - xz-utils

				            - x11proto-xf86vidmode-dev

				@@ -405,6 +460,11 @@ matrix:

				        - MAKE_CHECK_COMMAND="make check"

				        - DRI_LOADERS="--with-platforms=x11 --disable-egl"

				      os: osx

				    - env:

				        - LABEL="macOS meson"

				        - BUILD=meson

				        - MESON_OPTIONS="-Degl=false"

				      os: osx

				before_install:

				  - |

				@@ -439,6 +499,11 @@ install:

				      pip3 install --user "meson<0.45.0";

				    fi

				  # Install a more modern scons from pip.

				  - if test "x$BUILD" = xscons; then

				      pip2 install --user "scons>=2.4";

				    fi

				  # Since libdrm gets updated in configure.ac regularly, try to pick up the

				  # latest version from there.

				  - for line in `grep "^LIBDRM.*_REQUIRED=" configure.ac`; do

				@@ -482,6 +547,14 @@ install:

				      tar -jxvf $LIBDRM_VERSION.tar.bz2

				      (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)

				      wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2

				      tar -jxvf $RANDRPROTO_VERSION.tar.bz2

				      (cd $RANDRPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2

				      tar -jxvf $LIBXRANDR_VERSION.tar.bz2

				      (cd $LIBXRANDR_VERSION && ./configure --prefix=$HOME/prefix && make install)

				      wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				      tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				      (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				@@ -513,13 +586,34 @@ install:

				           "#ifndef _LINUX_MEMFD_H" \

				           "#define _LINUX_MEMFD_H" \

				           "" \

				           "#define __NR_memfd_create 319" \

				           "#define SYS_memfd_create __NR_memfd_create" \

				           "" \

				           "#define MFD_CLOEXEC             0x0001U" \

				           "#define MFD_ALLOW_SEALING       0x0002U" \

				           "" \

				           "#endif /* _LINUX_MEMFD_H */" > linux/memfd.h

				      # Generate this header, including the missing SYS_memfd_create

				      # macro, which is not provided by the header in the Travis

				      # instance

				      mkdir -p sys

				      printf "%s\n" \

				           "#ifndef _SYSCALL_H" \

				           "#define _SYSCALL_H      1" \

				           "" \

				           "#include <asm/unistd.h>" \

				           "" \

				           "#ifndef _LIBC" \

				           "# include <bits/syscall.h>" \

				           "#endif" \

				           "" \

				           "#ifndef __NR_memfd_create" \

				           "# define __NR_memfd_create 319 /* Taken from <asm/unistd_64.h> */" \

				           "#endif" \

				           "" \

				           "#ifndef SYS_memfd_create" \

				           "# define SYS_memfd_create __NR_memfd_create" \

				           "#endif" \

				           "" \

				           "#endif" > sys/syscall.h

				    fi

				script:

				@@ -530,7 +624,9 @@ script:

				      export CFLAGS="$CFLAGS -isystem`pwd`";

				      ./autogen.sh --enable-debug

				      mkdir build &&

				      cd build &&

				      ../autogen.sh --enable-debug

				        $LIBUNWIND_FLAGS

				        $DRI_LOADERS

				        --with-dri-drivers=$DRI_DRIVERS
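Note on the `.travis.yml` install step above: its comment says the libdrm version is picked up from `configure.ac` by grepping for `^LIBDRM.*_REQUIRED=`. The following is a minimal Python sketch of that idea, not the CI script's actual logic. The function name `latest_libdrm_required` is hypothetical, and choosing the highest listed version is an assumption about what "latest" means here. The sample values are the new ones from the `configure.ac` hunk in this comparison.

```python
import re

# Matches lines such as "LIBDRM_AMDGPU_REQUIRED=2.4.91" and captures the version.
_LIBDRM_RE = re.compile(r'^LIBDRM\w*_REQUIRED=(\d+(?:\.\d+)*)\s*$', re.MULTILINE)


def latest_libdrm_required(configure_ac_text):
    """Return the highest LIBDRM*_REQUIRED version found in the given text.

    Hypothetical helper: assumes dotted, purely numeric version strings.
    """
    versions = _LIBDRM_RE.findall(configure_ac_text)
    if not versions:
        raise ValueError("no LIBDRM*_REQUIRED lines found")
    # Compare numerically per component so that 2.4.100 would sort above 2.4.92.
    return max(versions, key=lambda v: tuple(int(part) for part in v.split('.')))


SAMPLE = """\
LIBDRM_REQUIRED=2.4.75
LIBDRM_RADEON_REQUIRED=2.4.71
LIBDRM_AMDGPU_REQUIRED=2.4.91
LIBDRM_INTEL_REQUIRED=2.4.75
LIBDRM_NVVIEUX_REQUIRED=2.4.66
LIBDRM_NOUVEAU_REQUIRED=2.4.66
LIBDRM_FREEDRENO_REQUIRED=2.4.92
LIBDRM_ETNAVIV_REQUIRED=2.4.89
LIBDRM_VC4_REQUIRED=2.4.89
"""

print(latest_libdrm_required(SAMPLE))
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "2.4.9" would sort above "2.4.89".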

									
										3

Android.common.mk
									
												
				@@ -52,6 +52,7 @@ LOCAL_CFLAGS += \

					-DHAVE___BUILTIN_EXPECT \

					-DHAVE___BUILTIN_FFS \

					-DHAVE___BUILTIN_FFSLL \

					-DHAVE_DLFCN_H \

					-DHAVE_FUNC_ATTRIBUTE_FLATTEN \

					-DHAVE_FUNC_ATTRIBUTE_UNUSED \

					-DHAVE_FUNC_ATTRIBUTE_FORMAT \

				@@ -70,8 +71,10 @@ LOCAL_CFLAGS += \

					-DHAVE_DLADDR \

					-DHAVE_DL_ITERATE_PHDR \

					-DHAVE_LINUX_FUTEX_H \

					-DHAVE_ENDIAN_H \

					-DHAVE_ZLIB \

					-DMAJOR_IN_SYSMACROS \

					-DVK_USE_PLATFORM_ANDROID_KHR \

					-fvisibility=hidden \

					-Wno-sign-compare

									
										10

Makefile.am
									
												
				@@ -45,7 +45,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-libunwind \

					--with-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \

					--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv,imx \

					--with-vulkan-drivers=intel,radeon

				ACLOCAL_AMFLAGS = -I m4

				@@ -64,7 +64,8 @@ EXTRA_DIST = \

					meson_options.txt \

					bin/meson.build \

					include/meson.build \

					bin/install_megadrivers.py

					bin/install_megadrivers.py \

					bin/meson_get_version.py

				noinst_HEADERS = \

					include/c99_alloca.h \

				@@ -75,12 +76,15 @@ noinst_HEADERS = \

					include/drm-uapi/drm_fourcc.h \

					include/drm-uapi/drm_mode.h \

					include/drm-uapi/i915_drm.h \

					include/drm-uapi/tegra_drm.h \

					include/drm-uapi/v3d_drm.h \

					include/drm-uapi/vc4_drm.h \

					include/D3D9 \

					include/GL/wglext.h \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids

					include/pci_ids \

					include/vulkan

				# We list some directories in EXTRA_DIST, but don't actually want to include

				# the .gitignore files in the tarball.

									
										79

README.rst
									
										Normal file
									
												
				@@ -0,0 +1,79 @@

				`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library

				======================================================

				Source

				------

				This repository lives at https://gitlab.freedesktop.org/mesa/mesa.

				Other repositories are likely forks, and code found there is not supported.

				Build status

				------------

				Travis:

				.. image:: https://travis-ci.org/mesa3d/mesa.svg?branch=master

				    :target: https://travis-ci.org/mesa3d/mesa

				Appveyor:

				.. image:: https://img.shields.io/appveyor/ci/mesa3d/mesa.svg

				    :target: https://ci.appveyor.com/project/mesa3d/mesa

				Coverity:

				.. image:: https://scan.coverity.com/projects/139/badge.svg?flat=1

				    :target: https://scan.coverity.com/projects/mesa

				Build & install

				---------------

				You can find more information in our documentation (`docs/install.html

				<https://mesa3d.org/install.html>`_), but the recommended way is to use

				Meson (`docs/meson.html <https://mesa3d.org/meson.html>`_):

				.. code-block:: sh

				  $ mkdir build

				  $ cd build

				  $ meson ..

				  $ sudo ninja install

				Support

				-------

				Many Mesa devs hang out on IRC; if you're not sure which channel is

				appropriate, you should ask your question on `Freenode's #dri-devel

				<irc://chat.freenode.net#dri-devel>`_, someone will redirect you if

				necessary.

				Remember that not everyone is in the same timezone as you, so it might

				take a while before someone qualified sees your question.

				To figure out who you're talking to, or which nick to ping for your

				question, check out `Who's Who on IRC

				<https://dri.freedesktop.org/wiki/WhosWho/>`_.

				The next best option is to ask your question in an email to the

				mailing lists: `mesa-dev\@lists.freedesktop.org

				<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_

				Bug reports

				-----------

				If you think something isn't working properly, please file a bug report

				(`docs/bugs.html <https://mesa3d.org/bugs.html>`_).

				Contributing

				------------

				Contributions are welcome, and step-by-step instructions can be found in our

				documentation (`docs/submittingpatches.html

				<https://mesa3d.org/submittingpatches.html>`_).

				Note that Mesa uses email mailing-lists for patch submission, review and

				discussions.

1

REVIEWERS


@@ -116,6 +116,7 @@ MESON BUILD
 R: Dylan Baker <dylan@pnwbakers.com>
 R: Eric Engestrom <eric@engestrom.ch>
 F: */meson.build
 F: meson.build
 F: meson_options.txt
 ANDROID EGL SUPPORT

									
										6

SConstruct
									
												
				@@ -27,6 +27,12 @@ import SCons.Util

				import common

				#######################################################################

				# Minimal scons version

				EnsureSConsVersion(2, 4)

				#######################################################################

				# Configuration options

2

VERSION


@@ -1 +1 @@
 .0.0-rc5
 .2.0-devel

									
										10

appveyor.yml
									
												
				@@ -35,13 +35,13 @@ clone_depth: 100

				cache:

				- win_flex_bison-2.5.9.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				- llvm-5.0.1-msvc2015-mtd.7z

				os: Visual Studio 2013

				os: Visual Studio 2015

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				  LLVM_ARCHIVE: llvm-5.0.1-msvc2015-mtd.7z

				install:

				# Check pip

				@@ -69,10 +69,10 @@ install:

				- set LLVM=%CD%\llvm

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1

				after_build:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.0 llvm=1 check

				# It's possible to setup notification here, as described in

6

bin/.cherry-ignore


@@ -1,6 +0,0 @@
 # fixes: The following commits were applied without the "cherry-picked from" tag
 cd9ee4caffee853700bdcd75b92eedc0e7b automake: anv: ship anv_extensions_gen.py in the tarball
 ac4437b20b87c7285b89466f05b51518ae616873 automake: small cleanup after the meson.build inclusion
 # stable: The KHX extension is disabled all together in the stable branches.
 ffe395cba0f7b3c1f1c41062f4376eae3a188b5 radv: Don't expose VK_KHX_multiview on android.

									
										2

bin/bugzilla_mesa.sh
									
												
				@@ -23,7 +23,7 @@ echo "<ul>"

				echo ""

				# extract fdo urls from commit log

				git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				while read url

				do

					id=$(echo $url | cut -d'=' -f2)

									
										4

bin/get-fixes-pick-list.sh
									
												
				@@ -16,7 +16,7 @@ latest_branchpoint=`git merge-base origin/master HEAD`

				git log --reverse --pretty=%H $latest_branchpoint > already_landed

				# ... and the ones cherry-picked.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

				git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//'  > already_picked

				@@ -38,7 +38,7 @@ do

					# Place every "fixes:" tag on its own line and join with the next word

					# on its line or a later one.

					fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`

					fixes=`git show --pretty=medium -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`

					# For each one try to extract the tag

					fixes_count=`echo "$fixes" | wc -l`
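Note on the `fixes:` extraction in `bin/get-fixes-pick-list.sh` above: the pipeline deletes newlines, puts each case-insensitive `fixes:` tag on its own line, and keeps only the alphanumeric token that follows. A rough Python equivalent is sketched below for illustration. The function name `extract_fixes_tags` is hypothetical, and dropping empty tokens is a simplification that the shell pipeline itself does not perform.

```python
import re


def extract_fixes_tags(commit_text):
    """Return the alphanumeric tokens that follow each 'fixes:' tag.

    Hypothetical helper mirroring the tr/sed/grep pipeline in spirit only.
    """
    # Like `tr -d "\n"`: join all lines so a tag split across lines still matches.
    flattened = commit_text.replace("\n", "")
    # Like the sed expressions: case-insensitive tag, optional whitespace,
    # then the run of [a-zA-Z0-9] characters that should be the commit id.
    tokens = re.findall(r'fixes:\s*([a-zA-Z0-9]*)', flattened, flags=re.IGNORECASE)
    # Simplification: skip tags that are not followed by any token.
    return [token for token in tokens if token]


MESSAGE = (
    "mesa: replace binary constants with hexadecimal constants\n"
    "\n"
    'Fixes: 38ab39f650 ("mesa: add ASTC 2D LDR decoder")\n'
    "Signed-off-by: Andres Gomez <agomez@igalia.com>\n"
)

print(extract_fixes_tags(MESSAGE))
```

The sample message reuses the `Fixes:` line from the first commit listed at the top of this comparison.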

									
										2

bin/get-pick-list.sh
									
												
				@@ -12,7 +12,7 @@

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

				git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
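Note on the pick-list scripts above: the added `--pretty=medium` presumably pins the log layout so that the `grep` and `sed` stages do not depend on a user's configured log format; that motivation is an assumption, since the diff does not state it. The extraction itself can be sketched in Python as follows. The function name `already_picked` is hypothetical, and unlike the `sed` expressions this sketch only accepts hexadecimal commit ids.

```python
import re

# A "(cherry picked from commit <sha>)" trailer, optionally indented as in
# the medium log format, with the sha captured.
_CHERRY_RE = re.compile(
    r'^[ \t]*\(cherry picked from commit[ \t]*([0-9a-fA-F]+)\)',
    re.MULTILINE,
)


def already_picked(log_text):
    """Return the commit ids named in 'cherry picked from commit' trailers."""
    return _CHERRY_RE.findall(log_text)


LOG = (
    "commit 1111111111111111111111111111111111111111\n"
    "Author: Example Author <author@example.invalid>\n"
    "\n"
    "    automake: small cleanup after the meson.build inclusion\n"
    "\n"
    "    (cherry picked from commit ac4437b20b87c7285b89466f05b51518ae616873)\n"
)

print(already_picked(LOG))
```

The author line and the outer commit id in `LOG` are made-up placeholders; only the cherry-picked id comes from the `.cherry-ignore` hunk above.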

									
										22

bin/install_megadrivers.py
									
												
				@@ -1,6 +1,6 @@

				#!/usr/bin/env python

				# encoding=utf-8

				# Copyright © 2017 Intel Corporation

				# Copyright © 2017-2018 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				@@ -35,26 +35,30 @@ def main():

				    parser.add_argument('drivers', nargs='+')

				    args = parser.parse_args()

				    to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)

				    if os.path.isabs(args.libdir):

				        to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])

				    else:

				        to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)

				    master = os.path.join(to, os.path.basename(args.megadriver))

				    if not os.path.exists(to):

				        os.makedirs(to)

				    shutil.copy(args.megadriver, master)

				    for each in args.drivers:

				        driver = os.path.join(to, each)

				    for driver in args.drivers:

				        abs_driver = os.path.join(to, driver)

				        if os.path.exists(driver):

				            os.unlink(driver)

				        print('installing {} to {}'.format(args.megadriver, driver))

				        os.link(master, driver)

				        if os.path.exists(abs_driver):

				            os.unlink(abs_driver)

				        print('installing {} to {}'.format(args.megadriver, abs_driver))

				        os.link(master, abs_driver)

				        try:

				            ret = os.getcwd()

				            os.chdir(to)

				            name, ext = os.path.splitext(each)

				            name, ext = os.path.splitext(driver)

				            while ext != '.so':

				                if os.path.exists(name):

				                    os.unlink(name)
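Note on the `bin/install_megadrivers.py` hunk above: the new branch treats an absolute `libdir` as relative to `DESTDIR` (falling back to `/`), while a relative `libdir` is still joined onto `MESON_INSTALL_DESTDIR_PREFIX`. The sketch below isolates just that path logic so it can be tried without running an install. The function name `resolve_install_dir` and the explicit `environ` parameter are hypothetical, and the example paths assume a POSIX system.

```python
import os


def resolve_install_dir(libdir, environ):
    """Compute the directory the megadriver would be copied into.

    Hypothetical helper; `environ` stands in for os.environ so the
    behaviour can be checked with plain dictionaries.
    """
    if os.path.isabs(libdir):
        # Strip the leading separator so os.path.join does not discard DESTDIR.
        return os.path.join(environ.get('DESTDIR', '/'), libdir[1:])
    # Relative libdir: Meson's staged prefix already includes DESTDIR.
    return os.path.join(environ['MESON_INSTALL_DESTDIR_PREFIX'], libdir)


print(resolve_install_dir('/usr/lib/dri', {'DESTDIR': '/tmp/stage'}))
print(resolve_install_dir('/usr/lib/dri', {}))
print(resolve_install_dir('lib/dri', {'MESON_INSTALL_DESTDIR_PREFIX': '/tmp/stage/usr'}))
```

Slicing off the first character matters because `os.path.join` discards everything before an absolute component, so joining `DESTDIR` with `/usr/lib/dri` directly would ignore `DESTDIR`.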

292

configure.ac


@@ -74,24 +74,29 @@ AC_SUBST([OPENCL_VERSION])
 # in the first entry.
 LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.71
 LIBDRM_AMDGPU_REQUIRED=2.4.89
 LIBDRM_AMDGPU_REQUIRED=2.4.91
 LIBDRM_INTEL_REQUIRED=2.4.75
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.89
 LIBDRM_ETNAVIV_REQUIRED=2.4.82
 LIBDRM_FREEDRENO_REQUIRED=2.4.92
 LIBDRM_ETNAVIV_REQUIRED=2.4.89
 LIBDRM_VC4_REQUIRED=2.4.89
 dnl Versions for external dependencies
 DRI2PROTO_REQUIRED=2.8
 GLPROTO_REQUIRED=1.4.14
 LIBOMXIL_BELLAGIO_REQUIRED=0.0
 LIBVA_REQUIRED=0.38.0
 LIBOMXIL_TIZONIA_REQUIRED=0.10.0
 LIBVA_REQUIRED=0.39.0
 VDPAU_REQUIRED=1.1
 WAYLAND_REQUIRED=1.11
 WAYLAND_EGL_BACKEND_REQUIRED=3
 WAYLAND_PROTOCOLS_REQUIRED=1.8
 XCB_REQUIRED=1.9.3
 XCBDRI2_REQUIRED=1.8
 XCBDRI3_MODIFIERS_REQUIRED=1.13
 XCBGLX_REQUIRED=1.8.1
 XCBPRESENT_MODIFIERS_REQUIRED=1.13
 XDAMAGE_REQUIRED=1.1
 XSHMFENCE_REQUIRED=1.1
 XVMC_REQUIRED=1.0.6
@@ -103,9 +108,9 @@ dnl LLVM versions
 LLVM_REQUIRED_GALLIUM=3.3.0
 LLVM_REQUIRED_OPENCL=3.9.0
 LLVM_REQUIRED_R600=3.9.0
 LLVM_REQUIRED_RADEONSI=3.9.0
 LLVM_REQUIRED_RADV=3.9.0
 LLVM_REQUIRED_SWR=3.9.0
 LLVM_REQUIRED_RADEONSI=5.0.0
 LLVM_REQUIRED_RADV=5.0.0
 LLVM_REQUIRED_SWR=5.0.0
 dnl Check for progs
 AC_PROG_CPP
@@ -116,6 +121,8 @@ dnl other CC/CXX flags related help
 AC_ARG_VAR([CXX11_CXXFLAGS], [Compiler flag to enable C++11 support (only needed if not
                               enabled by default and different  from -std=c++11)])
 AM_PROG_CC_C_O
 AC_PROG_GREP
 AC_PROG_NM
 AM_PROG_AS
 AX_CHECK_GNU_MAKE
 AC_CHECK_PROGS([PYTHON2], [python2.7 python2 python])
@@ -295,7 +302,10 @@ AX_CHECK_COMPILE_FLAG([-Wall],                                 [CFLAGS="$CFLAGS
 AX_CHECK_COMPILE_FLAG([-Werror=implicit-function-declaration], [CFLAGS="$CFLAGS -Werror=implicit-function-declaration"])
 AX_CHECK_COMPILE_FLAG([-Werror=missing-prototypes],            [CFLAGS="$CFLAGS -Werror=missing-prototypes"])
 AX_CHECK_COMPILE_FLAG([-Wmissing-prototypes],                  [CFLAGS="$CFLAGS -Wmissing-prototypes"])
 dnl Dylan Baker: gcc and clang always accept -Wno-*, hence check for the original warning, then set the no-* flag
 AX_CHECK_COMPILE_FLAG([-Wmissing-field-initializers],          [CFLAGS="$CFLAGS -Wno-missing-field-initializers"])
 AX_CHECK_COMPILE_FLAG([-fno-math-errno],                       [CFLAGS="$CFLAGS -fno-math-errno"])
 AX_CHECK_COMPILE_FLAG([-fno-trapping-math],                    [CFLAGS="$CFLAGS -fno-trapping-math"])
 AX_CHECK_COMPILE_FLAG([-fvisibility=hidden],                   [VISIBILITY_CFLAGS="-fvisibility=hidden"])
@@ -307,6 +317,7 @@ AX_CHECK_COMPILE_FLAG([-Wall],                                 [CXXFLAGS="$CXXFL
 AX_CHECK_COMPILE_FLAG([-fno-math-errno],                       [CXXFLAGS="$CXXFLAGS -fno-math-errno"])
 AX_CHECK_COMPILE_FLAG([-fno-trapping-math],                    [CXXFLAGS="$CXXFLAGS -fno-trapping-math"])
 AX_CHECK_COMPILE_FLAG([-fvisibility=hidden],                   [VISIBILITY_CXXFLAGS="-fvisibility=hidden"])
 AX_CHECK_COMPILE_FLAG([-Wmissing-field-initializers],          [CXXFLAGS="$CXXFLAGS -Wno-missing-field-initializers"])
 AC_LANG_POP([C++])
 # Flags to help ensure that certain portions of the code -- and only those
@@ -429,28 +440,41 @@ fi
 AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
 AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
 dnl Check for new-style atomic builtins
 AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
 dnl Check for new-style atomic builtins. We first check without linking to
 dnl -latomic.
 AC_MSG_CHECKING(whether __atomic_load_n is supported)
 AC_LINK_IFELSE([AC_LANG_SOURCE([[
 #include <stdint.h>
 int main() {
     int n;
     return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
 }]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
     DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
     dnl On some platforms, new-style atomics need a helper library
     AC_MSG_CHECKING(whether -latomic is needed)
     AC_LINK_IFELSE([AC_LANG_SOURCE([[
     #include <stdint.h>
     uint64_t v;
     int main() {
         return (int)__atomic_load_n(&v, __ATOMIC_ACQUIRE);
     }]])], GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=no, GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC=yes)
     AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC)
     if test "x$GCC_ATOMIC_BUILTINS_NEED_LIBATOMIC" = xyes; then
         LIBATOMIC_LIBS="-latomic"
     fi
     struct {
         uint64_t *v;
     } x;
     return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
            (int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
 }]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes, GCC_ATOMIC_BUILTINS_SUPPORTED=no)
 dnl If that didn't work, we try linking with -latomic, which is needed on some
 dnl platforms.
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" != xyes; then
    save_LDFLAGS=$LDFLAGS
    LDFLAGS="$LDFLAGS -latomic"
    AC_LINK_IFELSE([AC_LANG_SOURCE([[
    #include <stdint.h>
    int main() {
         struct {
             uint64_t *v;
         } x;
         return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE) &
                (int)__atomic_add_fetch(x.v, (uint64_t)1, __ATOMIC_ACQ_REL);
    }]])], GCC_ATOMIC_BUILTINS_SUPPORTED=yes LIBATOMIC_LIBS="-latomic",
           GCC_ATOMIC_BUILTINS_SUPPORTED=no)
    LDFLAGS=$save_LDFLAGS
 fi
 AC_MSG_RESULT($GCC_ATOMIC_BUILTINS_SUPPORTED)
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = xyes; then
     DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
 fi
 AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
 AC_SUBST([LIBATOMIC_LIBS])
 dnl Check if host supports 64-bit atomics
@@ -743,21 +767,6 @@ esac
 AC_SUBST([LIB_EXT])
 dnl
 dnl potentially-infringing-but-nobody-knows-for-sure stuff
 dnl
 AC_ARG_ENABLE([texture-float],
     [AS_HELP_STRING([--enable-texture-float],
         [enable floating-point textures and renderbuffers @<:@default=disabled@:>@])],
     [enable_texture_float="$enableval"],
     [enable_texture_float=no]
 )
 if test "x$enable_texture_float" = xyes; then
     AC_MSG_WARN([Floating-point textures enabled.])
     AC_MSG_WARN([Please consult docs/patents.txt with your lawyer before building Mesa.])
     DEFINES="$DEFINES -DTEXTURE_FLOAT_ENABLED"
 fi
 dnl
 dnl Arch/platform-specific settings
 dnl
@@ -862,6 +871,8 @@ fi
 AC_HEADER_MAJOR
 AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
 AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
 AC_CHECK_HEADERS([endian.h])
 AC_CHECK_HEADER([dlfcn.h], [DEFINES="$DEFINES -DHAVE_DLFCN_H"])
 AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
 AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
 AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
@@ -1311,14 +1322,19 @@ AC_ARG_ENABLE([vdpau],
    [enable_vdpau=auto])
 AC_ARG_ENABLE([omx],
    [AS_HELP_STRING([--enable-omx],
          [DEPRECATED: Use --enable-omx-bellagio instead @<:@default=auto@:>@])],
    [AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio instead.])],
          [DEPRECATED: Use --enable-omx-bellagio or --enable-omx-tizonia instead @<:@default=auto@:>@])],
    [AC_MSG_ERROR([--enable-omx is deprecated. Use --enable-omx-bellagio or --enable-omx-tizonia instead.])],
    [])
 AC_ARG_ENABLE([omx-bellagio],
    [AS_HELP_STRING([--enable-omx-bellagio],
          [enable OpenMAX Bellagio library @<:@default=disabled@:>@])],
    [enable_omx_bellagio="$enableval"],
    [enable_omx_bellagio=no])
 AC_ARG_ENABLE([omx-tizonia],
    [AS_HELP_STRING([--enable-omx-tizonia],
          [enable OpenMAX Tizonia library @<:@default=disabled@:>@])],
    [enable_omx_tizonia="$enableval"],
    [enable_omx_tizonia=no])
 AC_ARG_ENABLE([va],
    [AS_HELP_STRING([--enable-va],
          [enable va library @<:@default=auto@:>@])],
@@ -1350,7 +1366,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
 AC_ARG_WITH([gallium-drivers],
     [AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
         [comma delimited Gallium drivers list, e.g.
         "i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,vc4,vc5,virgl,etnaviv,imx"
         "i915,nouveau,r300,r600,radeonsi,freedreno,pl111,svga,swrast,swr,tegra,v3d,vc4,virgl,etnaviv,imx"
         @<:@default=r300,r600,svga,swrast@:>@])],
     [with_gallium_drivers="$withval"],
     [with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -1370,11 +1386,17 @@ if test "x$enable_opengl" = xno -a \
         "x$enable_xvmc" = xno -a \
         "x$enable_vdpau" = xno -a \
         "x$enable_omx_bellagio" = xno -a \
         "x$enable_omx_tizonia" = xno -a \
         "x$enable_va" = xno -a \
         "x$enable_opencl" = xno; then
     AC_MSG_ERROR([at least one API should be enabled])
 fi
 if test "x$enable_omx_bellagio" = xyes -a \
         "x$enable_omx_tizonia" = xyes; then
    AC_MSG_ERROR([Can't enable both bellagio and tizonia at same time])
 fi
 # Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x
 if test "x$enable_opengl" = xno -a \
         "x$enable_gles1" = xyes; then
@@ -1547,6 +1569,7 @@ AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = x
 AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
 AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes )
 AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows )
 AM_CONDITIONAL(HAVE_XLEASE, test "x$have_xlease" = xyes )
 AC_ARG_ENABLE([shared-glapi],
     [AS_HELP_STRING([--enable-shared-glapi],
@@ -1770,19 +1793,6 @@ if test "x$with_platforms" = xauto; then
     with_platforms=$with_egl_platforms
 fi
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
         WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
         WAYLAND_SCANNER='')
 if test "x$WAYLAND_SCANNER" = x; then
     AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
 fi
 PKG_CHECK_EXISTS([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED], [have_wayland_protocols=yes], [have_wayland_protocols=no])
 if test "x$have_wayland_protocols" = xyes; then
     ac_wayland_protocols_pkgdatadir=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
 fi
 AC_SUBST(WAYLAND_PROTOCOLS_DATADIR, $ac_wayland_protocols_pkgdatadir)
 # Do per platform setups and checks
 platforms=`IFS=', '; echo $with_platforms`
 for plat in $platforms; do
@@ -1791,13 +1801,22 @@ for plat in $platforms; do
         PKG_CHECK_MODULES([WAYLAND_CLIENT], [wayland-client >= $WAYLAND_REQUIRED])
         PKG_CHECK_MODULES([WAYLAND_SERVER], [wayland-server >= $WAYLAND_REQUIRED])
         PKG_CHECK_MODULES([WAYLAND_PROTOCOLS], [wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED])
         if test "x$enable_egl" = xyes; then
           PKG_CHECK_MODULES([WAYLAND_EGL], [wayland-egl-backend >= $WAYLAND_EGL_BACKEND_REQUIRED])
         fi
         WAYLAND_PROTOCOLS_DATADIR=`$PKG_CONFIG --variable=pkgdatadir wayland-protocols`
         PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
                           WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
                           WAYLAND_SCANNER='')
         if test "x$WAYLAND_SCANNER" = x; then
             AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
         fi
         if test "x$WAYLAND_SCANNER" = "x:"; then
                 AC_MSG_ERROR([wayland-scanner is needed to compile the wayland platform])
         fi
         if test "x$have_wayland_protocols" = xno; then
                 AC_MSG_ERROR([wayland-protocols >= $WAYLAND_PROTOCOLS_REQUIRED is needed to compile the wayland platform])
         fi
         DEFINES="$DEFINES -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED"
         ;;
@@ -1818,6 +1837,9 @@ for plat in $platforms; do
     android)
         PKG_CHECK_MODULES([ANDROID], [cutils hardware sync])
         if test -n "$with_gallium_drivers"; then
             PKG_CHECK_MODULES([BACKTRACE], [backtrace])
         fi
         DEFINES="$DEFINES -DHAVE_ANDROID_PLATFORM"
         ;;
@@ -1832,6 +1854,7 @@ for plat in $platforms; do
         ;;
     esac
 done
 AC_SUBST([WAYLAND_PROTOCOLS_DATADIR])
 if test "x$enable_glx" != xno; then
     if ! echo "$platforms" | grep -q 'x11'; then
@@ -1844,6 +1867,26 @@ if test x"$enable_dri3" = xyes; then
     dri3_modules="x11-xcb xcb >= $XCB_REQUIRED xcb-dri3 xcb-xfixes xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
     PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
     dri3_modifier_modules="xcb-dri3 >= $XCBDRI3_MODIFIERS_REQUIRED xcb-present >= $XCBPRESENT_MODIFIERS_REQUIRED"
     PKG_CHECK_MODULES([XCB_DRI3_MODIFIERS], [$dri3_modifier_modules], [have_dri3_modifiers=yes], [have_dri3_modifiers=no])
     if test "x$have_dri3_modifiers" == xyes; then
         DEFINES="$DEFINES -DHAVE_DRI3_MODIFIERS"
     fi
 fi
 if echo "$platforms" | grep -q 'x11' && echo "$platforms" | grep -q 'drm'; then
     have_xlease=yes
 else
     have_xlease=no
 fi
 if test x"$have_xlease" = xyes; then
     randr_modules="x11-xcb xcb-randr"
     PKG_CHECK_MODULES([XCB_RANDR], [$randr_modules])
     xlib_randr_modules="xrandr"
     PKG_CHECK_MODULES([XLIB_RANDR], [$xlib_randr_modules])
 fi
 AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$platforms" | grep -q 'x11')
@@ -1852,6 +1895,25 @@ AM_CONDITIONAL(HAVE_PLATFORM_DRM, echo "$platforms" | grep -q 'drm')
 AM_CONDITIONAL(HAVE_PLATFORM_SURFACELESS, echo "$platforms" | grep -q 'surfaceless')
 AM_CONDITIONAL(HAVE_PLATFORM_ANDROID, echo "$platforms" | grep -q 'android')
 AC_ARG_ENABLE(xlib-lease,
     [AS_HELP_STRING([--enable-xlib-lease]
                     [enable VK_acquire_xlib_display using X leases])],
     [enable_xlib_lease=$enableval], [enable_xlib_lease=auto])
 case "x$enable_xlib_lease" in
 xyes)
     ;;
 xno)
     ;;
 *)
     if echo "$platforms" | grep -q 'x11' && echo "$platforms" | grep -q 'drm'; then
         enable_xlib_lease=yes
     else
         enable_xlib_lease=no
     fi
 esac
 AM_CONDITIONAL(HAVE_XLIB_LEASE, test "x$enable_xlib_lease" = xyes)
 dnl
 dnl More DRI setup
 dnl
@@ -2070,6 +2132,9 @@ if test -n "$with_vulkan_drivers"; then
             PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
             radeon_llvm_check $LLVM_REQUIRED_RADV "radv"
             require_x11_dri3 "radv"
             if test "x$acv_mako_found" = xno; then
                 AC_MSG_ERROR([Python mako module v$PYTHON_MAKO_REQUIRED or higher not found])
             fi
             HAVE_RADEON_VULKAN=yes
             ;;
         *)
@@ -2187,13 +2252,13 @@ else
     have_vdpau_platform=no
 fi
 if echo $platforms | grep -q "x11\|drm"; then
 if echo $platforms | egrep -q "x11|drm"; then
     have_omx_platform=yes
 else
     have_omx_platform=no
 fi
 if echo $platforms | grep -q "x11\|drm\|wayland"; then
 if echo $platforms | egrep -q "x11|drm|wayland"; then
     have_va_platform=yes
 else
     have_va_platform=no
@@ -2215,6 +2280,10 @@ if test -n "$with_gallium_drivers" -a "x$with_gallium_drivers" != xswrast; then
         PKG_CHECK_EXISTS([libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED], [enable_omx_bellagio=yes], [enable_omx_bellagio=no])
     fi
     if test "x$enable_omx_tizonia" = xauto -a "x$have_omx_platform" = xyes; then
        PKG_CHECK_EXISTS([libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED], [enable_omx_tizonia=yes], [enable_omx_tizonia=no])
     fi
     if test "x$enable_va" = xauto -a "x$have_va_platform" = xyes; then
         PKG_CHECK_EXISTS([libva >= $LIBVA_REQUIRED], [enable_va=yes], [enable_va=no])
     fi
@@ -2224,6 +2293,7 @@ if test "x$enable_dri" = xyes -o \
         "x$enable_xvmc" = xyes -o \
         "x$enable_vdpau" = xyes -o \
         "x$enable_omx_bellagio" = xyes -o \
         "x$enable_omx_tizonia" = xyes -o \
         "x$enable_va" = xyes; then
     need_gallium_vl=yes
 fi
@@ -2232,6 +2302,7 @@ AM_CONDITIONAL(NEED_GALLIUM_VL, test "x$need_gallium_vl" = xyes)
 if test "x$enable_xvmc" = xyes -o \
         "x$enable_vdpau" = xyes -o \
         "x$enable_omx_bellagio" = xyes -o \
         "x$enable_omx_tizonia" = xyes -o \
         "x$enable_va" = xyes; then
     if echo $platforms | grep -q "x11"; then
         PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
@@ -2265,9 +2336,27 @@ if test "x$enable_omx_bellagio" = xyes; then
     fi
     PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
     gallium_st="$gallium_st omx_bellagio"
     AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
 else
     AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
 fi
 AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
 if test "x$enable_omx_tizonia" = xyes; then
     if test "x$have_omx_platform" != xyes; then
         AC_MSG_ERROR([OMX requires at least one of the x11 or drm platforms])
     fi
     PKG_CHECK_MODULES([OMX_TIZONIA],
                       [libtizonia >= $LIBOMXIL_TIZONIA_REQUIRED
                        tizilheaders >= $LIBOMXIL_TIZONIA_REQUIRED
                        libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
     gallium_st="$gallium_st omx_tizonia"
     AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL])
 else
     AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
 fi
 AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
 if test "x$enable_va" = xyes; then
     if test "x$have_va_platform" != xyes; then
         AC_MSG_ERROR([VA requires at least one of the x11 drm or wayland platforms])
@@ -2441,6 +2530,15 @@ AC_ARG_WITH([omx-bellagio-libdir],
                                    $PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libomxil-bellagio`])
 AC_SUBST([OMX_BELLAGIO_LIB_INSTALL_DIR])
 dnl Directory for OMX_TIZONIA libs
 AC_ARG_WITH([omx-tizonia-libdir],
     [AS_HELP_STRING([--with-omx-tizonia-libdir=DIR],
         [directory for the OMX_TIZONIA libraries])],
     [OMX_TIZONIA_LIB_INSTALL_DIR="$withval"],
     [OMX_TIZONIA_LIB_INSTALL_DIR=`$PKG_CONFIG --define-variable=libdir=\$libdir --variable=pluginsdir libtizcore`])
 AC_SUBST([OMX_TIZONIA_LIB_INSTALL_DIR])
 dnl Directory for VA libs
 AC_ARG_WITH([va-libdir],
@@ -2571,14 +2669,6 @@ if test -n "$with_gallium_drivers"; then
             HAVE_GALLIUM_RADEONSI=yes
             PKG_CHECK_MODULES([RADEON], [libdrm >= $LIBDRM_RADEON_REQUIRED libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             PKG_CHECK_MODULES([AMDGPU], [libdrm >= $LIBDRM_AMDGPU_REQUIRED libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
             # Blacklist libdrm_amdgpu 2.4.90 because it causes a crash in older
             # radeonsi with pretty much any app.
             libdrm_version=`pkg-config libdrm_amdgpu --modversion`
             if test "x$libdrm_version" = x2.4.90; then
                 AC_MSG_ERROR([radeonsi can't use libdrm 2.4.90 due to a compatibility issue. Use a newer or older version.])
             fi
             require_libdrm "radeonsi"
             radeon_llvm_check $LLVM_REQUIRED_RADEONSI "radeonsi"
             if test "x$enable_egl" = xyes; then
@@ -2603,6 +2693,10 @@ if test -n "$with_gallium_drivers"; then
        ximx)
             HAVE_GALLIUM_IMX=yes
             ;;
         xtegra)
             HAVE_GALLIUM_TEGRA=yes
             require_libdrm "tegra"
             ;;
         xswrast)
             HAVE_GALLIUM_SOFTPIPE=yes
             if test "x$enable_llvm" = xyes; then
@@ -2670,20 +2764,20 @@ if test -n "$with_gallium_drivers"; then
             ;;
         xvc4)
             HAVE_GALLIUM_VC4=yes
             require_libdrm "vc4"
             PKG_CHECK_MODULES([VC4], [libdrm >= $LIBDRM_VC4_REQUIRED])
             PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
                               [USE_VC4_SIMULATOR=yes;
                                DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"],
                               [USE_VC4_SIMULATOR=no])
             ;;
         xvc5)
             HAVE_GALLIUM_VC5=yes
         xv3d)
             HAVE_GALLIUM_V3D=yes
             PKG_CHECK_MODULES([VC5_SIMULATOR], [v3dv3],
                               [USE_VC5_SIMULATOR=yes;
                                DEFINES="$DEFINES -DUSE_VC5_SIMULATOR"],
                               [AC_MSG_ERROR([vc5 requires the simulator])])
             PKG_CHECK_MODULES([V3D_SIMULATOR], [v3dv3],
                               [USE_V3D_SIMULATOR=yes;
                                DEFINES="$DEFINES -DUSE_V3D_SIMULATOR"],
                               [USE_V3D_SIMULATOR=no])
             ;;
         xpl111)
             HAVE_GALLIUM_PL111=yes
@@ -2703,8 +2797,9 @@ if test -n "$with_gallium_drivers"; then
 fi
 # XXX: Keep in sync with LLVM_REQUIRED_SWR
 AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x3.9.0 -a \
                                               "x$LLVM_VERSION" != x3.9.1)
 AM_CONDITIONAL(SWR_INVALID_LLVM_VERSION, test "x$LLVM_VERSION" != x5.0.0 -a \
                                               "x$LLVM_VERSION" != x5.0.1 -a \
                                               "x$LLVM_VERSION" != x5.0.2)
 if test "x$enable_llvm" = "xyes" -a "$with_gallium_drivers"; then
     llvm_require_version $LLVM_REQUIRED_GALLIUM "gallium"
@@ -2727,6 +2822,9 @@ if test "x$HAVE_GALLIUM_VC4" != xyes -a "x$HAVE_GALLIUM_PL111" = xyes  ; then
     AC_MSG_ERROR([Building with pl111 requires vc4])
 fi
 if test "x$HAVE_GALLIUM_NOUVEAU" != xyes -a "x$HAVE_GALLIUM_TEGRA" = xyes; then
     AC_MSG_ERROR([Building with tegra requires nouveau])
 fi
 detect_old_buggy_llvm() {
     dnl llvm-config may not give the right answer when llvm is a built as a
@@ -2821,19 +2919,19 @@ AM_CONDITIONAL(HAVE_GALLIUM_PL111, test "x$HAVE_GALLIUM_PL111" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_R300, test "x$HAVE_GALLIUM_R300" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_R600, test "x$HAVE_GALLIUM_R600" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_RADEONSI, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_RADEON_COMMON, test "x$HAVE_GALLIUM_RADEONSI" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_ETNAVIV, test "x$HAVE_GALLIUM_ETNAVIV" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_IMX, test "x$HAVE_GALLIUM_IMX" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_TEGRA, test "x$HAVE_GALLIUM_TEGRA" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_V3D, test "x$HAVE_GALLIUM_V3D" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VC5, test "x$HAVE_GALLIUM_VC5" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno)
@@ -2861,7 +2959,7 @@ AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
                                       "x$HAVE_RADEON_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_BROADCOM_DRIVERS, test "x$HAVE_GALLIUM_VC4" = xyes -o \
                                       "x$HAVE_GALLIUM_VC5" = xyes)
                                       "x$HAVE_GALLIUM_V3D" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
                                         "x$HAVE_I965_DRI" = xyes)
@@ -2872,8 +2970,8 @@ AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_LLVM, test "x$enable_llvm" = xyes)
 AM_CONDITIONAL(USE_V3D_SIMULATOR, test x$USE_V3D_SIMULATOR = xyes)
 AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
 AM_CONDITIONAL(USE_VC5_SIMULATOR, test x$USE_VC5_SIMULATOR = xyes)
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
@@ -2909,7 +3007,7 @@ AC_SUBST([XVMC_MAJOR], 1)
 AC_SUBST([XVMC_MINOR], 0)
 AC_SUBST([XA_MAJOR], 2)
 AC_SUBST([XA_MINOR], 3)
 AC_SUBST([XA_MINOR], 4)
 AC_SUBST([XA_PATCH], 0)
 AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_PATCH")
@@ -2958,37 +3056,33 @@ AC_CONFIG_FILES([Makefile
                  src/egl/Makefile
                  src/egl/main/egl.pc
                  src/egl/wayland/wayland-drm/Makefile
                  src/egl/wayland/wayland-egl/Makefile
                  src/egl/wayland/wayland-egl/wayland-egl.pc
                  src/gallium/Makefile
                  src/gallium/auxiliary/Makefile
                  src/gallium/auxiliary/pipe-loader/Makefile
                  src/gallium/drivers/freedreno/Makefile
                  src/gallium/drivers/ddebug/Makefile
                  src/gallium/drivers/i915/Makefile
                  src/gallium/drivers/llvmpipe/Makefile
                  src/gallium/drivers/noop/Makefile
                  src/gallium/drivers/nouveau/Makefile
                  src/gallium/drivers/pl111/Makefile
                  src/gallium/drivers/r300/Makefile
                  src/gallium/drivers/r600/Makefile
                  src/gallium/drivers/radeon/Makefile
                  src/gallium/drivers/radeonsi/Makefile
                  src/gallium/drivers/rbug/Makefile
                  src/gallium/drivers/softpipe/Makefile
                  src/gallium/drivers/svga/Makefile
                  src/gallium/drivers/swr/Makefile
                  src/gallium/drivers/trace/Makefile
                  src/gallium/drivers/tegra/Makefile
                  src/gallium/drivers/etnaviv/Makefile
                  src/gallium/drivers/imx/Makefile
                  src/gallium/drivers/v3d/Makefile
                  src/gallium/drivers/vc4/Makefile
                  src/gallium/drivers/vc5/Makefile
                  src/gallium/drivers/virgl/Makefile
                  src/gallium/state_trackers/clover/Makefile
                  src/gallium/state_trackers/dri/Makefile
                  src/gallium/state_trackers/glx/xlib/Makefile
                  src/gallium/state_trackers/nine/Makefile
                  src/gallium/state_trackers/omx_bellagio/Makefile
                  src/gallium/state_trackers/omx/Makefile
                  src/gallium/state_trackers/omx/bellagio/Makefile
                  src/gallium/state_trackers/omx/tizonia/Makefile
                  src/gallium/state_trackers/osmesa/Makefile
                  src/gallium/state_trackers/va/Makefile
                  src/gallium/state_trackers/vdpau/Makefile
@@ -2999,7 +3093,7 @@ AC_CONFIG_FILES([Makefile
                  src/gallium/targets/d3dadapter9/d3d.pc
                  src/gallium/targets/dri/Makefile
                  src/gallium/targets/libgl-xlib/Makefile
                  src/gallium/targets/omx-bellagio/Makefile
                  src/gallium/targets/omx/Makefile
                  src/gallium/targets/opencl/Makefile
                  src/gallium/targets/opencl/mesa.icd
                  src/gallium/targets/osmesa/Makefile
@@ -3026,8 +3120,9 @@ AC_CONFIG_FILES([Makefile
                  src/gallium/winsys/sw/null/Makefile
                  src/gallium/winsys/sw/wrapper/Makefile
                  src/gallium/winsys/sw/xlib/Makefile
                  src/gallium/winsys/tegra/drm/Makefile
                  src/gallium/winsys/v3d/drm/Makefile
                  src/gallium/winsys/vc4/drm/Makefile
                  src/gallium/winsys/vc5/drm/Makefile
                  src/gallium/winsys/virgl/drm/Makefile
                  src/gallium/winsys/virgl/vtest/Makefile
                  src/gbm/Makefile
@@ -3062,7 +3157,9 @@ AC_CONFIG_FILES([Makefile
                  src/mesa/state_tracker/tests/Makefile
                  src/util/Makefile
                  src/util/tests/hash_table/Makefile
                  src/util/tests/set/Makefile
                  src/util/tests/string_buffer/Makefile
                  src/util/tests/vma/Makefile
                  src/util/xmlpool/Makefile
                  src/vulkan/Makefile])
@@ -3075,6 +3172,9 @@ $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_bl
 rm -f src/compiler/spirv/spirv_info.lo
 echo "# dummy" > src/compiler/spirv/.deps/spirv_info.Plo
 rm -f src/compiler/nir/.deps/nir_intrinsics.Plo
 echo "# dummy" > src/compiler/nir/.deps/nir_intrinsics.Plo
 dnl
 dnl Output some configuration info for the user
 dnl
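Review note on the `have_omx_platform` and `have_va_platform` hunks above: the change replaces `grep -q "x11\|drm"` with `egrep -q "x11|drm"`. The `\|` alternation in a basic regular expression is a GNU grep extension that has historically not been guaranteed by POSIX, whereas `|` is standard alternation in extended regular expressions. A minimal sketch of the same check using `grep -E` (the POSIX spelling of `egrep`) follows. The helper name `has_platform` and the sample platform lists are hypothetical and do not appear in configure.ac.

```shell
#!/bin/sh
# Hypothetical helper, not part of configure.ac: prints "yes" when the
# platform list in $1 matches the extended regular expression in $2,
# and "no" otherwise.
has_platform() {
    if printf '%s\n' "$1" | grep -E -q "$2"; then
        echo yes
    else
        echo no
    fi
}

# Sample platform lists, made up for illustration.
has_platform "x11 wayland" "x11|drm"          # prints "yes"
has_platform "surfaceless" "x11|drm|wayland"  # prints "no"
```

Using the extended syntax keeps the pattern readable and avoids relying on how a particular grep treats `\|`.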

									
docs/codingstyle.html (4 lines changed)
@@ -83,7 +83,7 @@ We try to quote the OpenGL specification where prudent:
    *     "An INVALID_OPERATION error is generated for any of the following
    *     conditions:
    *
    *     * <length> is zero."
    *     * &lt;length&gt; is zero."
    *
    * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec
    * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

@@ -94,7 +94,7 @@ Function comment example:
<pre>
   /**
    * Create and initialize a new buffer object.  Called via the
    * ctx->Driver.CreateObject() driver callback function.
    * ctx-&gt;Driver.CreateObject() driver callback function.
    * \param  name  integer name of the object
    * \param  type  one of GL_FOO, GL_BAR, etc.
    * \return  pointer to new object or NULL if error

									
docs/egl.html (1 line changed)
@@ -168,6 +168,7 @@ the X server directly using (XCB-)DRI2 protocol.</p>
<p>This driver can share DRI drivers with <code>libGL</code>.</p>
</dd>
</dl>
<h2>Packaging</h2>

									
docs/envvars.html (68 lines changed)
@@ -88,22 +88,40 @@ This is a work-around for that.
<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) and possibly the GL API type.
<ul>
<li> The format should be MAJOR.MINOR[FC]
<li> FC is an optional suffix that indicates a forward compatible context.
This is only valid for versions &gt;= 3.0.
<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile
<li> GL versions = 3.0, see below
<li> GL versions &gt; 3.0 are set to a Core profile
<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC
<ul>
<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1
<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0
<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0
<li> 3.1 - select a Core profile with GL version 3.1
<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1
</ul>
<li> Mesa may not really implement all the features of the given version.
(for developers only)
  <li>The format should be MAJOR.MINOR[FC|COMPAT]
  <li>FC is an optional suffix that indicates a forward compatible
      context. This is only valid for versions &gt;= 3.0.
  <li>COMPAT is an optional suffix that indicates a compatibility
      context or GL_ARB_compatibility support. This is only valid for
      versions &gt;= 3.1.
  <li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)
      profile
  <li>GL versions = 3.1, depending on the driver, it may or may not
      have the ARB_compatibility extension enabled.
  <li>GL versions &gt;= 3.2 are set to a Core profile
  <li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,
      X.YCOMPAT.
  <ul>
    <li>2.1 - select a compatibility (non-Core) profile with GL
        version 2.1.
    <li>3.0 - select a compatibility (non-Core) profile with GL
        version 3.0.
    <li>3.0FC - select a Core+Forward Compatible profile with GL
        version 3.0.
    <li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled
        per the driver default.
    <li>3.1FC - select GL version 3.1 with forward compatibility and
        GL_ARB_compatibility disabled.
    <li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility
        enabled.
    <li>X.Y - override GL version to X.Y without changing the profile.
    <li>X.YFC - select a Core+Forward Compatible profile with GL
        version X.Y.
    <li>X.YCOMPAT - select a Compatibility profile with GL version
        X.Y.
  </ul>
  <li>Mesa may not really implement all the features of the given
      version. (for developers only)
</ul>
<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by
glGetString(GL_VERSION) for OpenGL ES.

@@ -135,6 +153,16 @@ home directory.
<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>
<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>
<li>MESA_VK_VERSION_OVERRIDE - changes the Vulkan physical device version
    as returned in VkPhysicalDeviceProperties::apiVersion.
  <ul>
    <li>The format should be MAJOR.MINOR[.PATCH]</li>
    <li>This will not let you force a version higher than the driver's
        instance versionas advertised by vkEnumerateInstanceVersion</li>
    <li>This can be very useful for debugging but some features may not be
        implemented correctly. (For developers only)</li>
  </ul>
</li>
</ul>

@@ -241,7 +269,7 @@ Mesa EGL supports different sets of environment variables.  See the
    Especially useful to toggle hud at specific points of application and
    disable for unencumbered viewing the rest of the time. For example, set
    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
    Use kill -10 <pid> to toggle the hud as desired.
    Use kill -10 &lt;pid&gt; to toggle the hud as desired.
<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed
    hud values into files.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for

@@ -313,6 +341,12 @@ such as the OpenGL program's name and command line arguments.
<li>See the driver code for other, lesser-used variables.
</ul>
<h3>WGL environment variables</h3>
<ul>
<li>WGL_SWAP_INTERVAL - to set a swap interval, equivalent to calling
wglSwapIntervalEXT() in an application.  If this environment variable
is set, application calls to wglSwapIntervalEXT() will have no effect.
</ul>
<h3>VA-API state tracker environment variables</h3>
<ul>
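
Review note on the MESA_GL_VERSION_OVERRIDE text in docs/envvars.html above: the documented format is MAJOR.MINOR[FC|COMPAT], with FC valid only for versions >= 3.0 and COMPAT only for versions >= 3.1. The sketch below checks a string against those documented rules. It is an illustration only and is not Mesa's parser; the function name `check_gl_override` and its output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical validator, NOT Mesa's implementation: checks a value against
# the documented MAJOR.MINOR[FC|COMPAT] format and the documented minimum
# version for each suffix (FC: >= 3.0, COMPAT: >= 3.1).
check_gl_override() {
    value=$1
    # Split off the optional suffix.
    case $value in
        *COMPAT) suffix=COMPAT; version=${value%COMPAT} ;;
        *FC)     suffix=FC;     version=${value%FC} ;;
        *)       suffix=;       version=$value ;;
    esac
    # The remainder must be exactly MAJOR.MINOR, decimal digits only.
    case $version in
        *.*.*|.*|*.|*[!0-9.]*) echo invalid; return 1 ;;
        *.*) ;;
        *) echo invalid; return 1 ;;
    esac
    major=${version%%.*}
    minor=${version#*.}
    # Enforce the documented minimum version for each suffix.
    case $suffix in
        FC)
            [ "$major" -ge 3 ] || { echo invalid; return 1; } ;;
        COMPAT)
            [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 1 ]; } ||
                { echo invalid; return 1; } ;;
    esac
    echo "valid major=$major minor=$minor suffix=${suffix:-none}"
}

check_gl_override 3.1COMPAT      # prints "valid major=3 minor=1 suffix=COMPAT"
check_gl_override 2.1FC || true  # prints "invalid": FC needs version >= 3.0
```

The check only covers the syntax and suffix rules stated in the documentation; which profile a driver actually exposes for a given value is decided by Mesa and the driver.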

docs/favicon.ico: binary file, not shown (13 KiB)

docs/favicon.png: binary file, not shown (2.9 KiB)

docs/features.txt (171 lines changed)

@@ -24,16 +24,19 @@ not started
 # OpenGL Core and Compatibility context support
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
 version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 Some drivers do not support the Compatibility profile or the
 ARB_compatibility extensions.  If an application does not request a
 specific version without the forward-compatiblity flag, such drivers
 will be limited to OpenGL 3.0.  If an application requests OpenGL 3.1,
 it will get a context that may or may not have the ARB_compatibility
 extension enabled.  Some of the later GL features are exposed in the 3.0
 context as extensions.
 Feature                                                 Status
 ------------------------------------------------------- ------------------------
 GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   glBindFragDataLocation, glGetFragDataLocation         DONE
   GL_NV_conditional_render (Conditional rendering)      DONE ()
@@ -60,12 +63,12 @@ GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (freedreno (*), llvmpipe (*), softpipe (*), swr (*))
   Multisample anti-aliasing                             DONE (freedreno/a5xx, freedreno (*), llvmpipe (*), softpipe (*), swr (*))
 (*) freedreno, llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 (*) freedreno (a2xx-a4xx), llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   Forward compatible context support/deprecations       DONE ()
   GL_ARB_draw_instanced (Instanced drawing)             DONE ()
@@ -78,7 +81,7 @@ GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
   GL_EXT_texture_snorm (Signed normalized textures)     DONE ()
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
@@ -87,13 +90,13 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (freedreno)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (freedreno)
   GL_ARB_texture_multisample (Multisample textures)     DONE ()
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (freedreno)
   GL_ARB_sync (Fence objects)                           DONE (freedreno)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, virgl
   GL_ARB_blend_func_extended                            DONE (freedreno/a3xx, swr)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
@@ -107,18 +110,18 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (freedreno, swr)
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_draw_buffers_blend                             DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
   - Dynamically uniform UBO array indices               DONE (freedreno)
   - Implicit signed -> unsigned conversions             DONE
   - Fused multiply-add                                  DONE ()
   - Packing/bitfield/conversion functions               DONE (softpipe)
   - Enhanced textureGather                              DONE (softpipe)
   - Packing/bitfield/conversion functions               DONE (freedreno, softpipe)
   - Enhanced textureGather                              DONE (freedreno, softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE ()
   - Enhanced per-sample shading                         DONE ()
@@ -136,7 +139,7 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
   GL_ARB_transform_feedback3                            DONE (i965/gen7+, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_ES2_compatibility                              DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 or 1 binary formats)
@@ -146,7 +149,7 @@ GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_texture_compression_bptc                       DONE (freedreno, i965)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
@@ -162,7 +165,7 @@ GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
@@ -188,12 +191,12 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (freedreno, i965, nv50, r600, llvmpipe, swr)
   GL_ARB_clear_texture                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_buffer_storage                                 DONE (freedreno, i965, nv50, llvmpipe, swr)
   GL_ARB_clear_texture                                  DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, llvmpipe, softpipe)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
@@ -202,20 +205,20 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (freedreno, i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_texture_stencil8                               DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600, virgl)
   GL_ARB_clip_control                                   DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_cull_distance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600, virgl)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600, virgl)
   GL_ARB_texture_barrier                                DONE (freedreno, i965, nv50, r600)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
@@ -226,19 +229,19 @@ GL 4.6, GLSL 4.60
   GL_ARB_gl_spirv                                       in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, radeonsi)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (i965, nvc0, radeonsi)
   GL_ARB_spirv_extensions                               in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_texture_filter_anisotropic                     DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, radeonsi, llvmpipe, softpipe)
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
   GL_KHR_no_error                                       DONE (all drivers)
 (*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx, i965/gen7+, softpipe)
@@ -253,11 +256,11 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (freedreno, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965/gen7+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx, i965/gen7+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (freedreno, i965/gen7+,)
   GS5 Packing/bitfield/conversion functions             DONE (i965/gen6+)
   GS5 Enhanced textureGather                            DONE (freedreno, i965/gen7+)
   GS5 Packing/bitfield/conversion functions             DONE (freedreno/a5xx, i965/gen6+)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
@@ -266,28 +269,28 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                       DONE (i965, r600)
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+, radeonsi, virgl
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        DONE (i965, nvc0)
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965, nvc0, radeonsi)
   GL_KHR_robustness                                     DONE (i965, nvc0)
   GL_KHR_texture_compression_astc_ldr                   DONE (freedreno, i965/gen9+)
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0, radeonsi)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (All drivers that support GLES 3.1)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_texture_buffer                                 DONE (freedreno, i965, nvc0)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
@@ -296,17 +299,17 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_ARB_bindless_texture                               DONE (nvc0, radeonsi)
   GL_ARB_cl_event                                       not started
   GL_ARB_compute_variable_group_size                    DONE (nvc0, radeonsi)
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+)
   GL_ARB_fragment_shader_interlock                      not started
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+, radeonsi, virgl)
   GL_ARB_fragment_shader_interlock                      DONE (i965)
   GL_ARB_gpu_shader_int64                               DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
   GL_ARB_parallel_shader_compile                        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_post_depth_coverage                            DONE (i965)
   GL_ARB_post_depth_coverage                            DONE (i965, nvc0)
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               not started
   GL_ARB_seamless_cubemap_per_texture                   DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
   GL_ARB_sample_locations                               DONE (nvc0)
   GL_ARB_seamless_cubemap_per_texture                   DONE (freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
   GL_ARB_shader_ballot                                  DONE (i965/gen8+, nvc0, radeonsi)
   GL_ARB_shader_clock                                   DONE (i965/gen7+, nv50, nvc0, r600, radeonsi)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
   GL_ARB_shader_viewport_layer_array                    DONE (i965/gen6+, nvc0, radeonsi)
   GL_ARB_sparse_buffer                                  DONE (radeonsi/CIK+)
   GL_ARB_sparse_texture                                 not started
@@ -316,15 +319,17 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_EXT_memory_object                                  DONE (radeonsi)
   GL_EXT_memory_object_fd                               DONE (radeonsi)
   GL_EXT_memory_object_win32                            not started
   GL_EXT_semaphore                                      not started
   GL_EXT_semaphore_fd                                   not started
   GL_EXT_semaphore                                      DONE (radeonsi)
   GL_EXT_semaphore_fd                                   DONE (radeonsi)
   GL_EXT_semaphore_win32                                not started
   GL_EXT_texture_norm16                                 DONE (i965, r600, radeonsi, nvc0)
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr                   DONE (i965/bxt)
   GL_KHR_texture_compression_astc_sliced_3d             DONE (i965/gen9+)
   GL_OES_depth_texture_cube_map                         DONE (all drivers that support GLSL 1.30+)
   GL_OES_EGL_image                                      DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       not started
   GL_OES_EGL_image_external                             DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       DONE (all drivers)
   GL_OES_required_internalformat                        DONE (all drivers)
   GL_OES_surfaceless_context                            DONE (all drivers)
   GL_OES_texture_compression_astc                       DONE (core only)
@@ -332,7 +337,7 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_OES_texture_float_linear                           DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float                             DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   not started - based on GL_ARB_texture_view
   GL_OES_texture_view                                   DONE (i965/gen8+)
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi)
   GLX_ARB_context_flush_control                         not started
   GLX_ARB_robustness_application_isolation              not started
@@ -349,39 +354,55 @@ we DO NOT WANT implementations of these extensions for Mesa.
 Vulkan 1.0 -- all DONE: anv, radv
 Khronos extensions that are not part of any Vulkan version:
 Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_16bit_storage                                  in progress (Alejandro)
   VK_KHR_android_surface                                not started
   VK_KHR_bind_memory2                                   DONE (anv, radv)
   VK_KHR_dedicated_allocation                           DONE (anv, radv)
   VK_KHR_descriptor_update_template                     DONE (anv, radv)
   VK_KHR_display                                        not started
   VK_KHR_display_swapchain                              not started
   VK_KHR_external_fence                                 not started
   VK_KHR_external_fence_capabilities                    not started
   VK_KHR_external_fence_fd                              not started
   VK_KHR_external_fence_win32                           not started
   VK_KHR_device_group                                   not started
   VK_KHR_device_group_creation                          not started
   VK_KHR_external_fence                                 DONE (anv, radv)
   VK_KHR_external_fence_capabilities                    DONE (anv, radv)
   VK_KHR_external_memory                                DONE (anv, radv)
   VK_KHR_external_memory_capabilities                   DONE (anv, radv)
   VK_KHR_external_memory_fd                             DONE (anv, radv)
   VK_KHR_external_memory_win32                          not started
   VK_KHR_external_semaphore                             DONE (radv)
   VK_KHR_external_semaphore_capabilities                DONE (radv)
   VK_KHR_external_semaphore_fd                          DONE (radv)
   VK_KHR_external_semaphore_win32                       not started
   VK_KHR_external_semaphore                             DONE (anv, radv)
   VK_KHR_external_semaphore_capabilities                DONE (anv, radv)
   VK_KHR_get_memory_requirements2                       DONE (anv, radv)
   VK_KHR_get_physical_device_properties2                DONE (anv, radv)
   VK_KHR_get_surface_capabilities2                      DONE (anv)
   VK_KHR_incremental_present                            DONE (anv, radv)
   VK_KHR_maintenance1                                   DONE (anv, radv)
   VK_KHR_maintenance2                                   DONE (anv, radv)
   VK_KHR_maintenance3                                   DONE (anv, radv)
   VK_KHR_multiview                                      DONE (anv, radv)
   VK_KHR_relaxed_block_layout                           DONE (anv, radv)
   VK_KHR_sampler_ycbcr_conversion                       DONE (anv)
   VK_KHR_shader_draw_parameters                         DONE (anv, radv)
   VK_KHR_storage_buffer_storage_class                   DONE (anv, radv)
   VK_KHR_variable_pointers                              DONE (anv, radv)
 Khronos extensions that are not part of any Vulkan version:
   VK_KHR_8bit_storage                                   DONE (anv)
   VK_KHR_android_surface                                not started
   VK_KHR_create_renderpass2                             DONE (anv, radv)
   VK_KHR_display                                        DONE (anv, radv)
   VK_KHR_display_swapchain                              DONE (anv, radv)
   VK_KHR_draw_indirect_count                            DONE (radv)
   VK_KHR_external_fence_fd                              DONE (anv, radv)
   VK_KHR_external_fence_win32                           not started
   VK_KHR_external_memory_fd                             DONE (anv, radv)
   VK_KHR_external_memory_win32                          not started
   VK_KHR_external_semaphore_fd                          DONE (anv, radv)
   VK_KHR_external_semaphore_win32                       not started
   VK_KHR_get_display_properties2                        DONE (anv, radv)
   VK_KHR_get_surface_capabilities2                      DONE (anv, radv)
   VK_KHR_image_format_list                              DONE (anv, radv)
   VK_KHR_incremental_present                            DONE (anv, radv)
   VK_KHR_mir_surface                                    not started
   VK_KHR_push_descriptor                                DONE (anv, radv)
   VK_KHR_sampler_mirror_clamp_to_edge                   DONE (anv, radv)
   VK_KHR_shader_draw_parameters                         DONE (anv, radv)
   VK_KHR_shared_presentable_image                       not started
   VK_KHR_storage_buffer_storage_class                   DONE (anv, radv)
   VK_KHR_surface                                        DONE (anv, radv)
   VK_KHR_swapchain                                      DONE (anv, radv)
   VK_KHR_variable_pointers                              DONE (anv, radv)
   VK_KHR_wayland_surface                                DONE (anv, radv)
   VK_KHR_win32_keyed_mutex                              not started
   VK_KHR_win32_surface                                  not started

									
										118

docs/index.html
									
												
				@@ -16,6 +16,124 @@

				<h1>News</h1>

				<h2>July 27, 2018</h2>

				<p>

				<a href="relnotes/18.1.5.html">Mesa 18.1.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 13, 2018</h2>

				<p>

				<a href="relnotes/18.1.4.html">Mesa 18.1.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 29, 2018</h2>

				<p>

				<a href="relnotes/18.1.3.html">Mesa 18.1.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 15, 2018</h2>

				<p>

				<a href="relnotes/18.1.2.html">Mesa 18.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 3, 2018</h2>

				<p>

				<a href="relnotes/18.0.5.html">Mesa 18.0.5</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 18.0.5 will be the final release in the

				18.0 series. Users of 18.0 are encouraged to migrate to the 18.1

				series in order to obtain future fixes.

				</p>

				<h2>June 1, 2018</h2>

				<p>

				<a href="relnotes/18.1.1.html">Mesa 18.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 18, 2018</h2>

				<p>

				<a href="relnotes/18.1.0.html">Mesa 18.1.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>May 17, 2018</h2>

				<p>

				<a href="relnotes/18.0.4.html">Mesa 18.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 7, 2018</h2>

				<p>

				<a href="relnotes/18.0.3.html">Mesa 18.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 28, 2018</h2>

				<p>

				<a href="relnotes/18.0.2.html">Mesa 18.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2018</h2>

				<p>

				<a href="relnotes/18.0.1.html">Mesa 18.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2018</h2>

				<p>

				<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 17.3.9 will be the final release in the

				17.3 series. Users of 17.3 are encouraged to migrate to the 18.0

				series in order to obtain future fixes.

				</p>

				<h2>April 03, 2018</h2>

				<p>

				<a href="relnotes/17.3.8.html">Mesa 17.3.8</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 27, 2018</h2>

				<p>

				<a href="relnotes/18.0.0.html">Mesa 18.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>March 21, 2018</h2>

				<p>

				<a href="relnotes/17.3.7.html">Mesa 17.3.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 26, 2018</h2>

				<p>

				<a href="relnotes/17.3.6.html">Mesa 17.3.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 19, 2018</h2>

				<p>

				<a href="relnotes/17.3.5.html">Mesa 17.3.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 15, 2018</h2>

				<p>

				<a href="relnotes/17.3.4.html">Mesa 17.3.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 18, 2018</h2>

				<p>

				<a href="relnotes/17.3.3.html">Mesa 17.3.3</a> is released.

									
										79

docs/meson.html
									
												
				@@ -18,16 +18,22 @@

				<h2 id="basic">1. Basic Usage</h2>

				<p><strong>The Meson build system for Mesa is still under active development,

				and should not be used in production environments.</strong></p>

				<p><strong>The Meson build system is generally considered stable and ready

				for production</strong></p>

				<p>The meson build is currently only tested on linux, and is known to not work

				on macOS, Windows, and haiku. This will be fixed.</p>

				<p>The meson build is tested on on Linux, macOS, Cygwin and Haiku, it should

				work on FreeBSD, DragonflyBSD, NetBSD, and OpenBSD.</p>

				<p><strong>Mesa requires Meson >= 0.44.1 to build.</strong>

				Some older versions of meson do not check that they are too old and will error

				out in odd ways.

				</p>

				<p>

				The meson program is used to configure the source directory and generates

				either a ninja build file or Visual Studio® build files. The latter must

				be enabled via the --backend switch, as ninja is the default backend on all

				be enabled via the <code>--backend</code> switch, as ninja is the default backend on all

				operating systems. Meson only supports out-of-tree builds, and must be passed a

				directory to put built and generated sources into. We'll call that directory

				"build" for examples.

				@@ -43,7 +49,7 @@ along with a build directory to view the selected options for. This will show

				your meson global arguments and project arguments, along with their defaults

				and your local settings.

				Moes does not currently support listing options before configure a build

				Meson does not currently support listing options before configure a build

				directory, but this feature is being discussed upstream.

				</p>

				@@ -54,13 +60,21 @@ directory, but this feature is being discussed upstream.

				<p>

				With additional arguments <code>meson configure</code> is used to change

				options on already configured build directory. All options passed to this

				command are in the form -D "command"="value".

				command are in the form <code>-D "command"="value"</code>.

				</p>

				<pre>

				    meson configure build/ -Dprefix=/tmp/install -Dglx=true

				</pre>

				<p>

				Note that options taking lists (such as <code>platforms</code>) are

				<a href="http://mesonbuild.com/Build-options.html#using-build-options">a bit

				more complicated</a>, but the simplest form compatible with Mesa options

				is to use a comma to separate values (<code>-D platforms=drm,wayland</code>)

				and brackets to represent an empty list (<code>-D platforms=[]</code>).

				</p>
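The list-option rule described above (comma-separated values, brackets for an empty list) can be sketched as a small helper. This is only an illustration: the function name `meson_define_args` is hypothetical and not part of Meson or Mesa, and the output uses the `-Dname=value` spelling from the `meson configure` example above.

```python
def meson_define_args(options):
    """Turn a dict of option names and values into -D arguments.

    Hypothetical helper, not part of Meson. Lists are joined with
    commas, an empty list becomes "[]", booleans become true/false.
    """
    args = []
    for name, value in options.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            text = ",".join(value) if value else "[]"
        else:
            text = str(value)
        args.append(f"-D{name}={text}")
    return args


if __name__ == "__main__":
    cmd = ["meson", "configure", "build/"] + meson_define_args(
        {"prefix": "/tmp/install", "glx": True, "platforms": ["drm", "wayland"]}
    )
    # Print the command line rather than running it.
    print(" ".join(cmd))
```

The helper only builds the argument list; actually running the resulting command is left to the caller.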

				<p>

				Once you've run the initial <code>meson</code> command successfully you can use

				your configured backend to build the project. With ninja, the -C option can be

				@@ -76,13 +90,14 @@ Without arguments, it will produce libGL.so and/or several other libraries

				depending on the options you have chosen. Later, if you want to rebuild for a

				different configuration, you should run <code>ninja clean</code> before

				changing the configuration, or create a new out of tree build directory for

				each configuration you want to build.

				http://mesonbuild.com/Using-multiple-build-directories.html

				each configuration you want to build

				<a href="http://mesonbuild.com/Using-multiple-build-directories.html">as

				recommended in the documentation</a>

				</p>

				<dl>

				<dt><code>Environment Variables</code></dt>

				<dd><p>Meson supports the standard CC and CXX envrionment variables for

				<dd><p>Meson supports the standard CC and CXX environment variables for

				changing the default compiler, and CFLAGS, CXXFLAGS, and LDFLAGS for setting

				options to the compiler and linker.

				@@ -93,9 +108,9 @@ the popular compilers, a complete list is available

				These arguments are consumed and stored by meson when it is initialized or

				re-initialized. Therefore passing them to meson configure will not do anything,

				and passing them to ninja will only do something if ninja decides to

				re-initialze meson, for example, if a meson.build file has been changed.

				re-initialize meson, for example, if a meson.build file has been changed.

				Changing these variables will not cause all targets to be rebuilt, so running

				ninja clean is recomended when changing CFLAGS or CXXFLAGS. meson will never

				ninja clean is recommended when changing CFLAGS or CXXFLAGS. Meson will never

				change compiler in a configured build directory.

				</p>

				@@ -107,27 +122,27 @@ change compiler in a configured build directory.

				    CFLAGS=-Wno-typedef-redefinition ninja -C build-clang

				</pre>

				<p>Meson also honors DESTDIR for installs</p>

				<p>Meson also honors <code>DESTDIR</code> for installs</p>

				</dd>

				<dt><code>LLVM</code></dt>

				<dd><p>Meson includes upstream logic to wrap llvm-config using it's standard

				dependncy interface. It will search $PATH (or %PATH% on windows) for

				dependency interface. It will search <code>$PATH</code> (or <code>%PATH%</code> on windows) for

				llvm-config, so using an LLVM from a non-standard path is as easy as

				<code>PATH=/path/with/llvm-config:$PATH meson build</code>.

				</p></dd>

				</dl>
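When the configure step is driven from a script, the same PATH trick can be applied by preparing the environment first. The sketch below assumes nothing beyond the behaviour described above; `env_with_llvm` is a hypothetical helper name, and `/path/with/llvm-config` is the placeholder directory from the text.

```python
import os


def env_with_llvm(llvm_bin_dir, base_env=None):
    """Return a copy of the environment with llvm_bin_dir first on PATH.

    Hypothetical helper: it mirrors `PATH=/path/with/llvm-config:$PATH`
    and leaves the original environment mapping untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    env["PATH"] = llvm_bin_dir + os.pathsep + old_path if old_path else llvm_bin_dir
    return env


if __name__ == "__main__":
    env = env_with_llvm("/path/with/llvm-config")
    print(env["PATH"])
    # To configure with it, one could call:
    #   subprocess.run(["meson", "build"], env=env, check=True)
```

Using `os.pathsep` keeps the separator correct on Windows as well, where the search path is `%PATH%` and entries are separated by semicolons.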

				<dl>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for configuring and

				building Mesa on Linux and *BSD. It is used to search for external libraries

				on the system. This environment variable is used to control the search

				path for <code>pkg-config</code>. For instance, setting

				<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for

				package metadata in <code>/usr/X11R6</code> before the standard

				directories.</p>

				building Mesa on Unix-like systems. It is used to search for external libraries

				on the system. This environment variable is used to control the search path for

				<code>pkg-config</code>. For instance, setting

				<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package

				metadata in <code>/usr/X11R6</code> before the standard directories.</p>

				</dd>

				</dl>

				@@ -136,7 +151,7 @@ One of the oddities of meson is that some options are different when passed to

				the <code>meson</code> than to <code>meson configure</code>. These options are

				passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson

				configure</code>. Mesa defined options are always passed as -Doption=foo.

				<p>

				</p>

				<p>For those coming from autotools be aware of the following:</p>

				@@ -145,24 +160,28 @@ configure</code>. Mesa defined options are always passed as -Doption=foo.

				<dd><p>This option will set the compiler debug/optimisation levels to aid

				debugging the Mesa libraries.</p>

				<p>Note that in meson this defaults to "debugoptimized", and  not setting it to

				"release" will yield non-optimal performance and binary size. Not using "debug"

				may interfer with debbugging as some code and validation will be optimized

				away.

				<p>Note that in meson this defaults to <code>debugoptimized</code>, and

				not setting it to <code>release</code> will yield non-optimal

				performance and binary size. Not using <code>debug</code> may interfere

				with debugging as some code and validation will be optimized away.

				</p>

				<p> For those wishing to pass their own -O option, use the "plain" buildtype,

				which cuases meson to inject no additional compiler arguments, only those in

				the C/CXXFLAGS and those that mesa itself defines.</p>

				<p> For those wishing to pass their own optimization flags, use the <code>plain</code>

				buildtype, which causes meson to inject no additional compiler arguments, only

				those in the C/CXXFLAGS and those that mesa itself defines.</p>

				</dd>

				</dl>

				<dl>

				<dt><code>-Db_ndebug</code></dt>

				<dd><p>This option controls assertions in meson projects. When set to false

				<dd><p>This option controls assertions in meson projects. When set to <code>false</code>

				(the default) assertions are enabled, when set to true they are disabled. This

				is unrelated to the <code>buildtype</code>; setting the latter to

				<code>release</code> will not turn off assertions.

				</p>

				</dd>

				</dl>

				</div>

				</body>

				</html>

31

docs/patents.txt


@@ -1,31 +0,0 @@
 ARB_texture_float:
     Silicon Graphics, Inc. owns US Patent #6,650,327, issued November 18,
     2003 [1].
     SGI believes this patent contains necessary IP for graphics systems
     implementing floating point rasterization and floating point
     framebuffer capabilities described in ARB_texture_float extension, and
     will discuss licensing on RAND terms, on an individual basis with
     companies wishing to use this IP in the context of conformant OpenGL
     implementations [2].
     The source code to implement ARB_texture_float extension is included
     and can be toggled on at compile time, for those who purchased a
     license from SGI, or are in a country where the patent does not apply,
     etc.
     The software is provided "as is", without warranty of any kind, express
     or implied, including but not limited to the warranties of
     merchantability, fitness for a particular purpose and noninfringement.
     In no event shall the authors or copyright holders be liable for any
     claim, damages or other liability, whether in an action of contract,
     tort or otherwise, arising from, out of or in connection with the
     software or the use or other dealings in the software.
     You should contact a lawyer or SGI's legal department if you want to
     enable this extension.
 [1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
 [2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

									
										2

docs/precompiled.html
									
				@@ -24,10 +24,12 @@ Some Linux distributions closely follow the latest Mesa releases. On others one

				has to use unofficial channels.

				<br>

				There are some general directions:

				<ul>

				<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>

				<li>Fedora - Corp: erp and che</li>

				<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>

				<li>Gentoo/Archlinux - officially provided/supported</li>

				</ul>

				</p>

				</div>

									
										66

docs/release-calendar.html
									
				@@ -39,67 +39,49 @@ if you'd like to nominate a patch in the next stable release.

				<th>Notes</th>

				</tr>

				<tr>

				<td rowspan="3">17.3</td>

				<td>2018-01-26</td>

				<td>17.3.4</td>

				<td>Emil Velikov</td>

				<td rowspan="3">18.1</td>

				<td>2018-08-10</td>

				<td>18.1.6</td>

				<td>Dylan Baker</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-02-09</td>

				<td>17.3.5</td>

				<td>Juan A. Suarez Romero</td>

				<td>2018-08-24</td>

				<td>18.1.7</td>

				<td>Dylan Baker</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-02-23</td>

				<td>17.3.6</td>

				<td>Juan A. Suarez Romero</td>

				<td>Final planned release for the 17.3 series</td>

				<td>2018-09-07</td>

				<td>18.1.8</td>

				<td>Dylan Baker</td>

				<td>Last planned 18.1.x release</td>

				</tr>

				<tr>

				<td rowspan="7">18.0</td>

				<td>2018-01-19</td>

				<td>18.0.0-rc1</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-01-26</td>

				<td>18.0.0-rc2</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-02-02</td>

				<td>18.0.0-rc3</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-02-09</td>

				<td>18.0.0-rc4</td>

				<td>Emil Velikov</td>

				<td>May be promoted to 18.0.0 final</td>

				</tr>

				<tr>

				<td>2018-02-23</td>

				<td>18.0.1</td>

				<td rowspan="4">18.2</td>

				<td>2018-08-01</td>

				<td>18.2.0rc1</td>

				<td>Andres Gomez</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-03-09</td>

				<td>18.0.2</td>

				<td>2018-08-08</td>

				<td>18.2.0rc2</td>

				<td>Andres Gomez</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-03-23</td>

				<td>18.0.3</td>

				<td>2018-08-15</td>

				<td>18.2.0rc3</td>

				<td>Andres Gomez</td>

				<td></td>

				</tr>

				<tr>

				<td>2018-08-22</td>

				<td>18.2.0rc4</td>

				<td>Andres Gomez</td>

				<td>Last planned RC/Final release</td>

				</tr>

				</table>

				</div>

									
										4

docs/releasing.html
									
				@@ -54,8 +54,8 @@ For example:

				<h1 id="schedule">Release schedule</h1>

				<p>

				Releases should happen on Fridays. Delays can occur although those should be keep

				to a minimum.

				Releases should happen on Wednesdays. Delays can occur although those

				should be kept to a minimum.

				<br>

				See our <a href="release-calendar.html" target="_parent">calendar</a> for the

				date and other details for individual releases.

									
										20

docs/relnotes.html
									
				@@ -21,7 +21,25 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/17.3.2.html">17.3.3 release notes</a>

				<li><a href="relnotes/18.1.5.html">18.1.5 release notes</a>

				<li><a href="relnotes/18.1.4.html">18.1.4 release notes</a>

				<li><a href="relnotes/18.1.3.html">18.1.3 release notes</a>

				<li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>

				<li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>

				<li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>

				<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>

				<li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>

				<li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>

				<li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>

				<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>

				<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>

				<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>

				<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>

				<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>

				<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>

				<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>

				<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>

				<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>

				<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>

				<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>

				<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>

									
										275

docs/relnotes/17.3.4.html
									
										Normal file
									
				@@ -0,0 +1,275 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.4 Release Notes / January 15, 2018</h1>

				<p>

				Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.

				</p>

				<p>

				Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84  mesa-17.3.4.tar.gz

				71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0  mesa-17.3.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>

				</ul>

				<p>Bas Nieuwenhuizen (10):</p>

				<ul>

				  <li>radv: Fix ordering issue in meta memory allocation failure path.</li>

				  <li>radv: Fix memory allocation failure path in compute resolve init.</li>

				  <li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>

				  <li>radv: Fix fragment resolve init memory allocation failure paths.</li>

				  <li>radv: Fix bufimage failure deallocation.</li>

				  <li>radv: Init variant entry with memset.</li>

				  <li>radv: Don't allow 3d or 1d depth/stencil textures.</li>

				  <li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>

				  <li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>

				  <li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>

				</ul>

				<p>Boyuan Zhang (2):</p>

				<ul>

				  <li>radeon/vcn: add and manage render picture list</li>

				  <li>radeon/uvd: add and manage render picture list</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>configure.ac: add missing llvm dependencies to .pc files</li>

				</ul>

				<p>Dave Airlie (10):</p>

				<ul>

				  <li>r600/sb: fix a bug emitting ar load from a constant.</li>

				  <li>ac/nir: account for view index in the user sgpr allocation.</li>

				  <li>radv: add fs_key meta format support to resolve passes.</li>

				  <li>radv: don't use hw resolve for integer image formats</li>

				  <li>radv: don't use hw resolves for r16g16 norm formats.</li>

				  <li>radv: move spi_baryc_cntl to pipeline</li>

				  <li>r600/sb: insert the else clause when we might depart from a loop</li>

				  <li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>

				  <li>radv/gfx9: fix block compression texture views. (v2)</li>

				  <li>virgl: also remove dimension on indirect.</li>

				</ul>

				<p>Eleni Maria Stea (1):</p>

				<ul>

				  <li>mesa: Fix function pointers initialization in status tracker</li>

				</ul>

				<p>Emil Velikov (18):</p>

				<ul>

				  <li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>

				  <li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>

				  <li>cherry-ignore: anv: add explicit 18.0 only nominations</li>

				  <li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>

				  <li>cherry-ignore: meson: multiple fixes</li>

				  <li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>

				  <li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>

				  <li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>

				  <li>cherry-ignore: add gen10 fixes</li>

				  <li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>

				  <li>cherry-ignore: add i965 shader cache fixes</li>

				  <li>cherry-ignore: nir: mark unused space in packed_tex_data</li>

				  <li>radv: Stop advertising VK_KHX_multiview</li>

				  <li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>

				  <li>configure.ac: correct driglx-direct help text</li>

				  <li>cherry-ignore: add meson fix</li>

				  <li>cherry-ignore: add a few more meson fixes</li>

				  <li>Update version to 17.3.4</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>radeon: remove left over dead code</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>

				</ul>

				<p>Grazvydas Ignotas (2):</p>

				<ul>

				  <li>st/va: release held locks in error paths</li>

				  <li>st/vdpau: release held lock in error path</li>

				</ul>

				<p>Igor Gnatenko (1):</p>

				<ul>

				  <li>link mesautil with pthreads</li>

				</ul>

				<p>Indrajit Das (4):</p>

				<ul>

				  <li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>

				  <li>radeon/uvd: update quantiser matrices only when requested</li>

				  <li>radeon/vcn: update quantiser matrices only when requested</li>

				  <li>st/va: clear pointers for mpeg2 quantiser matrices</li>

				</ul>

				<p>Jason Ekstrand (19):</p>

				<ul>

				  <li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>

				  <li>i965: Add more precise cache tracking helpers</li>

				  <li>i965/blorp: Add more destination flushing</li>

				  <li>i965: Track the depth and render caches separately</li>

				  <li>i965: Track format and aux usage in the render cache</li>

				  <li>Re-enable regular fast-clears (CCS_D) on gen9+</li>

				  <li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>

				  <li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>

				  <li>i965/miptree: Use the tiling from the modifier instead of the BO</li>

				  <li>i965/bufmgr: Add a create_from_prime_tiled function</li>

				  <li>i965: Set tiling on BOs imported with modifiers</li>

				  <li>i965/miptree: Take an aux_usage in prepare/finish_render</li>

				  <li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>

				  <li>i965/surface_state: Drop brw_aux_surface_disabled</li>

				  <li>intel/fs: Use the original destination region for int MUL lowering</li>

				  <li>anv/pipeline: Don't look at blend state unless we have an attachment</li>

				  <li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>

				  <li>anv: Stop advertising VK_KHX_multiview</li>

				  <li>i965: Call prepare_external after implicit window-system MSAA resolves</li>

				</ul>

				<p>Jon Turney (3):</p>

				<ul>

				  <li>configure: Default to gbm=no on osx</li>

				  <li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>

				  <li>glx/apple: locate dispatch table functions to wrap by name</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>svga: Prevent use after free.</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.3</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Bind null render targets for shadow sampling + color.</li>

				  <li>i965: Bump official kernel requirement to Linux v3.9.</li>

				</ul>

				<p>Lucas Stach (2):</p>

				<ul>

				  <li>etnaviv: dirty TS state when framebuffer has changed</li>

				  <li>renderonly: fix dumb BO allocation for non 32bpp formats</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't ignore pitch for imported textures</li>

				</ul>

				<p>Matthew Nicholls (2):</p>

				<ul>

				  <li>radv: restore previous stencil reference after depth-stencil clear</li>

				  <li>radv: remove predication on cache flushes</li>

				</ul>

				<p>Maxin B. John (1):</p>

				<ul>

				  <li>anv_icd.py: improve reproducible builds</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>r600: don't do stack workarounds for hemlock</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: create pipeline layout objects for all meta operations</li>

				</ul>

				<p>Samuel Thibault (1):</p>

				<ul>

				  <li>glx: fix non-dri build</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>ac: fix buffer overflow bug in 64bit SSBO loads</li>

				  <li>ac: fix visit_ssa_undef() for doubles</li>

				</ul>

				</div>

				</body>

				</html>

									
										66

docs/relnotes/17.3.5.html
									
										Normal file
									
				@@ -0,0 +1,66 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>

				<p>

				Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.

				</p>

				<p>

				Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788  mesa-17.3.5.tar.gz

				eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a  mesa-17.3.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.4</li>

				  <li>Update version to 17.3.5</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>

				</ul>

				</div>

				</body>

				</html>

									
										85

docs/relnotes/17.3.6.html
									
										Normal file
									
				@@ -0,0 +1,85 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.6 Release Notes / February 27, 2018</h1>

				<p>

				Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.

				</p>

				<p>

				Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188  mesa-17.3.6.tar.gz

				e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8  mesa-17.3.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.5</li>

				  <li>Update version to 17.3.6</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>i965/draw: Do resolves properly for textures used by TXF</li>

				  <li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>

				  <li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>

				  <li>i965: Stop disabling aux during texture preparation</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>i965: Don't try to disable render aux buffers for compute</li>

				</ul>

				</div>

				</body>

				</html>

									
										312

docs/relnotes/17.3.7.html
									
										Normal file
									
				@@ -0,0 +1,312 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>

				<p>

				Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.6 release.

				</p>

				<p>

				Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0  mesa-17.3.7.tar.gz

				0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e  mesa-17.3.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>

				</ul>

				<p>Andriy Khulap (1):</p>

				<ul>

				  <li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>

				</ul>

				<p>Bas Nieuwenhuizen (6):</p>

				<ul>

				  <li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>

				  <li>radeonsi: Export signalled sync file instead of -1.</li>

				  <li>radv: Implement WaitForFences with !waitAll.</li>

				  <li>radv: Implement waiting on non-submitted fences.</li>

				  <li>radv: Fix copying from 3D images starting at non-zero depth.</li>

				  <li>radv: Increase the number of dynamic uniform buffers.</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>glx: Properly handle cases where screen creation fails</li>

				</ul>

				<p>Daniel Stone (3):</p>

				<ul>

				  <li>i965: Fix bugs in intel_from_planar</li>

				  <li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>

				  <li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>

				</ul>

				<p>Dave Airlie (9):</p>

				<ul>

				  <li>r600: fix cubemap arrays</li>

				  <li>r600/sb/cayman: fix indirect ubo access on cayman</li>

				  <li>r600: fix xfb stream check.</li>

				  <li>ac/nir: to integer the args to bcsel.</li>

				  <li>r600/cayman: fix fragcood loading recip generation.</li>

				  <li>radv: don't support tc-compat on multisample d32s8 at all.</li>

				  <li>virgl: remap query types to hw support.</li>

				  <li>ac/nir: don't apply slice rounding on txf_ms</li>

				  <li>r600: implement callstack workaround for evergreen.</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>glapi/check_table: Remove 'extern "C"' block</li>

				  <li>glapi: remove APPLE extensions from test</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.6</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>

				  <li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>

				  <li>glsl/tests: Fix strict aliasing warning about int64/double.</li>

				  <li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>

				</ul>

				<p>Frank Binns (1):</p>

				<ul>

				  <li>egl/dri2: fix segfault when display initialisation fails</li>

				</ul>

				<p>George Kyriazis (1):</p>

				<ul>

				  <li>swr/rast: blend_epi32() should return Integer, not Float</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>

				</ul>

				<p>Gurchetan Singh (1):</p>

				<ul>

				  <li>mesa: don't clamp just based on ARB_viewport_array extension</li>

				</ul>

				<p>Iago Toral Quiroga (2):</p>

				<ul>

				  <li>i965/sbe: fix number of inputs for active components</li>

				  <li>i965/vec4: use a temp register to compute offsets for pull loads</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>radv: Really use correct HTILE expanded words.</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>intel/isl: Add an isl_color_value_is_zero helper</li>

				  <li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>

				  <li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>configure.ac: pthread-stubs not present on OpenBSD</li>

				</ul>

				<p>Jordan Justen (3):</p>

				<ul>

				  <li>i965: Create new program cache bo when clearing the program cache</li>

				  <li>program: Don't reset SamplersValidated when restoring from shader cache</li>

				  <li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>

				</ul>

				<p>Juan A. Suarez Romero (14):</p>

				<ul>

				  <li>cherry-ignore: Explicit 18.0 only nominations</li>

				  <li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>

				  <li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>

				  <li>cherry-ignore: anv: Be more careful about fast-clear colors</li>

				  <li>cherry-ignore: Add patches that has a specific version for 17.3</li>

				  <li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>

				  <li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>

				  <li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>

				  <li>cherry-ignore: include all Meson related fixes</li>

				  <li>cherry-ignore: ac/shader: fix vertex input with components.</li>

				  <li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>

				  <li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>

				  <li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>

				  <li>Update version to 17.3.7</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>radeonsi: align command buffer starting address to fix some Raven hangs</li>

				  <li>configure.ac: blacklist libdrm 2.4.90</li>

				</ul>

				<p>Michal Navratil (1):</p>

				<ul>

				  <li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>glsl/linker: fix bug when checking precision qualifier</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>ac/nir: use ordered float comparisons except for not equal</li>

				  <li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>

				</ul>

				<p>Stephan Gerhold (1):</p>

				<ul>

				  <li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>

				</ul>

				<p>Thomas Hellstrom (2):</p>

				<ul>

				  <li>svga: Fix a leftover debug hack</li>

				  <li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>swr/rast: fix MemoryBuffer build break for llvm-6</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>nir: fix interger divide by zero crash during constant folding</li>

				</ul>

				<p>Tobias Droste (1):</p>

				<ul>

				  <li>gallivm: Use new LLVM fast-math-flags API</li>

				</ul>

				<p>Vadym Shovkoplias (1):</p>

				<ul>

				  <li>mesa: add glsl version query (v4)</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>swr/rast: Fix macOS macro.</li>

				</ul>

				</div>

				</body>

				</html>

									
										147

docs/relnotes/17.3.8.html
									
										Normal file
									
												
				@@ -0,0 +1,147 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>

				<p>

				Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.

				</p>

				<p>

				Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
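<p>
Because the reported version depends on the driver, an application may want to
check the string it actually gets back. The following is a minimal sketch,
assuming a desktop GL_VERSION string that begins with "major.minor"; the helper
names and the sample strings are hypothetical and are not taken from Mesa.
</p>

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a desktop GL_VERSION string.

    Assumes the string starts with "<major>.<minor>", optionally followed
    by a release number and vendor-specific text.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string, required=(4, 5)):
    """True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


# Hypothetical sample strings, for illustration only.
core_string = "4.5 (Core Profile) Mesa 17.3.8"
compat_string = "3.0 Mesa 17.3.8"

print(supports_gl(core_string))
print(supports_gl(compat_string))
```

<p>
The sketch only covers desktop OpenGL; OpenGL ES version strings carry a
textual prefix and would need separate handling.
</p>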

				<h2>SHA256 checksums</h2>

				<pre>

				175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23  mesa-17.3.8.tar.gz

				8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1  mesa-17.3.8.tar.xz

				</pre>
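<p>
One way to check a downloaded tarball against the checksums above is sketched
below. The helper names are hypothetical, and the file path is assumed to be
wherever the tarball was saved locally.
</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with the published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

<p>
For example, <code>matches_checksum("mesa-17.3.8.tar.xz",
"8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1")</code>
should return True for an intact download of that file.
</p>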

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				</ul>

				<h2>Changes</h2>

				<p>Axel Davy (3):</p>

				<ul>

				  <li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>

				  <li>st/nine: Fixes warning about implicit conversion</li>

				  <li>st/nine: Fix non inversible matrix check</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>anv/pipeline: fail if TCS/TES compile fail</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: get correct offset into LDS for indexed vars.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Make swrast display_sync the correct queue</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson/configure: detect endian.h instead of trying to guess when it's available</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>

				  <li>i965/vec4: Fix null destination register in 3-source instructions</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965: Emit texture cache invalidates around blorp_copy</li>

				</ul>

				<p>Jordan Justen (2):</p>

				<ul>

				  <li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>

				  <li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>

				</ul>

				<p>Juan A. Suarez Romero (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.7</li>

				  <li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>

				  <li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>

				  <li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>

				  <li>cherry-ignore: docs: fix 18.0 release note version</li>

				  <li>Update version to 17.3.8</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move feedback command inside of destroy function</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>nir: fix per_vertex_output intrinsic</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>

				  <li>nir: fix crash in loop unroll corner case</li>

				</ul>

				</div>

				</body>

				</html>

									
										162

docs/relnotes/17.3.9.html
									
										Normal file
									
												
				@@ -0,0 +1,162 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>

				<p>

				Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.

				</p>

				<p>

				Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340  mesa-17.3.9.tar.gz

				c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479  mesa-17.3.9.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>dri_util: when overriding, always reset the core version</li>

				  <li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>

				</ul>

				<p>Axel Davy (2):</p>

				<ul>

				  <li>st/nine: Declare lighting consts for ff shaders</li>

				  <li>st/nine: Do not use scratch for face register</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>ac/nir: Add workaround for GFX9 buffer views.</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>st/dri: Initialise modifier to INVALID for DRI2</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>glsl: remove unreachable assert()</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>gbm: remove never-implemented function</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>mesa: Inherit texture view multi-sample information from the original texture images.</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>nir/vars_to_ssa: Remove copies from the correct set</li>

				  <li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>

				  <li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>

				  <li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.8</li>

				  <li>cherry-ignore: Explicit 18.0 only nominations</li>

				  <li>Update version to 17.3.9</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix number of planes for depth &amp; stencil</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix picking the method for resolve subpass</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965: Extend the negative 32-bit deltas to 64-bits</li>

				</ul>

				<p>Timothy Arceri (6):</p>

				<ul>

				  <li>gallium/pipebuffer: fix parenthesis location</li>

				  <li>glsl: always call do_lower_jumps() after loop unrolling</li>

				  <li>ac: add if/loop build helpers</li>

				  <li>radeonsi: make use of if/loop build helpers in ac</li>

				  <li>ac: make use of if/loop build helpers</li>

				  <li>mesa: free debug messages when destroying the debug state</li>

				</ul>

				<p>Xiong, James (1):</p>

				<ul>

				  <li>i965: return the fourcc saved in __DRIimage when possible</li>

				</ul>

				</div>

				</body>

				</html>

									
										76

docs/relnotes/17.4.0.html
									
											
				@@ -1,76 +0,0 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.4.0 Release Notes / TBD</h1>

				<p>

				Mesa 17.4.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 17.4.1.

				</p>

				<p>

				Mesa 17.4.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>

				<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>

				<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>

				<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>

				<li>GL_ARB_compute_shader on r600/evergreen+</li>

				<li>GL_ARB_cull_distance on r600/evergreen+</li>

				<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>

				<li>GL_ARB_bindless_texture on nvc0/kepler</li>

				<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>

				<li>Support 1 binary format for GL_ARB_get_program_binary on i965.

				    (For the 18.0 release, 0 formats continue to be supported in

				    compatibility profiles.)</li>

				<li>Cannonlake support on i965 and anv</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				TBD

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>

				</ul>

				</div>

				</body>

				</html>

									
										321

docs/relnotes/18.0.0.html
									
										Normal file
									
												
				@@ -0,0 +1,321 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.0 Release Notes / March 27, 2018</h1>

				<p>

				Mesa 18.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 18.0.1.

				</p>

				<p>

				Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				93c2d3504b2871ac2146603fb1270f341d36a39695e2950a469c5eac74f98457  mesa-18.0.0.tar.gz

				694e5c3d37717d23258c1f88bc134223c5d1aac70518d2f9134d6df3ee791eea  mesa-18.0.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>

				<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>

				<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>

				<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>

				<li>GL_ARB_compute_shader on r600/evergreen+</li>

				<li>GL_ARB_cull_distance on r600/evergreen+</li>

				<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>

				<li>GL_ARB_bindless_texture on nvc0/kepler</li>

				<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>

				<li>Support 1 binary format for GL_ARB_get_program_binary on i965.

				    (For the 18.0 release, 0 formats continue to be supported in

				    compatibility profiles.)</li>

				<li>Cannonlake support on i965 and anv</li>

				</ul>
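<p>
The disk shader cache item above depends on the value of an environment
variable. The following sketch models only what the note states: the i965
cache is used when MESA_GLSL_CACHE_DISABLE is "0" or "false". Treating an
unset variable or any other value as "cache not enabled" is an assumption of
this sketch, and the function name is hypothetical; Mesa's own parsing may
accept more spellings.
</p>

```python
import os


def i965_cache_requested(environ=None):
    """Return True if MESA_GLSL_CACHE_DISABLE is exactly "0" or "false".

    Assumption: unset or any other value means the cache is not requested.
    """
    env = os.environ if environ is None else environ
    value = env.get("MESA_GLSL_CACHE_DISABLE")
    return value in ("0", "false")


print(i965_cache_requested({"MESA_GLSL_CACHE_DISABLE": "0"}))
print(i965_cache_requested({}))
```

<p>
Passing a plain dictionary instead of the real environment keeps the check
easy to exercise without changing process state.
</p>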

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85564">Bug 85564</a> - Dead Island rendering issues</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101560">Bug 101560</a> - SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102264">Bug 102264</a> - Missing MESA_FORMAT_{B8G8R8A8,B8G8R8X8}_SRGB formats</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102354">Bug 102354</a> - Mesa 17.2 no longer can give SRGB-capable framebuffer on i965, even though Mesa 17.1.x does.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU hang in Valve games based on Source 1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102503">Bug 102503</a> - Report SRGB framebuffer to SuperTuxKart to workaround SuperTuxKart crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: ‘&gt;&gt;’ should be ‘&gt; &gt;’ within a nested template argument list</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102897">Bug 102897</a> - Separate bind points are not implemented correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103283">Bug 103283</a> - drm_get_device_name_for_fd is broken on FreeBSD</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103496">Bug 103496</a> - svga_screen.c:26:46: error: git_sha1.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103513">Bug 103513</a> - [build failure] radv_shader.c:683:2: error: format not a string literal and no format arguments [-Werror=format-security]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103653">Bug 103653</a> - Unreal segfault since gallium/u_threaded: avoid syncs for get_query_result</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103658">Bug 103658</a> - addrlib/gfx9/gfx9addrlib.cpp:727:50: error: expected expression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103784">Bug 103784</a> - [bisected] Egl changes breaks all of EGL</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103808">Bug 103808</a> - [radeonsi, bisected] World of Warcraft scribbling all over screen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103902">Bug 103902</a> - Portal 2 game  hangs at startup   with latest mesa dev</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103904">Bug 103904</a> - Source engine-based games won't hang at start without R600_DEBUG=vs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of ‘memfd_create’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103942">Bug 103942</a> - KHR-GL46.enhanced_layouts.varying* regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103955">Bug 103955</a> - Using array in structure results in wrong GLSL compilation output</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104005">Bug 104005</a> - [sklgt4e] GPU hangs in Car_Chase</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104141">Bug 104141</a> - include/c11/threads_posix.h:96: undefined reference to `pthread_once'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104183">Bug 104183</a> - mesa-17.3.0/src/broadcom/qpu/qpu_pack.c:171]: (error) Invalid memcmp() argument</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104199">Bug 104199</a> - [i965 bisected] BIO and EM Vision in &gt;Observer_ is broken since commit af2c320190f3c73180f1610c8df955a7fa2a4d09</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104213">Bug 104213</a> - NULL pointer access crashes on compiling Vulkan compute shaders after &quot;anv: Add support for the variablePointers feature&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104226">Bug 104226</a> - [bisected] Anvil accesses uninitialized memory while compiling shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104231">Bug 104231</a> - DispatchSanity_test.GL30 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104246">Bug 104246</a> - Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104271">Bug 104271</a> - i965: Timeout in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104331">Bug 104331</a> - [r600g] Ogre demo &quot;TutorialUAV01&quot; crash at r600_decompress_color_images</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104338">Bug 104338</a> - NULL pointer access crash on Sacha Willems' Vulkan raytracing demo after &quot;spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104359">Bug 104359</a> - Mesa freezes in &quot;vtn_cfg_walk_blocks&quot; with Sacha Willems' hdr, parallaxmapping and specializationconstants Vulkan demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104424">Bug 104424</a> - DOOM 2016 broken by spirv OpStore validation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104490">Bug 104490</a> - [radeonsi/290x] Dota2 fails to start (can't create opengl context)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104677">Bug 104677</a> - radv_generate_graphics_pipeline_key reads input rate from incorrect binding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104690">Bug 104690</a> - [G33] regression: piglit.spec.!opengl 1_4.draw-batch and gl-1_4-dlist-multidrawarrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104742">Bug 104742</a> - [swrast] piglit gl-1.4-dlist-multidrawarrays regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104746">Bug 104746</a> - [swrast] piglit attribs regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104749">Bug 104749</a> - rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to ‘llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>

				</ul>

				</div>

				</body>

				</html>

									
docs/relnotes/18.0.1.html
												
				@@ -0,0 +1,225 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.1 Release Notes / April 18, 2018</h1>

				<p>

				Mesa 18.0.1 is a bug fix release which fixes bugs found since the 18.0.0 release.

				</p>

				<p>

				Mesa 18.0.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89  mesa-18.0.1.tar.gz

				b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954  mesa-18.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>dri_util: when overriding, always reset the core version</li>

				  <li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>

				</ul>

				<p>Axel Davy (5):</p>

				<ul>

				  <li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>

				  <li>st/nine: Fixes warning about implicit conversion</li>

				  <li>st/nine: Fix non inversible matrix check</li>

				  <li>st/nine: Declare lighting consts for ff shaders</li>

				  <li>st/nine: Do not use scratch for face register</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>ac/nir: Add workaround for GFX9 buffer views.</li>

				  <li>radv: Don't set instance count using predication.</li>

				  <li>radv: Always reset draw user SGPRs after secondary command buffer.</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>anv/pipeline: fail if TCS/TES compile fail</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>st/dri: Initialise modifier to INVALID for DRI2</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Make swrast display_sync the correct queue</li>

				</ul>

				<p>Dylan Baker (4):</p>

				<ul>

				  <li>meson: don't use compiler.has_header</li>

				  <li>autotools: include meson_get_version</li>

				  <li>meson: Set .so version for xa like autotools does</li>

				  <li>meson: fix megadriver symlinking</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.0</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>meson/configure: detect endian.h instead of trying to guess when it's available</li>

				  <li>docs: fix 18.0 release note version</li>

				  <li>gbm: remove never-implemented function</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>mesa: Inherit texture view multi-sample information from the original texture images.</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>i965/vec4: Fix null destination register in 3-source instructions</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>nir/vars_to_ssa: Remove copies from the correct set</li>

				  <li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>

				  <li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>

				  <li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>

				</ul>

				<p>Juan A. Suarez Romero (5):</p>

				<ul>

				  <li>cherry-ignore anv: Be more careful about fast-clear colors</li>

				  <li>cherry-ignore: ac/shader: fix vertex input with components.</li>

				  <li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>

				  <li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>

				  <li>Update version to 18.0.1</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move feedback command inside of destroy function</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965/perf: fix config registration when uploading to kernel</li>

				</ul>

				<p>Marc Dietrich (1):</p>

				<ul>

				  <li>meson: fix HAVE_LLVM version define in meson build</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>

				</ul>

				<p>Mark Thompson (1):</p>

				<ul>

				  <li>st/va: Enable vaExportSurfaceHandle()</li>

				</ul>

				<p>Rob Clark (3):</p>

				<ul>

				  <li>nir: fix per_vertex_output intrinsic</li>

				  <li>freedreno/a5xx: fix page faults on last level</li>

				  <li>freedreno/a5xx: don't align height for PIPE_BUFFER</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix picking the method for resolve subpass</li>

				  <li>radv: fix radv_layout_dcc_compressed() when image doesn't have DCC</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965: Extend the negative 32-bit deltas to 64-bits</li>

				</ul>

				<p>Timothy Arceri (7):</p>

				<ul>

				  <li>ac: add if/loop build helpers</li>

				  <li>radeonsi: make use of if/loop build helpers in ac</li>

				  <li>ac: make use of if/loop build helpers</li>

				  <li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>

				  <li>nir: fix crash in loop unroll corner case</li>

				  <li>gallium/pipebuffer: fix parenthesis location</li>

				  <li>glsl: always call do_lower_jumps() after loop unrolling</li>

				</ul>

				<p>Xiong, James (1):</p>

				<ul>

				  <li>i965: return the fourcc saved in __DRIimage when possible</li>

				</ul>

				</div>

				</body>

				</html>

									
docs/relnotes/18.0.2.html
												
				@@ -0,0 +1,144 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.2 Release Notes / April 28, 2018</h1>

				<p>

				Mesa 18.0.2 is a bug fix release which fixes bugs found since the 18.0.1 release.

				</p>

				<p>

				Mesa 18.0.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: ffd8dfe3337b474a3baa085f0e7ef1a32c7cdc3bed1ad810b2633919a9324840  mesa-18.0.2.tar.gz

				SHA256: 98fa159768482dc568b9f8bf0f36c7acb823fa47428ffd650b40784f16b9e7b3  mesa-18.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.</li>

				  <li>radv: Mark GTT memory as device local for APUs.</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>bin/install_megadrivers: fix DESTDIR and -D*-path</li>

				  <li>meson: don't build classic mesa tests without dri_drivers</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>intel/compiler: Add scheduler deps for instructions that implicitly read g0</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*</li>

				</ul>

				<p>Johan Klokkhammer Helsing (1):</p>

				<ul>

				  <li>st/dri: Fix dangling pointer to a destroyed dri_drawable</li>

				</ul>

				<p>Juan A. Suarez Romero (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.1</li>

				  <li>travis: radv needs LLVM 4.0</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>Update version to 18.0.2</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Fix shadow batches to be the same size as the real BO.</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix number of planes for depth &amp; stencil</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>etnaviv: fix texture_format_needs_swiz</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi/gfx9: fix a hang with an empty first IB</li>

				  <li>glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract</li>

				  <li>Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix scissor computation when using half-pixel viewport offset</li>

				  <li>radv/winsys: allow to submit up to 4 IBs for chips without chaining</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: free debug messages when destroying the debug state</li>

				</ul>

				</div>

				</body>

				</html>

									
docs/relnotes/18.0.3.html
												
				@@ -0,0 +1,107 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.3 Release Notes / May 7, 2018</h1>

				<p>

				Mesa 18.0.3 is a bug fix release which fixes bugs found since the 18.0.2 release.

				</p>

				<p>

				Mesa 18.0.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				58cc5c5b1ab2a44e6e47f18ef6c29836ad06f95450adce635ce3c317507a171b  mesa-18.0.3.tar.gz

				099d9667327a76a61741a533f95067d76ea71a656e66b91507b3c0caf1d49e30  mesa-18.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv/winsys: fix leaking resources from bo's imported by fd</li>

				</ul>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>radeon/vcn: fix mpeg4 msg buffer settings</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>gallium/util: Fix incorrect refcounting of separate stencil.</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv/allocator: Don't shrink either end of the block pool</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.2</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>Update version to 18.0.3</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>st/omx/enc: fix blit setup for YUV LoadImage</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>util/u_queue: fix a deadlock in util_queue_finish</li>

				  <li>radeonsi/gfx9: workaround for INTERP with indirect indexing</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>i965/tex_image: Avoid the ASTC LDR workaround on gen9lp</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: compute the number of subpass attachments correctly</li>

				</ul>

				</div>

				</body>

				</html>

									
docs/relnotes/18.0.4.html
												
				@@ -0,0 +1,157 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.4 Release Notes / May 17, 2018</h1>

				<p>

				Mesa 18.0.4 is a bug fix release which fixes bugs found since the 18.0.3 release.

				</p>

				<p>

				Mesa 18.0.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				d1dc3469faccdd73439479426952d71a9e8f684e8d03b6687063c12b13430801  mesa-18.0.4.tar.gz

				1f3bcfe7cef0a5c20dae2b41df5d7e0a985e06be0183fa4d43b6068fcba2920f  mesa-18.0.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Translate logic ops.</li>

				  <li>radv: Fix up 2_10_10_10 alpha sign.</li>

				  <li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>r600: fix constant buffer bounds.</li>

				  <li>radv: resolve all layers in compute resolve path.</li>

				  <li>radv: use compute path for multi-layer images.</li>

				</ul>

				<p>Deepak Rawat (1):</p>

				<ul>

				  <li>egl/x11: Send invalidate to driver on copy_region path in swap_buffer</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)</li>

				</ul>

				<p>Jan Vesely (8):</p>

				<ul>

				  <li>clover: Add explicit virtual destructor to argument class</li>

				  <li>eg/compute: Drop reference on code_bo in destructor.</li>

				  <li>r600: Cleanup constant buffers on context destruction</li>

				  <li>eg/compute: Drop reference to kernel_param bo in destructor</li>

				  <li>pipe-loader: Free driver_name in error path</li>

				  <li>gallium/auxiliary: Add helper function to count the number of entries in hash table</li>

				  <li>winsys/radeon: Destroy fd_hash table when the last winsys is removed.</li>

				  <li>winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL</li>

				</ul>

				<p>Jose Maria Casanova Crespo (2):</p>

				<ul>

				  <li>intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate</li>

				  <li>intel/compiler: fix brw_imm_w for negative 16-bit integers</li>

				</ul>

				<p>Juan A. Suarez Romero (7):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.3</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug</li>

				  <li>cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs</li>

				  <li>cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT</li>

				  <li>cherry-ignore: radv/resolve: do fmask decompress on all layers.</li>

				  <li>Update version to 18.0.4</li>

				</ul>

				<p>Kai Wasserbäch (1):</p>

				<ul>

				  <li>opencl: autotools: Fix linking order for OpenCL target</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Don't leak blorp on Gen4-5.</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>i965: require pixel scoreboard stall prior to ISP disable</li>

				  <li>anv: emit pixel scoreboard stall before ISP disable</li>

				</ul>

				<p>Matthew Nicholls (1):</p>

				<ul>

				  <li>radv: fix multisample image copies</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>spirv: Apply OriginUpperLeft to FragCoord</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>mesa: fix error handling in get_framebuffer_parameteriv</li>

				</ul>

				<p>Ross Burton (1):</p>

				<ul>

				  <li>src/intel/Makefile.vulkan.am: add missing MKDIR_GEN</li>

				</ul>

				</div>

				</body>

				</html>

									
docs/relnotes/18.0.5.html
												
				@@ -0,0 +1,162 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.5 Release Notes / June 3, 2018</h1>

				<p>

				Mesa 18.0.5 is a bug fix release which fixes bugs found since the 18.0.4 release.

				</p>

				<p>

				Mesa 18.0.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ea3e00329cea899b1e32db812fd2f426832be37e4baa2e2fd9288a3480f30531  mesa-18.0.5.tar.gz

				5187bba8d72aea78f2062d134ec6079a508e8216062dce9ec9048b5eb2c4fc6b  mesa-18.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106465">Bug 106465</a> - No test for Image Load/Store on format-incompatible texture buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106481">Bug 106481</a> - No test for Image Load/Store on texture buffer sized greater than MAX_TEXTURE_BUFFER_SIZE_ARB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965/glk: Add l3 banks count for 2x6 configuration</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>amd/addrlib: Use defines in autotools build.</li>

				  <li>radv: Fix SRGB compute copies.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>tgsi/scan: add hw atomic to the list of memory accessing files</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>

				  <li>i965: Move buffer texture size calculation into a common helper function.</li>

				  <li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>

				  <li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>eg/compute: Use reference counting to handle compute memory pool.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>

				  <li>intel/blorp: Support blits and clears on surfaces with offsets</li>

				</ul>

				<p>Jose Dapena Paz (1):</p>

				<ul>

				  <li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>

				</ul>

				<p>Juan A. Suarez Romero (8):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.4</li>

				  <li>cherry-ignore: i965/miptree: Fix handling of uninitialized MCS buffers</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>cherry-ignore: mesa/st: handle vert_attrib_mask in nir case too</li>

				  <li>cherry-ignore: Tegra is not supported</li>

				  <li>cherry-ignore: st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)</li>

				  <li>cherry-ignore: nv30: ensure that displayable formats are marked accordingly</li>

				  <li>Update version to 18.0.5</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>

				  <li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>

				  <li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>dri3: Stricter SBC wraparound handling</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>i965/miptree: Zero-initialize CCS_D buffers</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>spirv: fix visiting inner loops with same break/continue block</li>

				  <li>radv: fix centroid interpolation</li>

				</ul>

				<p>Stuart Young (1):</p>

				<ul>

				  <li>etnaviv: Fix missing rnndb file in tarballs</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: add glUniform*ui{v} support to display lists</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/18.1.0.html
									
												
				@@ -0,0 +1,268 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.0 Release Notes / May 18 2018</h1>

				<p>

				Mesa 18.1.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 18.1.1.

				</p>

				<p>

				Mesa 18.1.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b1c1dbb42597190503d3abc518b12de880623f097c6cb6c293ecf69ae87e6fbf  mesa-18.1.0.tar.gz

				c855c5b67ef993b7621f76d8b120769ec0415f1c3616eaff44ef7f7f300aceba  mesa-18.1.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 3.1 with ARB_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe, svga</li>

				<li>GL_ARB_bindless_texture on nvc0/maxwell+</li>

				<li>GL_ARB_transform_feedback_overflow_query on nvc0</li>

				<li>GL_EXT_semaphore on radeonsi</li>

				<li>GL_EXT_semaphore_fd on radeonsi</li>

				<li>GL_EXT_shader_framebuffer_fetch on i965 on desktop GL (GLES was already supported)</li>

				<li>GL_EXT_shader_framebuffer_fetch_non_coherent on i965</li>

				<li>GL_KHR_blend_equation_advanced on radeonsi</li>

				<li>Disk shader cache support for i965 enabled by default</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99549">Bug 99549</a> - pp: Failed to translate a shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102905">Bug 102905</a> - [R600] Miscompilation of TGSI to VLIW causes artifacts in Gallium Nine with Crysis2 bump mapping</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104335">Bug 104335</a> - [OpenGL CTS][SKL,KBL] KHR-GL45.vertex_attrib_64bit.limits_test occasionally fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104717">Bug 104717</a> - Rocket League: grass rendering broken with nir</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104732">Bug 104732</a> - [radv] Binding descriptor sets disturbs other pipeline bindings</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104794">Bug 104794</a> - piglit.spec.arb_internalformat_query2.samples and num_sample_counts pname checks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104803">Bug 104803</a> - SIGSEGV in state_tracker/st_glsl_to_tgsi_temprename.cpp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104863">Bug 104863</a> - 186 assertions in piglit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104908">Bug 104908</a> - Texture Compression Hint not converted to enum16</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104989">Bug 104989</a> - [r600] [bisected] OpenGL applications can't render anything at all</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105026">Bug 105026</a> - glxgears asserts with pp_jimenezmlaa=1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105052">Bug 105052</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105067">Bug 105067</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105088">Bug 105088</a> - brw_nir_uniforms.cpp:256:10: error: non-constant-expression cannot be narrowed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105161">Bug 105161</a> - KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105183">Bug 105183</a> - Weird assertion in NIR linker</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105211">Bug 105211</a> - build failure after zwp_dmabuf commit if wayland-protocols is not installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105229">Bug 105229</a> - [KBL SKL BDW HSW] [Regression] KHR-GLES31.core.shader_image_load_store.advanced-sso-simple failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105238">Bug 105238</a> - ast.h:648:16: error: union member 'i' has a non-trivial constructor</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105262">Bug 105262</a> - [R600] [BISECTED] ttf fonts are invisible in many programs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105274">Bug 105274</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105444">Bug 105444</a> - Enable GL disk shader cache when transform feedback is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105471">Bug 105471</a> - [g33] [bisected] dEQP-GLES2.functional.shaders failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105529">Bug 105529</a> - u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105621">Bug 105621</a> - Build failure on GNOME Continuous</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105634">Bug 105634</a> - Android build test fails when building brw_oa_metrics.c</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105737">Bug 105737</a> - st_tests_common.cpp:140:42: error: no matching function for call to 'tgsi_get_opcode_info'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105738">Bug 105738</a> - commit f7ffa504a065dc2631fd38cc5fe885b277f4e7e7 causes artifacting in radv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105740">Bug 105740</a> - glsl_types.cpp(524): error: a dynamically-initialized local static variable is not allowed inside of a statement expression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105807">Bug 105807</a> - [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105817">Bug 105817</a> - scons build broken by glSpecializeShaderARB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105820">Bug 105820</a> - [m32] piglit regressions relinking program without shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105952">Bug 105952</a> - radv causes GPU hang on SI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105960">Bug 105960</a> - [bisected] meson build test fails with: undefined reference to `etna_pm_create_query'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106131">Bug 106131</a> - meson/ninja build missing file gtest.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Remove incomplete GLX_SGIX_swap_barrier stubs from the Xlib libGL</li>

				<li>Remove incomplete GLX_SGIX_swap_group stubs from the Xlib libGL</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/18.1.1.html
									
												
				@@ -0,0 +1,168 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.1 Release Notes / June 1 2018</h1>

				<p>

				Mesa 18.1.1 is a bug fix release which fixes bugs found since the 18.1.0 release.

				</p>

				<p>

				Mesa 18.1.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96  mesa-18.1.1.tar.gz

				d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781  mesa-18.1.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>None</p>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965/glk: Add l3 banks count for 2x6 configuration</li>

				</ul>

				<p>Bas Nieuwenhuizen (7):</p>

				<ul>

				  <li>radv: Fix multiview queries.</li>

				  <li>radv: Translate logic ops.</li>

				  <li>radv: Fix up 2_10_10_10 alpha sign.</li>

				  <li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>

				  <li>amd/addrlib: Use defines in autotools build.</li>

				  <li>radv: Fix SRGB compute copies.</li>

				  <li>radv: Only expose subgroup shuffles on VI+.</li>

				</ul>

				<p>Christoph Haag (1):</p>

				<ul>

				  <li>radv: fix VK_EXT_descriptor_indexing</li>

				</ul>

				<p>Dave Airlie (5):</p>

				<ul>

				  <li>radv/resolve: do fmask decompress on all layers.</li>

				  <li>radv: resolve all layers in compute resolve path.</li>

				  <li>radv: use compute path for multi-layer images.</li>

				  <li>virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.</li>

				  <li>tgsi/scan: add hw atomic to the list of memory accessing files</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add sha sums for release</li>

				  <li>VERSION: bump to 18.1.1 for next release</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>vulkan: don't free uninitialised memory</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>

				  <li>i965: Move buffer texture size calculation into a common helper function.</li>

				  <li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>

				  <li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nv30: ensure that displayable formats are marked accordingly</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>eg/compute: Use reference counting to handle compute memory pool.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>

				  <li>intel/blorp: Support blits and clears on surfaces with offsets</li>

				</ul>

				<p>Jose Dapena Paz (1):</p>

				<ul>

				  <li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>

				</ul>

				<p>Kai Wasserbäch (1):</p>

				<ul>

				  <li>opencl: autotools: Fix linking order for OpenCL target</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>

				  <li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>

				  <li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>dri3: Stricter SBC wraparound handling</li>

				</ul>

				<p>Nanley Chery (4):</p>

				<ul>

				  <li>i965: Add and use a getter for the miptree aux buffer</li>

				  <li>i965: Add and use a single miptree aux_buf field</li>

				  <li>i965/miptree: Fix handling of uninitialized MCS buffers</li>

				  <li>i965/miptree: Zero-initialize CCS_D buffers</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>spirv: fix visiting inner loops with same break/continue block</li>

				  <li>radv: fix centroid interpolation</li>

				</ul>

				<p>Stuart Young (1):</p>

				<ul>

				  <li>etnaviv: Fix missing rnndb file in tarballs</li>

				</ul>

				<p>Thierry Reding (3):</p>

				<ul>

				  <li>tegra: Treat resources with modifiers as scanout</li>

				  <li>tegra: Fix scanout resources without modifiers</li>

				  <li>tegra: Remove usage of non-stable UAPI</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: add glUniform*ui{v} support to display lists</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/18.1.2.html
									
												
				@@ -0,0 +1,170 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.2 Release Notes / June 15 2018</h1>

				<p>

				Mesa 18.1.2 is a bug fix release which fixes bugs found since the 18.1.1 release.

				</p>

				<p>

				Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11  mesa-18.1.2.tar.gz

				070bf0648ba5b242d7303ceed32aed80842f4c0ba16e5acc1a650a46eadfb1f9  mesa-18.1.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>None</p>

				<h2>Changes</h2>

				<p>Alex Smith (4):</p>

				<ul>

				  <li>radv: Consolidate GFX9 merged shader lookup logic</li>

				  <li>radv: Handle GFX9 merged shaders in radv_flush_constants()</li>

				  <li>radeonsi: Fix crash on shaders using MSAA image load/store</li>

				  <li>radv: Set active_stages the same whether or not shaders were cached</li>

				</ul>

				<p>Andrew Galante (2):</p>

				<ul>

				  <li>meson: Test for __atomic_add_fetch in atomic checks</li>

				  <li>configure.ac: Test for __atomic_add_fetch in atomic checks</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.</li>

				</ul>

				<p>Cameron Kumar (1):</p>

				<ul>

				  <li>vulkan/wsi: Destroy swapchain images after terminating FIFO queues</li>

				</ul>

				<p>Dylan Baker (6):</p>

				<ul>

				  <li>docs/relnotes: Add sha256 sums for mesa 18.1.1</li>

				  <li>cherry-ignore: add commits not to pull</li>

				  <li>cherry-ignore: Add patches from Jason that he rebased on 18.1</li>

				  <li>meson: work around gentoo applying -m32 to host compiler in cross builds</li>

				  <li>cherry-ignore: Add another patch</li>

				  <li>version: bump version for 18.1.2 release</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>autotools: add missing android file to package</li>

				  <li>configure: radv depends on mako</li>

				  <li>i965: fix resource leak</li>

				</ul>

				<p>Jason Ekstrand (10):</p>

				<ul>

				  <li>intel/eu: Add some brw_get_default_ helpers</li>

				  <li>intel/eu: Copy fields manually in brw_next_insn</li>

				  <li>intel/eu: Set flag [sub]register number differently for 3src</li>

				  <li>intel/blorp: Don't vertex fetch directly from clear values</li>

				  <li>intel/isl: Add bounds-checking assertions in isl_format_get_layout</li>

				  <li>intel/isl: Add bounds-checking assertions for the format_info table</li>

				  <li>i965/screen: Refactor query_dma_buf_formats</li>

				  <li>i965/screen: Use RGBA non-sRGB formats for images</li>

				  <li>anv: Set fence/semaphore types to NONE in impl_cleanup</li>

				  <li>i965/screen: Return false for unsupported formats in query_modifiers</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>mesa/program_binary: add implicit UseProgram after successful ProgramBinary</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>glsl: Add ir_binop_vector_extract in NIR</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Fix batch-last mode to properly swap BOs.</li>

				  <li>anv: Disable __gen_validate_value if NDEBUG is set.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>r300g/swtcl: make pipe_context uploaders use malloc'd memory as before</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>meson: Fix -latomic check</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>glx: Fix number of property values to read in glXImportContextEXT</li>

				</ul>

				<p>Nicolas Boichat (1):</p>

				<ul>

				  <li>configure.ac/meson.build: Fix -latomic test</li>

				</ul>

				<p>Philip Rebohle (1):</p>

				<ul>

				  <li>radv: Use correct color format for fast clears</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>radv: fix a GPU hang when MRTs are sparse</li>

				  <li>radv: fix missing ZRANGE_PRECISION(1) for GFX9+</li>

				  <li>radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold</li>

				</ul>

				<p>Scott D Phillips (1):</p>

				<ul>

				  <li>intel/tools: add intel_sanitize_gpu to EXTRA_DIST</li>

				</ul>

				<p>Thomas Petazzoni (1):</p>

				<ul>

				  <li>configure.ac: rework -latomic check</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>ac: fix possible truncation of intrinsic name</li>

				  <li>radeonsi: fix possible truncation on renderer string</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/18.1.3.html
									
												
				@@ -0,0 +1,167 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.3 Release Notes / June 29 2018</h1>

				<p>

				Mesa 18.1.3 is a bug fix release which fixes bugs found since the 18.1.2 release.

				</p>

				<p>

				Mesa 18.1.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2a1e36280d01ad18ba6d5b3fbd653ceaa109eaa031b78eb5dfaa4df452742b66  mesa-18.1.3.tar.gz

				54f08deeda0cd2f818e8d40140040ed013de7852573002453b7f50da9ea738ce  mesa-18.1.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark fails.</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965/gen6/gs: Handle case where a GS doesn't allocate VUE</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Fix output for sparse MRTs.</li>

				  <li>ac/surface: Set compressZ for stencil-only surfaces.</li>

				</ul>

				<p>Christian Gmeiner (1):</p>

				<ul>

				  <li>util/bitset: include util/macro.h</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>glsl: allow standalone semicolons outside main()</li>

				</ul>

				<p>Dylan Baker (8):</p>

				<ul>

				  <li>docs: Add release notes for 18.1.2</li>

				  <li>cherry-ignore: Add 587e712eda95c31d88ea9d20e59ad0ae59afef4f</li>

				  <li>meson: Fix auto option for va</li>

				  <li>meson: Fix auto option for xvmc</li>

				  <li>meson: Correct behavior of vdpau=auto</li>

				  <li>cherry-ignore: Ignore cac7ab1192eefdd8d8b3f25053fb006b5c330eb8</li>

				  <li>cherry-ignore: add a2f5292c82ad07731d633b36a663e46adc181db9</li>

				  <li>VERSION: bump version to 18.1.3</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>configure: use compliant grep regex checks</li>

				  <li>glsl/tests/glcpp: reinstate "error out if no tests found"</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>radv: fix reported number of available VGPRs</li>

				  <li>radv: fix bitwise check</li>

				  <li>meson: fix i965/anv/isl genX static lib names</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>glsl: Don't copy propagate from SSBO or shared variables either</li>

				  <li>glsl: Don't copy propagate elements from SSBO or shared variables either</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>nir: Handle call instructions in foreach_src</li>

				  <li>nir/validate: Use the type from the tail of call parameter derefs</li>

				</ul>

				<p>Lukas Rusak (2):</p>

				<ul>

				  <li>meson: only build vl_winsys_dri.c when x11 platform is used</li>

				  <li>meson: fix private libs when building without glx</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers</li>

				  <li>ac/gpu_info: report real total memory sizes</li>

				  <li>ac/gpu_info: add kernel_flushes_hdp_before_ib</li>

				  <li>radeonsi: always put persistent buffers into GTT on radeon</li>

				  <li>mesa: fix glGetInteger64v for arrays of integers</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno/ir3: fix base_vertex</li>

				</ul>

				<p>Samuel Pitoiset (6):</p>

				<ul>

				  <li>radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8</li>

				  <li>radv: update the ZRANGE_PRECISION value for the TC-compat bug</li>

				  <li>radv: fix emitting the TCS regs on GFX9</li>

				  <li>radv: fix HTILE metadata initialization in presence of subpass clears</li>

				  <li>radv: ignore pInheritanceInfo for primary command buffers</li>

				  <li>radv: use separate bind points for the dynamic buffers</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: serialize data from glTransformFeedbackVaryings</li>

				</ul>

				<p>Tomeu Vizoso (1):</p>

				<ul>

				  <li>virgl: Remove debugging left-overs</li>

				</ul>

				</div>

				</body>

				</html>

									
										150

docs/relnotes/18.1.4.html
									
										Normal file
									
												
				@@ -0,0 +1,150 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.4 Release Notes / July 13 2018</h1>

				<p>

				Mesa 18.1.4 is a bug fix release which fixes bugs found since the 18.1.3 release.

				</p>

				<p>

				Mesa 18.1.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: 8acd42e4ac4d1e96ed22344073b3d4fef03d10f225f4eaf3f88c001dfc10e2db  mesa-18.1.4.tar.gz

				SHA256: 3061488b5d85504092cf4343816cfb2d96f2ad9bc2edec31fc96933d184cf58b  mesa-18.1.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recognize keyword &quot;sampler2DRect&quot; and &quot;sampler2DRectShadow&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (1):</p>

				<ul>

				  <li>glx: Don't allow glXMakeContextCurrent() with only one valid drawable</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600/sb: cleanup if_conversion iterator to be legal C++</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add SHA256 sums to notes for 18.1.3</li>

				  <li>Bump version for release</li>

				</ul>

				<p>Iago Toral Quiroga (3):</p>

				<ul>

				  <li>anv/cmd_buffer: make descriptors dirty when emitting base state address</li>

				  <li>anv/cmd_buffer: clean dirty push constants flag after emitting push constants</li>

				  <li>anv/cmd_buffer: never shrink the push constant buffer size</li>

				</ul>

				<p>Ian Romanick (4):</p>

				<ul>

				  <li>i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible</li>

				  <li>intel/compiler: Relax mixed type restriction for saturating immediates</li>

				  <li>i965/vec4: Properly handle sign(-abs(x))</li>

				  <li>i965/fs: Properly handle sign(-abs(x))</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>intel/fs: Split instructions low to high in lower_simd_width</li>

				  <li>anv: Be more careful about hashing pipeline layouts</li>

				  <li>intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN</li>

				</ul>

				<p>Jose Maria Casanova Crespo (3):</p>

				<ul>

				  <li>i965/fs: Register allocator shouldn't use grf127 for sends dest</li>

				  <li>intel/compiler: grf127 can not be dest when src and dest overlap in send</li>

				  <li>i965/fs: unspills shouldn't use grf127 as dest since Gen8+</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965: fix clear color bo address relocation</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2</li>

				  <li>glsl/cache: save and restore ExternalSamplersUsed</li>

				  <li>st/dri: fix a crash in server_wait_sync</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>i965: Fix output register sizes when variable ranges are interleaved</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>nvc0/ir: fix TargetNVC0::insnCanLoadOffset()</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>r600/sb: fix crash in fold_alu_op3</li>

				</ul>

				<p>Ross Burton (1):</p>

				<ul>

				  <li>egl: fix build race in automake</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix emitting the view index on GFX9</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: skip comparison opt when adding vars of different size</li>

				  <li>nir: fix selection of loop terminator when two or more have the same limit</li>

				</ul>

				<p>zhaowei yuan (1):</p>

				<ul>

				  <li>glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2</li>

				</ul>

				</div>

				</body>

				</html>

									
										183

docs/relnotes/18.1.5.html
									
										Normal file
									
												
				@@ -0,0 +1,183 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.5 Release Notes / July 27 2018</h1>

				<p>

				Mesa 18.1.5 is a bug fix release which fixes bugs found since the 18.1.4 release.

				</p>

				<p>

				Mesa 18.1.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: f966d5d5d373a5b8a16ed5036c1e7f05d4ad46d130f793bf9782c3ac9133a02e  mesa-18.1.5.tar.gz

				SHA256: 69dbe6f1a6660386f5beb85d4fcf003ee23023ed7b9a603de84e9a37e8d98dea  mesa-18.1.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT</li>

				</ul>

				<p>Bas Nieuwenhuizen (7):</p>

				<ul>

				  <li>radv: Select correct entries for binning.</li>

				  <li>radv: Fix number of samples used for binning.</li>

				  <li>radv: Disable disabled color buffers in rbplus opts.</li>

				  <li>nir: Do not use continue block after removing it.</li>

				  <li>util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.</li>

				  <li>nir: Fix end of function without return warning/error.</li>

				  <li>radv: Still enable inmemory &amp; API level caching if disk cache is not enabled.</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>anv/android: Fix type error in call to vk_errorf()</li>

				  <li>anv/android: Fix Autotools build for VK_ANDROID_native_buffer</li>

				</ul>

				<p>Chih-Wei Huang (1):</p>

				<ul>

				  <li>Android: fix a missing nir_intrinsics.h error</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>i965: Sweep NIR after linking phase to free held memory</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600: enable tess_input_info for TES</li>

				</ul>

				<p>Dylan Baker (5):</p>

				<ul>

				  <li>docs: Add sha256 sums for 18.1.4 tarballs</li>

				  <li>cherry-ignore: add 4a67ce886a7b3def5f66c1aedf9e5436d157a03c</li>

				  <li>cherry-ignore: Add 1f616a840eac02241c585d28e9dac8f19a297f39</li>

				  <li>cherry-ignore: add 11712b9ca17e4e1a819dcb7d020e19c6da77bc90</li>

				  <li>bump version to 18.1.5</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.</li>

				  <li>meson: Move xvmc test tools from unit tests to installed tools.</li>

				</ul>

				<p>Harish Krupo (1):</p>

				<ul>

				  <li>egl: Fix missing clamping in eglSetDamageRegionKHR</li>

				</ul>

				<p>Jan Vesely (3):</p>

				<ul>

				  <li>radeonsi: Refuse to accept code with unhandled relocations</li>

				  <li>clover: Report error when pipe driver fails to create compute state</li>

				  <li>clover: Catch errors from executing event action</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV</li>

				  <li>nir/serialize: Alloc constants off the variable</li>

				  <li>blorp: Handle the RGB workaround more like other workarounds</li>

				  <li>intel/blorp: Handle 3-component formats in clears</li>

				  <li>intel/compiler: Account for built-in uniforms in analyze_ubo_ranges</li>

				  <li>spirv: Fix a couple of image atomic load/store bugs</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>gallium/tests: Don't ignore S3TC errors.</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nir: fix printing of vec16 type</li>

				</ul>

				<p>Lepton Wu (1):</p>

				<ul>

				  <li>virgl: Fix flush in virgl_encoder_inline_write.</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>st/mesa: call resource_changed when binding a EGLImage to a texture</li>

				</ul>

				<p>Mauro Rossi (2):</p>

				<ul>

				  <li>radv: winsys/amdgpu: include missing pthread.h header</li>

				  <li>android: util/disk_cache: fix building errors in gallium drivers</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>gallium: Check pipe_screen::resource_changed before dereferencing it</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>draw: force draw pipeline if there's more than 65535 vertices</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>anv: fix assert in anv_CmdBindDescriptorSets()</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>radv: make sure to wait for CP DMA when needed</li>

				  <li>radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9</li>

				  <li>radv: fix a memleak for merged shaders on GFX9</li>

				</ul>

				</div>

				</body>

				</html>

									
										75

docs/relnotes/18.2.0.html
									
										Normal file
									
												
				@@ -0,0 +1,75 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.0 Release Notes / TBD</h1>

				<p>

				Mesa 18.2.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 18.2.1.

				</p>

				<p>

				Mesa 18.2.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<p>

				libwayland-egl is now distributed by Wayland (since 1.15,

				<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),

				and has been removed from Mesa in this release. Make sure you're using

				an up-to-date version of Wayland to keep the functionality.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 4.3 on virgl</li>

				<li>OpenGL 4.4 Compatibility profile on radeonsi</li>

				<li>OpenGL ES 3.2 on radeonsi and virgl</li>

				<li>GL_ARB_ES3_2_compatibility on radeonsi</li>

				<li>GL_ARB_fragment_shader_interlock on i965</li>

				<li>GL_ARB_sample_locations and GL_NV_sample_locations on nvc0 (GM200+)</li>

				<li>GL_ANDROID_extension_pack_es31a on radeonsi</li>

				<li>GL_KHR_texture_compression_astc_ldr on radeonsi</li>

				</ul>

				<h2>Bug fixes</h2>

				<h2>Changes</h2>

				<ul>

				<li>Removed GL_EXT_polygon_offset; applications should use glPolygonOffset instead.</li>

				<li>Removed libwayland-egl, now part of Wayland</li>

				</ul>

				</div>

				</body>

				</html>

81

docs/specs/MESA_framebuffer_flip_y.txt Normal file


@@ -0,0 +1,81 @@
 Name
     MESA_framebuffer_flip_y
 Name Strings
     GL_MESA_framebuffer_flip_y
 Contact
     Fritz Koenig <frkoenig@google.com>
 Contributors
     Fritz Koenig, Google
     Kristian Høgsberg, Google
     Chad Versace, Google
 Status
     Proposal
 Version
     Version 1, June 7, 2018
 Number
 
 Dependencies
     OpenGL ES 3.1 is required, for FramebufferParameteri.
 Overview
     This extension defines a new framebuffer parameter,
     GL_FRAMEBUFFER_FLIP_Y_MESA, that changes the behavior of the reads and
     writes to the framebuffer attachment points. When GL_FRAMEBUFFER_FLIP_Y_MESA
     is GL_TRUE, render commands and pixel transfer operations access the
     backing store of each attachment point with a y-inverted coordinate
     system. This y-inversion is relative to the coordinate system set when
     GL_FRAMEBUFFER_FLIP_Y_MESA is GL_FALSE.
     Access through TexSubImage2D and similar calls will notice the effect of
     the flip when they are not attached to framebuffer objects because
     GL_FRAMEBUFFER_FLIP_Y_MESA is associated with the framebuffer object and
     not the attachment points.
 IP Status
     None
 Issues
     None
 New Procedures and Functions
     None
 New Types
     None
 New Tokens
     Accepted by the <pname> argument of FramebufferParameteri and
     GetFramebufferParameteriv:
         GL_FRAMEBUFFER_FLIP_Y_MESA                      0x8BBB
 Errors
     An INVALID_OPERATION error is generated by GetFramebufferParameteriv if the
     default framebuffer is bound to <target> and <pname> is FRAMEBUFFER_FLIP_Y_MESA.
 Revision History
     Version 1, June, 2018
         Initial draft (Fritz Koenig)
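
The y-inversion described in the Overview can be pictured with a small C sketch. It is only an illustration: `demo_flip_row` and `demo_store_row` are hypothetical helpers that are not part of the extension, and the mapping `height - 1 - y` is an assumption about what "y-inverted" means for integer row indices. No GL entry point is called here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: mirror row y within a surface of `height` rows.
 * Assumes "y-inverted" means row 0 and row height-1 swap places. */
int demo_flip_row(int y, int height)
{
    assert(height > 0 && y >= 0 && y < height);
    return height - 1 - y;
}

/* Hypothetical helper: the backing-store row that would be touched for
 * framebuffer row y, depending on the state of the flip parameter. */
int demo_store_row(int y, int height, bool flip_y)
{
    return flip_y ? demo_flip_row(y, height) : y;
}
```

In an application the parameter itself would be set on a bound framebuffer object through FramebufferParameteri, for example `glFramebufferParameteri(GL_FRAMEBUFFER, GL_FRAMEBUFFER_FLIP_Y_MESA, GL_TRUE)`. With the sketch above, `demo_store_row(0, 4, true)` yields 3, while `demo_store_row(0, 4, false)` yields 0.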

3

docs/specs/enums.txt


@@ -71,6 +71,9 @@ GL_MESA_tile_raster_order
 	GL_TILE_RASTER_ORDER_INCREASING_X_MESA	0x8BB9
 	GL_TILE_RASTER_ORDER_INCREASING_Y_MESA	0x8BBA
 GL_MESA_framebuffer_flip_y
 	GL_FRAMEBUFFER_FLIP_Y_MESA           0x8BBB
 EGL_MESA_drm_image
         EGL_DRM_BUFFER_FORMAT_MESA		0x31D0
         EGL_DRM_BUFFER_USE_MESA			0x31D1

									
										12

docs/submittingpatches.html
									
												
				@@ -36,7 +36,7 @@

				perhaps, in very trivial cases.)

				<li>Code patches should follow Mesa

				<a href="codingstyle.html" target="_parent">coding conventions</a>.

				-<li>Whenever possible, patches should only effect individual Mesa/Gallium

				+<li>Whenever possible, patches should only affect individual Mesa/Gallium

				components.

				<li>Patches should never introduce build breaks and should be bisectable (see

				<code>git bisect</code>.)

				@@ -122,9 +122,9 @@ Please use common sense and do <strong>not</strong> blindly add everyone.

				<pre>

				    $ scripts/get_reviewer.pl --help # to get the help screen

				    $ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c

				-    Rob Herring <robh@kernel.org> (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)

				-    Tomasz Figa <tfiga@chromium.org> (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)

				-    Emil Velikov <emil.l.velikov@gmail.com> (authored:13/41=32%,removed_lines:76/283=27%)

				+    Rob Herring &lt;robh@kernel.org&gt; (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)

				+    Tomasz Figa &lt;tfiga@chromium.org&gt; (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)

				+    Emil Velikov &lt;emil.l.velikov@gmail.com&gt; (authored:13/41=32%,removed_lines:76/283=27%)

				</pre>

				</ul>

				@@ -246,6 +246,10 @@ release.

				Note: resending patch identical to one on mesa-dev@ or one that differs only

				by the extra mesa-stable@ tag is <strong>not</strong> recommended.

				</p>

				<p>

				If you are not the author of the original patch, please Cc: them in your

				nomination request.

				</p>

				<h3 id="thetag">The stable tag</h3>

									
										2

docs/utilities.html
									
												
				@@ -31,7 +31,7 @@

				  <dd>is a very useful tool for tracking down

				  memory-related problems in your code.</dd>

				-  <dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a><dt>

				+  <dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a></dt>

				  <dd>provides static code analysis of Mesa.  If you create an account

				  you can see the results and try to fix outstanding issues.</dd>

				</dl>

									
										21

docs/viewperf.html
									
												
				@@ -18,8 +18,8 @@

				<p>

				This page lists known issues with

				-<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>

				-and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>

				+<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html">SPEC Viewperf 11</a>

				+and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html">SPEC Viewperf 12</a>

				when running on Mesa-based drivers.

				</p>

				@@ -66,13 +66,10 @@ either in Viewperf or the Mesa driver.

				<p>

				These tests use features of the

				-<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt"

				-target="_main">

				-GL_NV_fragment_program2</a> and

				-<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt"

				-target="_main">

				-GL_NV_vertex_program3</a> extensions without checking if the driver supports

				-them.

				+<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt">GL_NV_fragment_program2</a>

				+and

				+<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt">GL_NV_vertex_program3</a>

				+extensions without checking if the driver supports them.

				</p>

				<p>

				When Mesa tries to compile the vertex/fragment programs it generates errors

				@@ -86,8 +83,8 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.

				<p>

				These tests depend on the

				-<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt"

				-target="_main">GL_NV_primitive_restart</a> extension.

				+<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt">GL_NV_primitive_restart</a>

				+extension.

				</p>

				<p>

				@@ -124,7 +121,7 @@ never specified.

				<p>

				A trace captured with

				-<a href="https://github.com/apitrace/apitrace" target="_main">API trace</a>

				+<a href="https://github.com/apitrace/apitrace">API trace</a>

				shows this sequences of calls like this:

				<pre>

									
										1

include/EGL/eglext.h
									
												
				@@ -933,6 +933,7 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi

				#define EGL_DRM_BUFFER_STRIDE_MESA        0x31D4

				#define EGL_DRM_BUFFER_USE_SCANOUT_MESA   0x00000001

				#define EGL_DRM_BUFFER_USE_SHARE_MESA     0x00000002

				#define EGL_DRM_BUFFER_USE_CURSOR_MESA    0x00000004

				typedef EGLImageKHR (EGLAPIENTRYP PFNEGLCREATEDRMIMAGEMESAPROC) (EGLDisplay dpy, const EGLint *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESAPROC) (EGLDisplay dpy, EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);

				#ifdef EGL_EGLEXT_PROTOTYPES

									
										7

include/EGL/eglmesaext.h
									
												
				@@ -34,13 +34,6 @@ extern "C" {

				#include <EGL/eglplatform.h>

				#ifdef EGL_MESA_drm_image

				/* Mesa's extension to EGL_MESA_drm_image... */

				#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA

				#define EGL_DRM_BUFFER_USE_CURSOR_MESA		0x0004

				#endif

				#endif

				#ifndef EGL_WL_bind_wayland_display

				#define EGL_WL_bind_wayland_display 1

									
										16

include/EGL/eglplatform.h
									
												
				@@ -104,6 +104,12 @@ typedef struct ANativeWindow*           EGLNativeWindowType;

				typedef struct egl_native_pixmap_t*     EGLNativePixmapType;

				typedef void*                           EGLNativeDisplayType;

				#elif defined(USE_OZONE)

				typedef intptr_t EGLNativeDisplayType;

				typedef intptr_t EGLNativeWindowType;

				typedef intptr_t EGLNativePixmapType;

				#elif defined(__unix__) || defined(__APPLE__)

				#if defined(MESA_EGL_NO_X11_HEADERS)

				@@ -124,11 +130,13 @@ typedef Window   EGLNativeWindowType;

				#endif /* MESA_EGL_NO_X11_HEADERS */

				-#elif __HAIKU__

				+#elif defined(__HAIKU__)

				#include <kernel/image.h>

				-typedef void				*EGLNativeDisplayType;

				-typedef khronos_uintptr_t	 EGLNativePixmapType;

				-typedef khronos_uintptr_t	 EGLNativeWindowType;

				+typedef void              *EGLNativeDisplayType;

				+typedef khronos_uintptr_t  EGLNativePixmapType;

				+typedef khronos_uintptr_t  EGLNativeWindowType;

				#else

				#error "Platform not recognized"

									
										4

include/GL/gl.h
									
												
				@@ -47,9 +47,9 @@

				#    define GLAPI __declspec(dllimport)

				#  else /* for use with static link lib build of Win32 edition only */

				#    define GLAPI extern

				-#  endif /* _STATIC_MESA support */

				+#  endif

				#  if defined(__MINGW32__) && defined(GL_NO_STDCALL) || defined(UNDER_CE)  /* The generated DLLs by MingW with STDCALL are not compatible with the ones done by Microsoft's compilers */

				-#    define GLAPIENTRY 

				+#    define GLAPIENTRY

				#  else

				#    define GLAPIENTRY __stdcall

				#  endif

									
										50

include/GL/internal/dri_interface.h
									
												
				@@ -82,7 +82,7 @@ typedef struct __DRI2flushExtensionRec	__DRI2flushExtension;

				typedef struct __DRI2throttleExtensionRec	__DRI2throttleExtension;

				typedef struct __DRI2fenceExtensionRec          __DRI2fenceExtension;

				typedef struct __DRI2interopExtensionRec	__DRI2interopExtension;

				typedef struct __DRI2blobExtensionRec           __DRI2blobExtension;

				typedef struct __DRIimageLoaderExtensionRec     __DRIimageLoaderExtension;

				typedef struct __DRIimageDriverExtensionRec     __DRIimageDriverExtension;

				@@ -336,6 +336,30 @@ struct __DRI2throttleExtensionRec {

						    enum __DRI2throttleReason reason);

				};

				/**

				 * Extension for EGL_ANDROID_blob_cache

				 */

				#define __DRI2_BLOB "DRI2_Blob"

				#define __DRI2_BLOB_VERSION 1

				typedef void

				(*__DRIblobCacheSet) (const void *key, signed long keySize,

				                      const void *value, signed long valueSize);

				typedef signed long

				(*__DRIblobCacheGet) (const void *key, signed long keySize,

				                      void *value, signed long valueSize);

				struct __DRI2blobExtensionRec {

				   __DRIextension base;

				   /**

				    * Set cache functions for setting and getting cache entries.

				    */

				   void (*set_cache_funcs) (__DRIscreen *screen,

				                            __DRIblobCacheSet set, __DRIblobCacheGet get);

				};
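
To make the two callback types in this hunk more concrete, here is a minimal sketch of a loader-side cache that matches their signatures. The typedefs are copied from the hunk above; everything prefixed `demo_` is hypothetical, and the single-entry storage is a deliberate simplification. The return convention of the getter (0 on a miss, otherwise the stored size, copying only when the caller's buffer is large enough) is an assumption modelled on EGL_ANDROID_blob_cache, not something this header states.

```c
#include <assert.h>
#include <string.h>

/* Callback types as declared in the hunk above. */
typedef void (*__DRIblobCacheSet)(const void *key, signed long keySize,
                                  const void *value, signed long valueSize);
typedef signed long (*__DRIblobCacheGet)(const void *key, signed long keySize,
                                         void *value, signed long valueSize);

/* Hypothetical one-entry cache, just large enough for a demonstration. */
#define DEMO_MAX 64
static unsigned char demo_key[DEMO_MAX];
static unsigned char demo_value[DEMO_MAX];
static signed long demo_key_size = 0;
static signed long demo_value_size = 0;

void demo_blob_set(const void *key, signed long keySize,
                   const void *value, signed long valueSize)
{
    /* Silently drop entries that do not fit the fixed-size slots. */
    if (keySize <= 0 || keySize > DEMO_MAX ||
        valueSize < 0 || valueSize > DEMO_MAX)
        return;
    memcpy(demo_key, key, (size_t)keySize);
    memcpy(demo_value, value, (size_t)valueSize);
    demo_key_size = keySize;
    demo_value_size = valueSize;
}

signed long demo_blob_get(const void *key, signed long keySize,
                          void *value, signed long valueSize)
{
    /* Miss: nothing stored yet, or the key differs. */
    if (demo_key_size <= 0 || keySize != demo_key_size ||
        memcmp(key, demo_key, (size_t)keySize) != 0)
        return 0;
    /* Assumed convention: copy only if the caller's buffer is big enough,
     * but always report the stored size. */
    if (demo_value_size <= valueSize)
        memcpy(value, demo_value, (size_t)demo_value_size);
    return demo_value_size;
}

/* The functions are assignable to the callback types without a cast. */
__DRIblobCacheSet demo_set = demo_blob_set;
__DRIblobCacheGet demo_get = demo_blob_get;
```

A loader would presumably hand such a pair to the driver through `set_cache_funcs(screen, demo_set, demo_get)`; that call is not shown because it needs a live `__DRIscreen`.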

				/**

				 * Extension for fences / synchronization objects.

				@@ -565,7 +589,7 @@ struct __DRIdamageExtensionRec {

				 * SWRast Loader extension.

				 */

				#define __DRI_SWRAST_LOADER "DRI_SWRastLoader"

				-#define __DRI_SWRAST_LOADER_VERSION 3

				+#define __DRI_SWRAST_LOADER_VERSION 4

				struct __DRIswrastLoaderExtensionRec {

				    __DRIextension base;

				@@ -607,6 +631,24 @@ struct __DRIswrastLoaderExtensionRec {

				   void (*getImage2)(__DRIdrawable *readable,

						     int x, int y, int width, int height, int stride,

						     char *data, void *loaderPrivate);

				    /**

				     * Put shm image to drawable

				     *

				     * \since 4

				     */

				    void (*putImageShm)(__DRIdrawable *drawable, int op,

				                        int x, int y, int width, int height, int stride,

				                        int shmid, char *shmaddr, unsigned offset,

				                        void *loaderPrivate);

				    /**

				     * Get shm image from readable

				     *

				     * \since 4

				     */

				    void (*getImageShm)(__DRIdrawable *readable,

				                        int x, int y, int width, int height,

				                        int shmid, void *loaderPrivate);

				};

				/**

				@@ -1227,6 +1269,9 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FORMAT_R16          0x100d

				#define __DRI_IMAGE_FORMAT_GR1616       0x100e

				#define __DRI_IMAGE_FORMAT_YUYV         0x100f

				#define __DRI_IMAGE_FORMAT_XBGR2101010  0x1010

				#define __DRI_IMAGE_FORMAT_ABGR2101010  0x1011

				#define __DRI_IMAGE_FORMAT_SABGR8       0x1012

				#define __DRI_IMAGE_USE_SHARE		0x0001

				#define __DRI_IMAGE_USE_SCANOUT		0x0002

				@@ -1263,6 +1308,7 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FOURCC_ABGR8888	0x34324241

				#define __DRI_IMAGE_FOURCC_XBGR8888	0x34324258

				#define __DRI_IMAGE_FOURCC_SARGB8888	0x83324258

				#define __DRI_IMAGE_FOURCC_SABGR8888	0x84324258

				#define __DRI_IMAGE_FOURCC_ARGB2101010	0x30335241

				#define __DRI_IMAGE_FOURCC_XRGB2101010	0x30335258

				#define __DRI_IMAGE_FOURCC_ABGR2101010	0x30334241
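
The `__DRI_IMAGE_FOURCC_*` values in this hunk are easier to read once unpacked into characters. The sketch below assumes the usual DRM FourCC packing, with the first character in the lowest byte; `demo_fourcc_to_string` and `demo_fourcc_is` are hypothetical helpers and not part of dri_interface.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unpack a FourCC code into four characters plus a
 * terminating NUL, lowest byte first. */
void demo_fourcc_to_string(uint32_t fourcc, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xffu);
    out[4] = '\0';
}

/* Hypothetical helper: returns 1 if the code spells `expected`, else 0. */
int demo_fourcc_is(uint32_t fourcc, const char *expected)
{
    char text[5];
    demo_fourcc_to_string(fourcc, text);
    return strcmp(text, expected) == 0;
}
```

Under that assumption, `0x34324241` (ABGR8888) unpacks to "AB24" and `0x30334241` (ABGR2101010) to "AB30". The SARGB8888 and SABGR8888 codes carry 0x83 and 0x84 in their top byte, which is outside 7-bit ASCII, so they do not unpack to four printable characters.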

									
										5

include/GLES2/gl2ext.h
									
												
				@@ -2334,6 +2334,11 @@ GL_APICALL void GL_APIENTRY glGetPerfQueryInfoINTEL (GLuint queryId, GLuint quer

				#endif

				#endif /* GL_INTEL_performance_query */

				#ifndef GL_MESA_framebuffer_flip_y

				#define GL_MESA_framebuffer_flip_y 1

				#define GL_FRAMEBUFFER_FLIP_Y_MESA        0x8BBB

				#endif /* GL_MESA_framebuffer_flip_y */

				#ifndef GL_MESA_program_binary_formats

				#define GL_MESA_program_binary_formats 1

				#define GL_PROGRAM_BINARY_FORMAT_MESA     0x875F

include/drm-uapi/README

@@ -13,9 +13,9 @@ $ make headers_install INSTALL_HDR_PATH=/path/to/install
 The last update was done at the following kernel commit :
-commit ca797d29cd63e7b71b4eea29aff3b1cefd1ecb59
-Merge: 2c1c55cb75a9 010d118c2061
+commit 78230c46ec0a91dd4256c9e54934b3c7095a7ee3
+Merge: b65bd4031156 037f03155b7d
 Author: Dave Airlie <airlied@redhat.com>
-Date:   Mon Dec 4 09:40:35 2017 +1000
+Date:   Wed Mar 21 14:07:03 2018 +1000
-    Merge tag 'drm-intel-next-2017-11-17-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
+    Merge tag 'omapdrm-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into drm-next

									
include/drm-uapi/drm_fourcc.h
				@@ -178,7 +178,7 @@ extern "C" {

				#define DRM_FORMAT_MOD_VENDOR_NONE    0

				#define DRM_FORMAT_MOD_VENDOR_INTEL   0x01

				#define DRM_FORMAT_MOD_VENDOR_AMD     0x02

				#define DRM_FORMAT_MOD_VENDOR_NV      0x03

				#define DRM_FORMAT_MOD_VENDOR_NVIDIA  0x03

				#define DRM_FORMAT_MOD_VENDOR_SAMSUNG 0x04

				#define DRM_FORMAT_MOD_VENDOR_QCOM    0x05

				#define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06

				@@ -188,7 +188,7 @@ extern "C" {

				#define DRM_FORMAT_RESERVED	      ((1ULL << 56) - 1)

				#define fourcc_mod_code(vendor, val) \

					((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | (val & 0x00ffffffffffffffULL))

					((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | ((val) & 0x00ffffffffffffffULL))

				/*

				 * Format Modifier tokens:

				@@ -338,29 +338,17 @@ extern "C" {

				 */

				#define DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED fourcc_mod_code(VIVANTE, 4)

				/* NVIDIA Tegra frame buffer modifiers */

				/*

				 * Some modifiers take parameters, for example the number of vertical GOBs in

				 * a block. Reserve the lower 32 bits for parameters

				 */

				#define __fourcc_mod_tegra_mode_shift 32

				#define fourcc_mod_tegra_code(val, params) \

					fourcc_mod_code(NV, ((((__u64)val) << __fourcc_mod_tegra_mode_shift) | params))

				#define fourcc_mod_tegra_mod(m) \

					(m & ~((1ULL << __fourcc_mod_tegra_mode_shift) - 1))

				#define fourcc_mod_tegra_param(m) \

					(m & ((1ULL << __fourcc_mod_tegra_mode_shift) - 1))

				/* NVIDIA frame buffer modifiers */

				/*

				 * Tegra Tiled Layout, used by Tegra 2, 3 and 4.

				 *

				 * Pixels are arranged in simple tiles of 16 x 16 bytes.

				 */

				#define NV_FORMAT_MOD_TEGRA_TILED fourcc_mod_tegra_code(1, 0)

				#define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)

				/*

				 * Tegra 16Bx2 Block Linear layout, used by TK1/TX1

				 * 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later

				 *

				 * Pixels are arranged in 64x8 Groups Of Bytes (GOBs). GOBs are then stacked

				 * vertically by a power of 2 (1 to 32 GOBs) to form a block.

				@@ -380,7 +368,38 @@ extern "C" {

				 * Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format

				 * in full detail.

				 */

				#define NV_FORMAT_MOD_TEGRA_16BX2_BLOCK(v) fourcc_mod_tegra_code(2, v)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK(v) \

					fourcc_mod_code(NVIDIA, 0x10 | ((v) & 0xf))

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_ONE_GOB \

					fourcc_mod_code(NVIDIA, 0x10)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_TWO_GOB \

					fourcc_mod_code(NVIDIA, 0x11)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_FOUR_GOB \

					fourcc_mod_code(NVIDIA, 0x12)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_EIGHT_GOB \

					fourcc_mod_code(NVIDIA, 0x13)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_SIXTEEN_GOB \

					fourcc_mod_code(NVIDIA, 0x14)

				#define DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK_THIRTYTWO_GOB \

					fourcc_mod_code(NVIDIA, 0x15)

				/*

				 * Some Broadcom modifiers take parameters, for example the number of

				 * vertical lines in the image. Reserve the lower 32 bits for modifier

				 * type, and the next 24 bits for parameters. Top 8 bits are the

				 * vendor code.

				 */

				#define __fourcc_mod_broadcom_param_shift 8

				#define __fourcc_mod_broadcom_param_bits 48

				#define fourcc_mod_broadcom_code(val, params) \

					fourcc_mod_code(BROADCOM, ((((__u64)params) << __fourcc_mod_broadcom_param_shift) | val))

				#define fourcc_mod_broadcom_param(m) \

					((int)(((m) >> __fourcc_mod_broadcom_param_shift) &	\

					       ((1ULL << __fourcc_mod_broadcom_param_bits) - 1)))

				#define fourcc_mod_broadcom_mod(m) \

					((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) <<	\

						 __fourcc_mod_broadcom_param_shift))

				/*

				 * Broadcom VC4 "T" format

				@@ -403,6 +422,69 @@ extern "C" {

				 */

				#define DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED fourcc_mod_code(BROADCOM, 1)

				/*

				 * Broadcom SAND format

				 *

				 * This is the native format that the H.264 codec block uses.  For VC4

				 * HVS, it is only valid for H.264 (NV12/21) and RGBA modes.

				 *

				 * The image can be considered to be split into columns, and the

				 * columns are placed consecutively into memory.  The width of those

				 * columns can be either 32, 64, 128, or 256 pixels, but in practice

				 * only 128 pixel columns are used.

				 *

				 * The pitch between the start of each column is set to optimally

				 * switch between SDRAM banks. This is passed as the number of lines

				 * of column width in the modifier (we can't use the stride value due

				 * to various core checks that look at it , so you should set the

				 * stride to width*cpp).

				 *

				 * Note that the column height for this format modifier is the same

				 * for all of the planes, assuming that each column contains both Y

				 * and UV.  Some SAND-using hardware stores UV in a separate tiled

				 * image from Y to reduce the column height, which is not supported

				 * with these modifiers.

				 */

				#define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \

					fourcc_mod_broadcom_code(2, v)

				#define DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(v) \

					fourcc_mod_broadcom_code(3, v)

				#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \

					fourcc_mod_broadcom_code(4, v)

				#define DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(v) \

					fourcc_mod_broadcom_code(5, v)

				#define DRM_FORMAT_MOD_BROADCOM_SAND32 \

					DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(0)

				#define DRM_FORMAT_MOD_BROADCOM_SAND64 \

					DRM_FORMAT_MOD_BROADCOM_SAND64_COL_HEIGHT(0)

				#define DRM_FORMAT_MOD_BROADCOM_SAND128 \

					DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0)

				#define DRM_FORMAT_MOD_BROADCOM_SAND256 \

					DRM_FORMAT_MOD_BROADCOM_SAND256_COL_HEIGHT(0)

				/* Broadcom UIF format

				 *

				 * This is the common format for the current Broadcom multimedia

				 * blocks, including V3D 3.x and newer, newer video codecs, and

				 * displays.

				 *

				 * The image consists of utiles (64b blocks), UIF blocks (2x2 utiles),

				 * and macroblocks (4x4 UIF blocks).  Those 4x4 UIF block groups are

				 * stored in columns, with padding between the columns to ensure that

				 * moving from one column to the next doesn't hit the same SDRAM page

				 * bank.

				 *

				 * To calculate the padding, it is assumed that each hardware block

				 * and the software driving it knows the platform's SDRAM page size,

				 * number of banks, and XOR address, and that it's identical between

				 * all blocks using the format.  This tiling modifier will use XOR as

				 * necessary to reduce the padding.  If a hardware block can't do XOR,

				 * the assumption is that a no-XOR tiling modifier will be created.

				 */

				#define DRM_FORMAT_MOD_BROADCOM_UIF fourcc_mod_code(BROADCOM, 6)

				#if defined(__cplusplus)

				}

				#endif
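
Reviewer note on the Broadcom parameter macros in this file: going by the definitions in the hunk (`__fourcc_mod_broadcom_param_shift` is 8 and `__fourcc_mod_broadcom_param_bits` is 48), the modifier type sits in the low 8 bits and the parameter in the 48 bits above it. The following is a minimal sketch of a round trip through those macros, not code from the commit. It copies the macros locally with `uint64_t` in place of `__u64` so it builds without kernel headers. `DRM_FORMAT_MOD_VENDOR_BROADCOM` is not visible in this diff; the value 0x07 is taken from the upstream header. The three `sand...` helpers are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the macros from the hunks above (uint64_t replaces __u64). */
#define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07 /* not shown in this diff */

#define fourcc_mod_code(vendor, val) \
	((((uint64_t)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | ((val) & 0x00ffffffffffffffULL))

#define __fourcc_mod_broadcom_param_shift 8
#define __fourcc_mod_broadcom_param_bits 48
#define fourcc_mod_broadcom_code(val, params) \
	fourcc_mod_code(BROADCOM, ((((uint64_t)params) << __fourcc_mod_broadcom_param_shift) | val))
#define fourcc_mod_broadcom_param(m) \
	((int)(((m) >> __fourcc_mod_broadcom_param_shift) & \
	       ((1ULL << __fourcc_mod_broadcom_param_bits) - 1)))
#define fourcc_mod_broadcom_mod(m) \
	((m) & ~(((1ULL << __fourcc_mod_broadcom_param_bits) - 1) << \
		 __fourcc_mod_broadcom_param_shift))

#define DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(v) \
	fourcc_mod_broadcom_code(4, v)
#define DRM_FORMAT_MOD_BROADCOM_SAND128 \
	DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(0)

/* Hypothetical helper: build a SAND128 modifier that carries a column height. */
static uint64_t sand128_with_height(unsigned col_height)
{
	return DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT(col_height);
}

/* Hypothetical helper: strip the parameter and compare against the base modifier. */
static int is_sand128(uint64_t modifier)
{
	return fourcc_mod_broadcom_mod(modifier) == DRM_FORMAT_MOD_BROADCOM_SAND128;
}

/* Hypothetical helper: recover the column height stored in the parameter bits. */
static int sand_col_height(uint64_t modifier)
{
	return fourcc_mod_broadcom_param(modifier);
}
```

With these definitions a consumer can test `is_sand128(m)` first and only then read `sand_col_height(m)`, which mirrors the split the `_mod` and `_param` macros are meant to provide.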

									
include/drm-uapi/drm_mode.h
				@@ -38,14 +38,18 @@ extern "C" {

				#define DRM_DISPLAY_MODE_LEN	32

				#define DRM_PROP_NAME_LEN	32

				#define DRM_MODE_TYPE_BUILTIN	(1<<0)

				#define DRM_MODE_TYPE_CLOCK_C	((1<<1) | DRM_MODE_TYPE_BUILTIN)

				#define DRM_MODE_TYPE_CRTC_C	((1<<2) | DRM_MODE_TYPE_BUILTIN)

				#define DRM_MODE_TYPE_BUILTIN	(1<<0) /* deprecated */

				#define DRM_MODE_TYPE_CLOCK_C	((1<<1) | DRM_MODE_TYPE_BUILTIN) /* deprecated */

				#define DRM_MODE_TYPE_CRTC_C	((1<<2) | DRM_MODE_TYPE_BUILTIN) /* deprecated */

				#define DRM_MODE_TYPE_PREFERRED	(1<<3)

				#define DRM_MODE_TYPE_DEFAULT	(1<<4)

				#define DRM_MODE_TYPE_DEFAULT	(1<<4) /* deprecated */

				#define DRM_MODE_TYPE_USERDEF	(1<<5)

				#define DRM_MODE_TYPE_DRIVER	(1<<6)

				#define DRM_MODE_TYPE_ALL	(DRM_MODE_TYPE_PREFERRED |	\

								 DRM_MODE_TYPE_USERDEF |	\

								 DRM_MODE_TYPE_DRIVER)

				/* Video mode flags */

				/* bit compatible with the xrandr RR_ definitions (bits 0-13)

				 *

				@@ -66,8 +70,8 @@ extern "C" {

				#define DRM_MODE_FLAG_PCSYNC			(1<<7)

				#define DRM_MODE_FLAG_NCSYNC			(1<<8)

				#define DRM_MODE_FLAG_HSKEW			(1<<9) /* hskew provided */

				#define DRM_MODE_FLAG_BCAST			(1<<10)

				#define DRM_MODE_FLAG_PIXMUX			(1<<11)

				#define DRM_MODE_FLAG_BCAST			(1<<10) /* deprecated */

				#define DRM_MODE_FLAG_PIXMUX			(1<<11) /* deprecated */

				#define DRM_MODE_FLAG_DBLCLK			(1<<12)

				#define DRM_MODE_FLAG_CLKDIV2			(1<<13)

				 /*

				@@ -99,6 +103,20 @@ extern "C" {

				#define  DRM_MODE_FLAG_PIC_AR_16_9 \

							(DRM_MODE_PICTURE_ASPECT_16_9<<19)

				#define  DRM_MODE_FLAG_ALL	(DRM_MODE_FLAG_PHSYNC |		\

								 DRM_MODE_FLAG_NHSYNC |		\

								 DRM_MODE_FLAG_PVSYNC |		\

								 DRM_MODE_FLAG_NVSYNC |		\

								 DRM_MODE_FLAG_INTERLACE |	\

								 DRM_MODE_FLAG_DBLSCAN |	\

								 DRM_MODE_FLAG_CSYNC |		\

								 DRM_MODE_FLAG_PCSYNC |		\

								 DRM_MODE_FLAG_NCSYNC |		\

								 DRM_MODE_FLAG_HSKEW |		\

								 DRM_MODE_FLAG_DBLCLK |		\

								 DRM_MODE_FLAG_CLKDIV2 |	\

								 DRM_MODE_FLAG_3D_MASK)

				/* DPMS flags */

				/* bit compatible with the xorg definitions. */

				#define DRM_MODE_DPMS_ON	0

				@@ -173,6 +191,10 @@ extern "C" {

						DRM_MODE_REFLECT_X | \

						DRM_MODE_REFLECT_Y)

				/* Content Protection Flags */

				#define DRM_MODE_CONTENT_PROTECTION_UNDESIRED	0

				#define DRM_MODE_CONTENT_PROTECTION_DESIRED     1

				#define DRM_MODE_CONTENT_PROTECTION_ENABLED     2

				struct drm_mode_modeinfo {

					__u32 clock;

				@@ -341,7 +363,7 @@ struct drm_mode_get_connector {

					__u32 pad;

				};

				#define DRM_MODE_PROP_PENDING	(1<<0)

				#define DRM_MODE_PROP_PENDING	(1<<0) /* deprecated, do not use */

				#define DRM_MODE_PROP_RANGE	(1<<1)

				#define DRM_MODE_PROP_IMMUTABLE	(1<<2)

				#define DRM_MODE_PROP_ENUM	(1<<3) /* enumerated type with text strings */

				@@ -576,8 +598,11 @@ struct drm_mode_crtc_lut {

				};

				struct drm_color_ctm {

					/* Conversion matrix in S31.32 format. */

					__s64 matrix[9];

					/*

					 * Conversion matrix in S31.32 sign-magnitude

					 * (not two's complement!) format.

					 */

					__u64 matrix[9];

				};

				struct drm_color_lut {
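
Reviewer note on the `drm_color_ctm` change above: the new comment says the matrix entries are S31.32 sign-magnitude rather than two's complement, which is why the element type moves from `__s64` to `__u64`. A minimal sketch of what that implies for userspace filling the matrix, assuming a plain `double` input. The helper name `ctm_s31_32_from_double` is hypothetical and not part of the header.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode a double as S31.32 sign-magnitude.
 * Bit 63 is the sign, bits 62..32 the integer part, bits 31..0 the fraction.
 */
static uint64_t ctm_s31_32_from_double(double v)
{
	uint64_t sign = 0;

	/* The magnitude must fit in 31 integer bits; this also rejects NaN. */
	assert(v > -2147483648.0 && v < 2147483648.0);

	if (v < 0.0) {
		sign = 1ULL << 63;
		v = -v;
	}

	/* Scale by 2^32 and truncate towards zero. */
	return sign | (uint64_t)(v * 4294967296.0);
}
```

For comparison, -1.0 encodes as 0x8000000100000000 here, whereas a two's complement S31.32 value would be 0xFFFFFFFF00000000, so code that simply cast a signed fixed-point value into the old `__s64` field would produce the wrong matrix for negative coefficients.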

									
include/drm-uapi/i915_drm.h
				@@ -102,6 +102,46 @@ enum drm_i915_gem_engine_class {

					I915_ENGINE_CLASS_INVALID	= -1

				};

				/**

				 * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915

				 *

				 */

				enum drm_i915_pmu_engine_sample {

					I915_SAMPLE_BUSY = 0,

					I915_SAMPLE_WAIT = 1,

					I915_SAMPLE_SEMA = 2

				};

				#define I915_PMU_SAMPLE_BITS (4)

				#define I915_PMU_SAMPLE_MASK (0xf)

				#define I915_PMU_SAMPLE_INSTANCE_BITS (8)

				#define I915_PMU_CLASS_SHIFT \

					(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)

				#define __I915_PMU_ENGINE(class, instance, sample) \

					((class) << I915_PMU_CLASS_SHIFT | \

					(instance) << I915_PMU_SAMPLE_BITS | \

					(sample))

				#define I915_PMU_ENGINE_BUSY(class, instance) \

					__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)

				#define I915_PMU_ENGINE_WAIT(class, instance) \

					__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)

				#define I915_PMU_ENGINE_SEMA(class, instance) \

					__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)

				#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))

				#define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)

				#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)

				#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(2)

				#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(3)

				#define I915_PMU_LAST I915_PMU_RC6_RESIDENCY

				/* Each region is a minimum of 16k, and there are at most 255 of them.

				 */

				#define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use

				@@ -278,6 +318,7 @@ typedef struct _drm_i915_sarea {

				#define DRM_I915_PERF_OPEN		0x36

				#define DRM_I915_PERF_ADD_CONFIG	0x37

				#define DRM_I915_PERF_REMOVE_CONFIG	0x38

				#define DRM_I915_QUERY			0x39

				#define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)

				#define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)

				@@ -335,6 +376,7 @@ typedef struct _drm_i915_sarea {

				#define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)

				#define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)

				#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)

				#define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)

				/* Allow drivers to submit batchbuffers directly to hardware, relying

				 * on the security mechanisms provided by hardware.

				@@ -1318,7 +1360,9 @@ struct drm_intel_overlay_attrs {

				 * active on a given plane.

				 */

				#define I915_SET_COLORKEY_NONE		(1<<0) /* disable color key matching */

				#define I915_SET_COLORKEY_NONE		(1<<0) /* Deprecated. Instead set

										* flags==0 to disable colorkeying.

										*/

				#define I915_SET_COLORKEY_DESTINATION	(1<<1)

				#define I915_SET_COLORKEY_SOURCE	(1<<2)

				struct drm_intel_sprite_colorkey {

				@@ -1564,15 +1608,115 @@ struct drm_i915_perf_oa_config {

					__u32 n_flex_regs;

					/*

					 * These fields are pointers to tuples of u32 values (register

					 * address, value). For example the expected length of the buffer

					 * pointed by mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).

					 * These fields are pointers to tuples of u32 values (register address,

					 * value). For example the expected length of the buffer pointed by

					 * mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).

					 */

					__u64 mux_regs_ptr;

					__u64 boolean_regs_ptr;

					__u64 flex_regs_ptr;

				};

				struct drm_i915_query_item {

					__u64 query_id;

				#define DRM_I915_QUERY_TOPOLOGY_INFO    1

					/*

					 * When set to zero by userspace, this is filled with the size of the

					 * data to be written at the data_ptr pointer. The kernel sets this

					 * value to a negative value to signal an error on a particular query

					 * item.

					 */

					__s32 length;

					/*

					 * Unused for now. Must be cleared to zero.

					 */

					__u32 flags;

					/*

					 * Data will be written at the location pointed by data_ptr when the

					 * value of length matches the length of the data to be written by the

					 * kernel.

					 */

					__u64 data_ptr;

				};

				struct drm_i915_query {

					__u32 num_items;

					/*

					 * Unused for now. Must be cleared to zero.

					 */

					__u32 flags;

					/*

					 * This points to an array of num_items drm_i915_query_item structures.

					 */

					__u64 items_ptr;

				};

				/*

				 * Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :

				 *

				 * data: contains the 3 pieces of information :

				 *

				 * - the slice mask with one bit per slice telling whether a slice is

				 *   available. The availability of slice X can be queried with the following

				 *   formula :

				 *

				 *           (data[X / 8] >> (X % 8)) & 1

				 *

				 * - the subslice mask for each slice with one bit per subslice telling

				 *   whether a subslice is available. The availability of subslice Y in slice

				 *   X can be queried with the following formula :

				 *

				 *           (data[subslice_offset +

				 *                 X * subslice_stride +

				 *                 Y / 8] >> (Y % 8)) & 1

				 *

				 * - the EU mask for each subslice in each slice with one bit per EU telling

				 *   whether an EU is available. The availability of EU Z in subslice Y in

				 *   slice X can be queried with the following formula :

				 *

				 *           (data[eu_offset +

				 *                 (X * max_subslices + Y) * eu_stride +

				 *                 Z / 8] >> (Z % 8)) & 1

				 */

				struct drm_i915_query_topology_info {

					/*

					 * Unused for now. Must be cleared to zero.

					 */

					__u16 flags;

					__u16 max_slices;

					__u16 max_subslices;

					__u16 max_eus_per_subslice;

					/*

					 * Offset in data[] at which the subslice masks are stored.

					 */

					__u16 subslice_offset;

					/*

					 * Stride at which each of the subslice masks for each slice are

					 * stored.

					 */

					__u16 subslice_stride;

					/*

					 * Offset in data[] at which the EU masks are stored.

					 */

					__u16 eu_offset;

					/*

					 * Stride at which each of the EU masks for each subslice are stored.

					 */

					__u16 eu_stride;

					__u8 data[];

				};

				#if defined(__cplusplus)

				}

				#endif
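
Reviewer note on `DRM_I915_QUERY_TOPOLOGY_INFO`: the comment block above gives three indexing formulas for the slice, subslice and EU masks packed into `data[]`. The sketch below writes them out as functions so the offsets and strides are easier to check. It is an illustration under assumptions, not code from the commit: `struct topo_view` is a hypothetical local mirror of the relevant `drm_i915_query_topology_info` fields, and it uses a `data` pointer instead of the flexible array member so it can be exercised without the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the fields the formulas need. */
struct topo_view {
	uint16_t max_slices;
	uint16_t max_subslices;
	uint16_t max_eus_per_subslice;
	uint16_t subslice_offset;
	uint16_t subslice_stride;
	uint16_t eu_offset;
	uint16_t eu_stride;
	const uint8_t *data;
};

/* (data[X / 8] >> (X % 8)) & 1 */
static int slice_available(const struct topo_view *t, unsigned x)
{
	assert(x < t->max_slices);
	return (t->data[x / 8] >> (x % 8)) & 1;
}

/* (data[subslice_offset + X * subslice_stride + Y / 8] >> (Y % 8)) & 1 */
static int subslice_available(const struct topo_view *t, unsigned x, unsigned y)
{
	assert(x < t->max_slices && y < t->max_subslices);
	return (t->data[t->subslice_offset + x * t->subslice_stride + y / 8]
		>> (y % 8)) & 1;
}

/* (data[eu_offset + (X * max_subslices + Y) * eu_stride + Z / 8] >> (Z % 8)) & 1 */
static int eu_available(const struct topo_view *t,
			unsigned x, unsigned y, unsigned z)
{
	assert(x < t->max_slices && y < t->max_subslices &&
	       z < t->max_eus_per_subslice);
	return (t->data[t->eu_offset +
			(x * t->max_subslices + y) * t->eu_stride +
			z / 8] >> (z % 8)) & 1;
}
```

Per the `drm_i915_query_item` comments, userspace would first issue the query with `length` set to zero to learn the required size, then repeat it with a buffer of that size before reading the masks this way.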

									
include/drm-uapi/tegra_drm.h (new file)
				@@ -0,0 +1,209 @@

				/*

				 * Copyright (c) 2012-2013, NVIDIA CORPORATION.  All rights reserved.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included in

				 * all copies or substantial portions of the Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR

				 * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,

				 * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR

				 * OTHER DEALINGS IN THE SOFTWARE.

				 */

				#ifndef _TEGRA_DRM_H_

				#define _TEGRA_DRM_H_

				#include "drm.h"

				#if defined(__cplusplus)

				extern "C" {

				#endif

				#define DRM_TEGRA_GEM_CREATE_TILED     (1 << 0)

				#define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)

				struct drm_tegra_gem_create {

					__u64 size;

					__u32 flags;

					__u32 handle;

				};

				struct drm_tegra_gem_mmap {

					__u32 handle;

					__u32 pad;

					__u64 offset;

				};

				struct drm_tegra_syncpt_read {

					__u32 id;

					__u32 value;

				};

				struct drm_tegra_syncpt_incr {

					__u32 id;

					__u32 pad;

				};

				struct drm_tegra_syncpt_wait {

					__u32 id;

					__u32 thresh;

					__u32 timeout;

					__u32 value;

				};

				#define DRM_TEGRA_NO_TIMEOUT	(0xffffffff)

				struct drm_tegra_open_channel {

					__u32 client;

					__u32 pad;

					__u64 context;

				};

				struct drm_tegra_close_channel {

					__u64 context;

				};

				struct drm_tegra_get_syncpt {

					__u64 context;

					__u32 index;

					__u32 id;

				};

				struct drm_tegra_get_syncpt_base {

					__u64 context;

					__u32 syncpt;

					__u32 id;

				};

				struct drm_tegra_syncpt {

					__u32 id;

					__u32 incrs;

				};

				struct drm_tegra_cmdbuf {

					__u32 handle;

					__u32 offset;

					__u32 words;

					__u32 pad;

				};

				struct drm_tegra_reloc {

					struct {

						__u32 handle;

						__u32 offset;

					} cmdbuf;

					struct {

						__u32 handle;

						__u32 offset;

					} target;

					__u32 shift;

					__u32 pad;

				};

				struct drm_tegra_waitchk {

					__u32 handle;

					__u32 offset;

					__u32 syncpt;

					__u32 thresh;

				};

				struct drm_tegra_submit {

					__u64 context;

					__u32 num_syncpts;

					__u32 num_cmdbufs;

					__u32 num_relocs;

					__u32 num_waitchks;

					__u32 waitchk_mask;

					__u32 timeout;

					__u64 syncpts;

					__u64 cmdbufs;

					__u64 relocs;

					__u64 waitchks;

					__u32 fence;		/* Return value */

					__u32 reserved[5];	/* future expansion */

				};

				#define DRM_TEGRA_GEM_TILING_MODE_PITCH 0

				#define DRM_TEGRA_GEM_TILING_MODE_TILED 1

				#define DRM_TEGRA_GEM_TILING_MODE_BLOCK 2

				struct drm_tegra_gem_set_tiling {

					/* input */

					__u32 handle;

					__u32 mode;

					__u32 value;

					__u32 pad;

				};

				struct drm_tegra_gem_get_tiling {

					/* input */

					__u32 handle;

					/* output */

					__u32 mode;

					__u32 value;

					__u32 pad;

				};

				#define DRM_TEGRA_GEM_BOTTOM_UP		(1 << 0)

				#define DRM_TEGRA_GEM_FLAGS		(DRM_TEGRA_GEM_BOTTOM_UP)

				struct drm_tegra_gem_set_flags {

					/* input */

					__u32 handle;

					/* output */

					__u32 flags;

				};

				struct drm_tegra_gem_get_flags {

					/* input */

					__u32 handle;

					/* output */

					__u32 flags;

				};

				#define DRM_TEGRA_GEM_CREATE		0x00

				#define DRM_TEGRA_GEM_MMAP		0x01

				#define DRM_TEGRA_SYNCPT_READ		0x02

				#define DRM_TEGRA_SYNCPT_INCR		0x03

				#define DRM_TEGRA_SYNCPT_WAIT		0x04

				#define DRM_TEGRA_OPEN_CHANNEL		0x05

				#define DRM_TEGRA_CLOSE_CHANNEL		0x06

				#define DRM_TEGRA_GET_SYNCPT		0x07

				#define DRM_TEGRA_SUBMIT		0x08

				#define DRM_TEGRA_GET_SYNCPT_BASE	0x09

				#define DRM_TEGRA_GEM_SET_TILING	0x0a

				#define DRM_TEGRA_GEM_GET_TILING	0x0b

				#define DRM_TEGRA_GEM_SET_FLAGS		0x0c

				#define DRM_TEGRA_GEM_GET_FLAGS		0x0d

				#define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create)

				#define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap)

				#define DRM_IOCTL_TEGRA_SYNCPT_READ DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_READ, struct drm_tegra_syncpt_read)

				#define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)

				#define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)

				#define DRM_IOCTL_TEGRA_OPEN_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_OPEN_CHANNEL, struct drm_tegra_open_channel)

				#define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel)

				#define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt)

				#define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit)

				#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base)

				#define DRM_IOCTL_TEGRA_GEM_SET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_TILING, struct drm_tegra_gem_set_tiling)

				#define DRM_IOCTL_TEGRA_GEM_GET_TILING DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_TILING, struct drm_tegra_gem_get_tiling)

				#define DRM_IOCTL_TEGRA_GEM_SET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_FLAGS, struct drm_tegra_gem_set_flags)

				#define DRM_IOCTL_TEGRA_GEM_GET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_FLAGS, struct drm_tegra_gem_get_flags)

				#if defined(__cplusplus)

				}

				#endif

				#endif

									
src/gallium/drivers/vc5/vc5_drm.h → include/drm-uapi/v3d_drm.h
				@@ -1,5 +1,5 @@

				/*

				 * Copyright © 2014-2017 Broadcom

				 * Copyright © 2014-2018 Broadcom

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				@@ -21,8 +21,8 @@

				 * IN THE SOFTWARE.

				 */

				#ifndef _VC5_DRM_H_

				#define _VC5_DRM_H_

				#ifndef _V3D_DRM_H_

				#define _V3D_DRM_H_

				#include "drm.h"

				@@ -30,30 +30,28 @@

				extern "C" {

				#endif

				#define DRM_VC5_SUBMIT_CL                         0x00

				#define DRM_VC5_WAIT_SEQNO                        0x01

				#define DRM_VC5_WAIT_BO                           0x02

				#define DRM_VC5_CREATE_BO                         0x03

				#define DRM_VC5_MMAP_BO                           0x04

				#define DRM_VC5_GET_PARAM                         0x05

				#define DRM_VC5_GET_BO_OFFSET                     0x06

				#define DRM_V3D_SUBMIT_CL                         0x00

				#define DRM_V3D_WAIT_BO                           0x01

				#define DRM_V3D_CREATE_BO                         0x02

				#define DRM_V3D_MMAP_BO                           0x03

				#define DRM_V3D_GET_PARAM                         0x04

				#define DRM_V3D_GET_BO_OFFSET                     0x05

				#define DRM_IOCTL_VC5_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_SUBMIT_CL, struct drm_vc5_submit_cl)

				#define DRM_IOCTL_VC5_WAIT_SEQNO          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_WAIT_SEQNO, struct drm_vc5_wait_seqno)

				#define DRM_IOCTL_VC5_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_WAIT_BO, struct drm_vc5_wait_bo)

				#define DRM_IOCTL_VC5_CREATE_BO           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_CREATE_BO, struct drm_vc5_create_bo)

				#define DRM_IOCTL_VC5_MMAP_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_MMAP_BO, struct drm_vc5_mmap_bo)

				#define DRM_IOCTL_VC5_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_GET_PARAM, struct drm_vc5_get_param)

				#define DRM_IOCTL_VC5_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_VC5_GET_BO_OFFSET, struct drm_vc5_get_bo_offset)

				#define DRM_IOCTL_V3D_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_SUBMIT_CL, struct drm_v3d_submit_cl)

				#define DRM_IOCTL_V3D_WAIT_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_WAIT_BO, struct drm_v3d_wait_bo)

				#define DRM_IOCTL_V3D_CREATE_BO           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_CREATE_BO, struct drm_v3d_create_bo)

				#define DRM_IOCTL_V3D_MMAP_BO             DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_MMAP_BO, struct drm_v3d_mmap_bo)

				#define DRM_IOCTL_V3D_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)

				#define DRM_IOCTL_V3D_GET_BO_OFFSET       DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_BO_OFFSET, struct drm_v3d_get_bo_offset)

				/**

				 * struct drm_vc5_submit_cl - ioctl argument for submitting commands to the 3D

				 * struct drm_v3d_submit_cl - ioctl argument for submitting commands to the 3D

				 * engine.

				 *

				 * This asks the kernel to have the GPU execute an optional binner

				 * command list, and a render command list.

				 */

				struct drm_vc5_submit_cl {

				struct drm_v3d_submit_cl {

					/* Pointer to the binner command list.

					 *

					 * This is the first set of commands executed, which runs the

				@@ -77,6 +75,13 @@ struct drm_vc5_submit_cl {

					 /** End address of the RCL (first byte after the RCL) */

					__u32 rcl_end;

					/** An optional sync object to wait on before starting the BCL. */

					__u32 in_sync_bcl;

					/** An optional sync object to wait on before starting the RCL. */

					__u32 in_sync_rcl;

					/** An optional sync object to place the completion fence in. */

					__u32 out_sync;

					/* Offset of the tile alloc memory

					 *

					 * This is optional on V3D 3.3 (where the CL can set the value) but

				@@ -84,62 +89,44 @@ struct drm_vc5_submit_cl {

					 */

					__u32 qma;

					 /** Size of the tile alloc memory. */

					/** Size of the tile alloc memory. */

					__u32 qms;

					 /** Offset of the tile state data array. */

					/** Offset of the tile state data array. */

					__u32 qts;

					/* Pointer to a u32 array of the BOs that are referenced by the job.

					 */

					__u64 bo_handles;

					/* Pointer to an array of chunks of extra submit CL information. (the

					 * chunk struct is not yet defined)

					 */

					__u64 chunks;

					/* Number of BO handles passed in (size is that times 4). */

					__u32 bo_handle_count;

					__u32 chunk_count;

					__u64 flags;

					/* Pad, must be zero-filled. */

					__u32 pad;

				};

				/**

				 * struct drm_vc5_wait_seqno - ioctl argument for waiting for

				 * DRM_VC5_SUBMIT_CL completion using its returned seqno.

				 *

				 * timeout_ns is the timeout in nanoseconds, where "0" means "don't

				 * block, just return the status."

				 */

				struct drm_vc5_wait_seqno {

					__u64 seqno;

					__u64 timeout_ns;

				};

				/**

				 * struct drm_vc5_wait_bo - ioctl argument for waiting for

				 * completion of the last DRM_VC5_SUBMIT_CL on a BO.

				 * struct drm_v3d_wait_bo - ioctl argument for waiting for

				 * completion of the last DRM_V3D_SUBMIT_CL on a BO.

				 *

				 * This is useful for cases where multiple processes might be

				 * rendering to a BO and you want to wait for all rendering to be

				 * completed.

				 */

				struct drm_vc5_wait_bo {

				struct drm_v3d_wait_bo {

					__u32 handle;

					__u32 pad;

					__u64 timeout_ns;

				};

				/**

				 * struct drm_vc5_create_bo - ioctl argument for creating VC5 BOs.

				 * struct drm_v3d_create_bo - ioctl argument for creating V3D BOs.

				 *

				 * There are currently no values for the flags argument, but it may be

				 * used in a future extension.

				 */

				struct drm_vc5_create_bo {

				struct drm_v3d_create_bo {

					__u32 size;

					__u32 flags;

					/** Returned GEM handle for the BO. */

				@@ -148,12 +135,15 @@ struct drm_vc5_create_bo {

					 * Returned offset for the BO in the V3D address space.  This offset

					 * is private to the DRM fd and is valid for the lifetime of the GEM

					 * handle.

					 *

					 * This offset value will always be nonzero, since various HW

					 * units treat 0 specially.

					 */

					__u32 offset;

				};

				/**

				 * struct drm_vc5_mmap_bo - ioctl argument for mapping VC5 BOs.

				 * struct drm_v3d_mmap_bo - ioctl argument for mapping V3D BOs.

				 *

				 * This doesn't actually perform an mmap.  Instead, it returns the

				 * offset you need to use in an mmap on the DRM device node.  This

				@@ -163,7 +153,7 @@ struct drm_vc5_create_bo {

				 * There are currently no values for the flags argument, but it may be

				 * used in a future extension.

				 */

				struct drm_vc5_mmap_bo {

				struct drm_v3d_mmap_bo {

					/** Handle for the object being mapped. */

					__u32 handle;

					__u32 flags;

				@@ -171,17 +161,17 @@ struct drm_vc5_mmap_bo {

					__u64 offset;

				};

				enum drm_vc5_param {

				        DRM_VC5_PARAM_V3D_UIFCFG,

				        DRM_VC5_PARAM_V3D_HUB_IDENT1,

				        DRM_VC5_PARAM_V3D_HUB_IDENT2,

				        DRM_VC5_PARAM_V3D_HUB_IDENT3,

				        DRM_VC5_PARAM_V3D_CORE0_IDENT0,

				        DRM_VC5_PARAM_V3D_CORE0_IDENT1,

				        DRM_VC5_PARAM_V3D_CORE0_IDENT2,

				enum drm_v3d_param {

					DRM_V3D_PARAM_V3D_UIFCFG,

					DRM_V3D_PARAM_V3D_HUB_IDENT1,

					DRM_V3D_PARAM_V3D_HUB_IDENT2,

					DRM_V3D_PARAM_V3D_HUB_IDENT3,

					DRM_V3D_PARAM_V3D_CORE0_IDENT0,

					DRM_V3D_PARAM_V3D_CORE0_IDENT1,

					DRM_V3D_PARAM_V3D_CORE0_IDENT2,

				};

				struct drm_vc5_get_param {

				struct drm_v3d_get_param {

					__u32 param;

					__u32 pad;

					__u64 value;

				@@ -189,10 +179,10 @@ struct drm_vc5_get_param {

				/**

				 * Returns the offset for the BO in the V3D address space for this DRM fd.

				 * This is the same value returned by drm_vc5_create_bo, if that was called

				 * This is the same value returned by drm_v3d_create_bo, if that was called

				 * from this DRM fd.

				 */

				struct drm_vc5_get_bo_offset {

				struct drm_v3d_get_bo_offset {

					__u32 handle;

					__u32 offset;

				};

				@@ -201,4 +191,4 @@ struct drm_vc5_get_bo_offset {

				}

				#endif

				#endif /* _VC5_DRM_H_ */

				#endif /* _V3D_DRM_H_ */

									
										83

include/drm-uapi/vc4_drm.h
									
				@@ -42,6 +42,9 @@ extern "C" {

				#define DRM_VC4_GET_TILING                        0x09

				#define DRM_VC4_LABEL_BO                          0x0a

				#define DRM_VC4_GEM_MADVISE                       0x0b

				#define DRM_VC4_PERFMON_CREATE                    0x0c

				#define DRM_VC4_PERFMON_DESTROY                   0x0d

				#define DRM_VC4_PERFMON_GET_VALUES                0x0e

				#define DRM_IOCTL_VC4_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl)

				#define DRM_IOCTL_VC4_WAIT_SEQNO          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno)

				@@ -55,6 +58,9 @@ extern "C" {

				#define DRM_IOCTL_VC4_GET_TILING          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling)

				#define DRM_IOCTL_VC4_LABEL_BO            DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo)

				#define DRM_IOCTL_VC4_GEM_MADVISE         DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GEM_MADVISE, struct drm_vc4_gem_madvise)

				#define DRM_IOCTL_VC4_PERFMON_CREATE      DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_CREATE, struct drm_vc4_perfmon_create)

				#define DRM_IOCTL_VC4_PERFMON_DESTROY     DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_DESTROY, struct drm_vc4_perfmon_destroy)

				#define DRM_IOCTL_VC4_PERFMON_GET_VALUES  DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_PERFMON_GET_VALUES, struct drm_vc4_perfmon_get_values)

				struct drm_vc4_submit_rcl_surface {

					__u32 hindex; /* Handle index, or ~0 if not present. */

				@@ -173,6 +179,22 @@ struct drm_vc4_submit_cl {

					 * wait ioctl).

					 */

					__u64 seqno;

					/* ID of the perfmon to attach to this job. 0 means no perfmon. */

					__u32 perfmonid;

					/* Syncobj handle to wait on. If set, processing of this render job

					 * will not start until the syncobj is signaled. 0 means ignore.

					 */

					__u32 in_sync;

					/* Syncobj handle to export fence to. If set, the fence in the syncobj

					 * will be replaced with a fence that signals upon completion of this

					 * render job. 0 means ignore.

					 */

					__u32 out_sync;

					__u32 pad2;

				};

				/**

				@@ -308,6 +330,7 @@ struct drm_vc4_get_hang_state {

				#define DRM_VC4_PARAM_SUPPORTS_THREADED_FS	5

				#define DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER	6

				#define DRM_VC4_PARAM_SUPPORTS_MADVISE		7

				#define DRM_VC4_PARAM_SUPPORTS_PERFMON		8

				struct drm_vc4_get_param {

					__u32 param;

				@@ -352,6 +375,66 @@ struct drm_vc4_gem_madvise {

					__u32 pad;

				};

				enum {

					VC4_PERFCNT_FEP_VALID_PRIMS_NO_RENDER,

					VC4_PERFCNT_FEP_VALID_PRIMS_RENDER,

					VC4_PERFCNT_FEP_CLIPPED_QUADS,

					VC4_PERFCNT_FEP_VALID_QUADS,

					VC4_PERFCNT_TLB_QUADS_NOT_PASSING_STENCIL,

					VC4_PERFCNT_TLB_QUADS_NOT_PASSING_Z_AND_STENCIL,

					VC4_PERFCNT_TLB_QUADS_PASSING_Z_AND_STENCIL,

					VC4_PERFCNT_TLB_QUADS_ZERO_COVERAGE,

					VC4_PERFCNT_TLB_QUADS_NON_ZERO_COVERAGE,

					VC4_PERFCNT_TLB_QUADS_WRITTEN_TO_COLOR_BUF,

					VC4_PERFCNT_PLB_PRIMS_OUTSIDE_VIEWPORT,

					VC4_PERFCNT_PLB_PRIMS_NEED_CLIPPING,

					VC4_PERFCNT_PSE_PRIMS_REVERSED,

					VC4_PERFCNT_QPU_TOTAL_IDLE_CYCLES,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_VERTEX_COORD_SHADING,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_FRAGMENT_SHADING,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_EXEC_VALID_INST,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_TMUS,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_SCOREBOARD,

					VC4_PERFCNT_QPU_TOTAL_CLK_CYCLES_WAITING_VARYINGS,

					VC4_PERFCNT_QPU_TOTAL_INST_CACHE_HIT,

					VC4_PERFCNT_QPU_TOTAL_INST_CACHE_MISS,

					VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_HIT,

					VC4_PERFCNT_QPU_TOTAL_UNIFORM_CACHE_MISS,

					VC4_PERFCNT_TMU_TOTAL_TEXT_QUADS_PROCESSED,

					VC4_PERFCNT_TMU_TOTAL_TEXT_CACHE_MISS,

					VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VDW_STALLED,

					VC4_PERFCNT_VPM_TOTAL_CLK_CYCLES_VCD_STALLED,

					VC4_PERFCNT_L2C_TOTAL_L2_CACHE_HIT,

					VC4_PERFCNT_L2C_TOTAL_L2_CACHE_MISS,

					VC4_PERFCNT_NUM_EVENTS,

				};

				#define DRM_VC4_MAX_PERF_COUNTERS	16

				struct drm_vc4_perfmon_create {

					__u32 id;

					__u32 ncounters;

					__u8 events[DRM_VC4_MAX_PERF_COUNTERS];

				};

				struct drm_vc4_perfmon_destroy {

					__u32 id;

				};

				/*

				 * Returns the values of the performance counters tracked by this

				 * perfmon (as an array of ncounters u64 values).

				 *

				 * No implicit synchronization is performed, so the user has to

				 * guarantee that any jobs using this perfmon have already been

				 * completed  (probably by blocking on the seqno returned by the

				 * last exec that used the perfmon).

				 */

				struct drm_vc4_perfmon_get_values {

					__u32 id;

					__u64 values_ptr;

				};

				#if defined(__cplusplus)

				}

				#endif
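The perfmon structures added to vc4_drm.h above could be filled in roughly as follows. This is a minimal sketch, not driver code: the struct and the DRM_VC4_MAX_PERF_COUNTERS limit are copied locally from the diff so the example builds without kernel headers, `uint32_t`/`uint8_t` stand in for `__u32`/`__u8`, and the helper name `fill_perfmon_create` is hypothetical. The assumption that `id` is an output field written by the kernel is not stated in the diff. The ioctl call itself (DRM_IOCTL_VC4_PERFMON_CREATE) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of definitions from the diff, for illustration only. */
#define DRM_VC4_MAX_PERF_COUNTERS 16

struct drm_vc4_perfmon_create {
	uint32_t id;        /* assumed to be written by the kernel on success */
	uint32_t ncounters; /* number of valid entries in events[] */
	uint8_t events[DRM_VC4_MAX_PERF_COUNTERS];
};

/*
 * Hypothetical helper: validate the event list and fill the request.
 * Returns 0 on success, -1 if the list is empty or longer than
 * DRM_VC4_MAX_PERF_COUNTERS.
 */
static int fill_perfmon_create(struct drm_vc4_perfmon_create *req,
                               const uint8_t *events, size_t n)
{
	assert(req != NULL && events != NULL);

	if (n == 0 || n > DRM_VC4_MAX_PERF_COUNTERS)
		return -1;

	memset(req, 0, sizeof(*req));
	req->ncounters = (uint32_t)n;
	memcpy(req->events, events, n);
	return 0;
}
```

A caller would pass VC4_PERFCNT_* values from the enum in the diff as the event list, then hand the filled struct to the ioctl. Per the comment on drm_vc4_perfmon_get_values, reading the counters back requires the caller to wait for the jobs using the perfmon first.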

									
										8

include/meson.build
									
				@@ -22,6 +22,7 @@ inc_drm_uapi = include_directories('drm-uapi')

				inc_vulkan = include_directories('vulkan')

				inc_d3d9 = include_directories('D3D9')

				inc_gl_internal = include_directories('GL/internal')

				inc_haikugl = include_directories('HaikuGL')

				if with_gles1

				  install_headers(

				@@ -80,6 +81,13 @@ if with_gallium_st_nine

				  )

				endif

				if with_platform_haiku

				  install_headers(

				    'HaikuGL/GLRenderer.h', 'HaikuGL/GLView.h', 'HaikuGL/OpenGLKit.h',

				    subdir : 'opengl',

				  )

				endif

				# Only install the headers if we are building a stand alone implementation and

				# not an ICD enabled implementation

				if with_gallium_opencl and not with_opencl_icd

									
										19

include/pci_ids/i965_pci_ids.h
									
				@@ -156,6 +156,7 @@ CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")

				CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")

				CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")

				CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")

				CHIPSET(0x591C, kbl_gt2, "Intel(R) Kaby Lake GT2")

				CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")

				CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")

				CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")

				@@ -165,16 +166,16 @@ CHIPSET(0x5927, kbl_gt3, "Intel(R) Iris Plus Graphics 650 (Kaby Lake GT3e)")

				CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x3184, glk,     "Intel(R) UHD Graphics 605 (Geminilake)")

				CHIPSET(0x3185, glk_2x6, "Intel(R) UHD Graphics 600 (Geminilake 2x6)")

				CHIPSET(0x3E90, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")

				CHIPSET(0x3E93, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")

				CHIPSET(0x3E90, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")

				CHIPSET(0x3E93, cfl_gt1, "Intel(R) UHD Graphics 610 (Coffeelake 2x6 GT1)")

				CHIPSET(0x3E99, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")

				CHIPSET(0x3EA1, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")

				CHIPSET(0x3EA4, cfl_gt1, "Intel(R) HD Graphics (Coffeelake 2x6 GT1)")

				CHIPSET(0x3E91, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E92, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E91, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E92, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E96, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E9A, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E9B, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E9B, cfl_gt2, "Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)")

				CHIPSET(0x3E94, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3EA0, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				CHIPSET(0x3EA3, cfl_gt2, "Intel(R) HD Graphics (Coffeelake 3x8 GT2)")

				@@ -196,3 +197,11 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")

				CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")

				CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")

				CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")

				CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")

				CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")

				CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")

				CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")

				CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")

				CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")

				CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")

				CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")

									
										16

include/pci_ids/radeonsi_pci_ids.h
									
				@@ -216,6 +216,9 @@ CHIPSET(0x6995, POLARIS12)

				CHIPSET(0x6997, POLARIS12)

				CHIPSET(0x699F, POLARIS12)

				CHIPSET(0x694C, VEGAM)

				CHIPSET(0x694E, VEGAM)

				CHIPSET(0x6860, VEGA10)

				CHIPSET(0x6861, VEGA10)

				CHIPSET(0x6862, VEGA10)

				@@ -226,4 +229,17 @@ CHIPSET(0x6868, VEGA10)

				CHIPSET(0x687F, VEGA10)

				CHIPSET(0x686C, VEGA10)

				CHIPSET(0x69A0, VEGA12)

				CHIPSET(0x69A1, VEGA12)

				CHIPSET(0x69A2, VEGA12)

				CHIPSET(0x69A3, VEGA12)

				CHIPSET(0x69AF, VEGA12)

				CHIPSET(0x66A0, VEGA20)

				CHIPSET(0x66A1, VEGA20)

				CHIPSET(0x66A2, VEGA20)

				CHIPSET(0x66A3, VEGA20)

				CHIPSET(0x66A7, VEGA20)

				CHIPSET(0x66AF, VEGA20)

				CHIPSET(0x15DD, RAVEN)

									
										65

include/vulkan/vk_icd.h
									
				@@ -24,13 +24,34 @@

				#define VKICD_H

				#include "vulkan.h"

				#include <stdbool.h>

				/*

				 * Loader-ICD version negotiation API

				 */

				#define CURRENT_LOADER_ICD_INTERFACE_VERSION 3

				// Loader-ICD version negotiation API.  Versions add the following features:

				//   Version 0 - Initial.  Doesn't support vk_icdGetInstanceProcAddr

				//               or vk_icdNegotiateLoaderICDInterfaceVersion.

				//   Version 1 - Add support for vk_icdGetInstanceProcAddr.

				//   Version 2 - Add Loader/ICD Interface version negotiation

				//               via vk_icdNegotiateLoaderICDInterfaceVersion.

				//   Version 3 - Add ICD creation/destruction of KHR_surface objects.

				//   Version 4 - Add unknown physical device extension qyering via

				//               vk_icdGetPhysicalDeviceProcAddr.

				//   Version 5 - Tells ICDs that the loader is now paying attention to the

				//               application version of Vulkan passed into the ApplicationInfo

				//               structure during vkCreateInstance.  This will tell the ICD

				//               that if the loader is older, it should automatically fail a

				//               call for any API version > 1.0.  Otherwise, the loader will

				//               manually determine if it can support the expected version.

				#define CURRENT_LOADER_ICD_INTERFACE_VERSION 5

				#define MIN_SUPPORTED_LOADER_ICD_INTERFACE_VERSION 0

				typedef VkResult (VKAPI_PTR *PFN_vkNegotiateLoaderICDInterfaceVersion)(uint32_t *pVersion);

				#define MIN_PHYS_DEV_EXTENSION_ICD_INTERFACE_VERSION 4

				typedef VkResult(VKAPI_PTR *PFN_vkNegotiateLoaderICDInterfaceVersion)(uint32_t *pVersion);

				// This is defined in vk_layer.h which will be found by the loader, but if an ICD is building against this

				// file directly, it won't be found.

				#ifndef PFN_GetPhysicalDeviceProcAddr

				typedef PFN_vkVoidFunction(VKAPI_PTR *PFN_GetPhysicalDeviceProcAddr)(VkInstance instance, const char *pName);

				#endif

				/*

				 * The ICD must reserve space for a pointer for the loader's dispatch

				 * table, at the start of <each object>.

				@@ -64,6 +85,9 @@ typedef enum {

				    VK_ICD_WSI_PLATFORM_WIN32,

				    VK_ICD_WSI_PLATFORM_XCB,

				    VK_ICD_WSI_PLATFORM_XLIB,

				    VK_ICD_WSI_PLATFORM_ANDROID,

				    VK_ICD_WSI_PLATFORM_MACOS,

				    VK_ICD_WSI_PLATFORM_IOS,

				    VK_ICD_WSI_PLATFORM_DISPLAY

				} VkIcdWsiPlatform;

				@@ -77,7 +101,7 @@ typedef struct {

				    MirConnection *connection;

				    MirSurface *mirSurface;

				} VkIcdSurfaceMir;

				#endif // VK_USE_PLATFORM_MIR_KHR

				#endif  // VK_USE_PLATFORM_MIR_KHR

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				typedef struct {

				@@ -85,7 +109,7 @@ typedef struct {

				    struct wl_display *display;

				    struct wl_surface *surface;

				} VkIcdSurfaceWayland;

				#endif // VK_USE_PLATFORM_WAYLAND_KHR

				#endif  // VK_USE_PLATFORM_WAYLAND_KHR

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				typedef struct {

				@@ -93,7 +117,7 @@ typedef struct {

				    HINSTANCE hinstance;

				    HWND hwnd;

				} VkIcdSurfaceWin32;

				#endif // VK_USE_PLATFORM_WIN32_KHR

				#endif  // VK_USE_PLATFORM_WIN32_KHR

				#ifdef VK_USE_PLATFORM_XCB_KHR

				typedef struct {

				@@ -101,7 +125,7 @@ typedef struct {

				    xcb_connection_t *connection;

				    xcb_window_t window;

				} VkIcdSurfaceXcb;

				#endif // VK_USE_PLATFORM_XCB_KHR

				#endif  // VK_USE_PLATFORM_XCB_KHR

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				typedef struct {

				@@ -109,13 +133,28 @@ typedef struct {

				    Display *dpy;

				    Window window;

				} VkIcdSurfaceXlib;

				#endif // VK_USE_PLATFORM_XLIB_KHR

				#endif  // VK_USE_PLATFORM_XLIB_KHR

				#ifdef VK_USE_PLATFORM_ANDROID_KHR

				typedef struct {

				    ANativeWindow* window;

				    VkIcdSurfaceBase base;

				    struct ANativeWindow *window;

				} VkIcdSurfaceAndroid;

				#endif //VK_USE_PLATFORM_ANDROID_KHR

				#endif  // VK_USE_PLATFORM_ANDROID_KHR

				#ifdef VK_USE_PLATFORM_MACOS_MVK

				typedef struct {

				    VkIcdSurfaceBase base;

				    const void *pView;

				} VkIcdSurfaceMacOS;

				#endif  // VK_USE_PLATFORM_MACOS_MVK

				#ifdef VK_USE_PLATFORM_IOS_MVK

				typedef struct {

				    VkIcdSurfaceBase base;

				    const void *pView;

				} VkIcdSurfaceIOS;

				#endif  // VK_USE_PLATFORM_IOS_MVK

				typedef struct {

				    VkIcdSurfaceBase base;

				@@ -128,4 +167,4 @@ typedef struct {

				    VkExtent2D imageExtent;

				} VkIcdSurfaceDisplay;

				#endif // VKICD_H

				#endif  // VKICD_H
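The version comments in vk_icd.h above describe a negotiation between the loader and the ICD. A minimal sketch of the ICD side follows, assuming the usual convention that the loader passes in the highest interface version it supports and the ICD lowers it to its own maximum. The result type is a local stand-in so the block compiles without vulkan.h, and the names `negotiate_interface_version`, `ICD_MIN_INTERFACE_VERSION`, and `ICD_MAX_INTERFACE_VERSION` are hypothetical, as is the supported range of 2 to 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for VkResult values; a real ICD would use vulkan.h. */
typedef enum {
	SKETCH_SUCCESS = 0,
	SKETCH_ERROR_INCOMPATIBLE_DRIVER = -9
} sketch_result;

/* Hypothetical range of interface versions this example ICD supports. */
#define ICD_MIN_INTERFACE_VERSION 2u
#define ICD_MAX_INTERFACE_VERSION 5u

/*
 * On entry, *version holds the loader's highest supported version.
 * On success, *version holds the version both sides will use.
 */
static sketch_result negotiate_interface_version(uint32_t *version)
{
	assert(ICD_MIN_INTERFACE_VERSION <= ICD_MAX_INTERFACE_VERSION);

	if (version == NULL || *version < ICD_MIN_INTERFACE_VERSION)
		return SKETCH_ERROR_INCOMPATIBLE_DRIVER;

	/* Clamp down to what this ICD implements. */
	if (*version > ICD_MAX_INTERFACE_VERSION)
		*version = ICD_MAX_INTERFACE_VERSION;

	return SKETCH_SUCCESS;
}
```

In this sketch a loader offering version 7 ends up at 5, a loader offering 3 stays at 3, and a loader offering 1 is rejected.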

									
										28

include/vulkan/vk_platform.h
									
				@@ -89,32 +89,4 @@ extern "C"

				} // extern "C"

				#endif // __cplusplus

				// Platform-specific headers required by platform window system extensions.

				// These are enabled prior to #including "vulkan.h". The same enable then

				// controls inclusion of the extension interfaces in vulkan.h.

				#ifdef VK_USE_PLATFORM_ANDROID_KHR

				#include <android/native_window.h>

				#endif

				#ifdef VK_USE_PLATFORM_MIR_KHR

				#include <mir_toolkit/client_types.h>

				#endif

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				#include <wayland-client.h>

				#endif

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				#include <windows.h>

				#endif

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				#include <X11/Xlib.h>

				#endif

				#ifdef VK_USE_PLATFORM_XCB_KHR

				#include <xcb/xcb.h>

				#endif

				#endif

6969

include/vulkan/vulkan.h

File diff suppressed because it is too large

									
										126

include/vulkan/vulkan_android.h
									
										Normal file
									
				@@ -0,0 +1,126 @@

				#ifndef VULKAN_ANDROID_H_

				#define VULKAN_ANDROID_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_android_surface 1

				struct ANativeWindow;

				#define VK_KHR_ANDROID_SURFACE_SPEC_VERSION 6

				#define VK_KHR_ANDROID_SURFACE_EXTENSION_NAME "VK_KHR_android_surface"

				typedef VkFlags VkAndroidSurfaceCreateFlagsKHR;

				typedef struct VkAndroidSurfaceCreateInfoKHR {

				    VkStructureType                   sType;

				    const void*                       pNext;

				    VkAndroidSurfaceCreateFlagsKHR    flags;

				    struct ANativeWindow*             window;

				} VkAndroidSurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateAndroidSurfaceKHR)(VkInstance instance, const VkAndroidSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateAndroidSurfaceKHR(

				    VkInstance                                  instance,

				    const VkAndroidSurfaceCreateInfoKHR*        pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				#endif

				#define VK_ANDROID_external_memory_android_hardware_buffer 1

				struct AHardwareBuffer;

				#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_SPEC_VERSION 3

				#define VK_ANDROID_EXTERNAL_MEMORY_ANDROID_HARDWARE_BUFFER_EXTENSION_NAME "VK_ANDROID_external_memory_android_hardware_buffer"

				typedef struct VkAndroidHardwareBufferUsageANDROID {

				    VkStructureType    sType;

				    void*              pNext;

				    uint64_t           androidHardwareBufferUsage;

				} VkAndroidHardwareBufferUsageANDROID;

				typedef struct VkAndroidHardwareBufferPropertiesANDROID {

				    VkStructureType    sType;

				    void*              pNext;

				    VkDeviceSize       allocationSize;

				    uint32_t           memoryTypeBits;

				} VkAndroidHardwareBufferPropertiesANDROID;

				typedef struct VkAndroidHardwareBufferFormatPropertiesANDROID {

				    VkStructureType                  sType;

				    void*                            pNext;

				    VkFormat                         format;

				    uint64_t                         externalFormat;

				    VkFormatFeatureFlags             formatFeatures;

				    VkComponentMapping               samplerYcbcrConversionComponents;

				    VkSamplerYcbcrModelConversion    suggestedYcbcrModel;

				    VkSamplerYcbcrRange              suggestedYcbcrRange;

				    VkChromaLocation                 suggestedXChromaOffset;

				    VkChromaLocation                 suggestedYChromaOffset;

				} VkAndroidHardwareBufferFormatPropertiesANDROID;

				typedef struct VkImportAndroidHardwareBufferInfoANDROID {

				    VkStructureType            sType;

				    const void*                pNext;

				    struct AHardwareBuffer*    buffer;

				} VkImportAndroidHardwareBufferInfoANDROID;

				typedef struct VkMemoryGetAndroidHardwareBufferInfoANDROID {

				    VkStructureType    sType;

				    const void*        pNext;

				    VkDeviceMemory     memory;

				} VkMemoryGetAndroidHardwareBufferInfoANDROID;

				typedef struct VkExternalFormatANDROID {

				    VkStructureType    sType;

				    void*              pNext;

				    uint64_t           externalFormat;

				} VkExternalFormatANDROID;

				typedef VkResult (VKAPI_PTR *PFN_vkGetAndroidHardwareBufferPropertiesANDROID)(VkDevice device, const struct AHardwareBuffer* buffer, VkAndroidHardwareBufferPropertiesANDROID* pProperties);

				typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryAndroidHardwareBufferANDROID)(VkDevice device, const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo, struct AHardwareBuffer** pBuffer);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkGetAndroidHardwareBufferPropertiesANDROID(

				    VkDevice                                    device,

				    const struct AHardwareBuffer*               buffer,

				    VkAndroidHardwareBufferPropertiesANDROID*   pProperties);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryAndroidHardwareBufferANDROID(

				    VkDevice                                    device,

				    const VkMemoryGetAndroidHardwareBufferInfoANDROID* pInfo,

				    struct AHardwareBuffer**                    pBuffer);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

7767

include/vulkan/vulkan_core.h Normal file

File diff suppressed because it is too large

									
										58

include/vulkan/vulkan_ios.h
									
										Normal file
									
				@@ -0,0 +1,58 @@

				#ifndef VULKAN_IOS_H_

				#define VULKAN_IOS_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_MVK_ios_surface 1

				#define VK_MVK_IOS_SURFACE_SPEC_VERSION   2

				#define VK_MVK_IOS_SURFACE_EXTENSION_NAME "VK_MVK_ios_surface"

				typedef VkFlags VkIOSSurfaceCreateFlagsMVK;

				typedef struct VkIOSSurfaceCreateInfoMVK {

				    VkStructureType               sType;

				    const void*                   pNext;

				    VkIOSSurfaceCreateFlagsMVK    flags;

				    const void*                   pView;

				} VkIOSSurfaceCreateInfoMVK;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateIOSSurfaceMVK)(VkInstance instance, const VkIOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateIOSSurfaceMVK(

				    VkInstance                                  instance,

				    const VkIOSSurfaceCreateInfoMVK*            pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										58

include/vulkan/vulkan_macos.h
									
										Normal file
									
				@@ -0,0 +1,58 @@

				#ifndef VULKAN_MACOS_H_

				#define VULKAN_MACOS_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_MVK_macos_surface 1

				#define VK_MVK_MACOS_SURFACE_SPEC_VERSION 2

				#define VK_MVK_MACOS_SURFACE_EXTENSION_NAME "VK_MVK_macos_surface"

				typedef VkFlags VkMacOSSurfaceCreateFlagsMVK;

				typedef struct VkMacOSSurfaceCreateInfoMVK {

				    VkStructureType                 sType;

				    const void*                     pNext;

				    VkMacOSSurfaceCreateFlagsMVK    flags;

				    const void*                     pView;

				} VkMacOSSurfaceCreateInfoMVK;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateMacOSSurfaceMVK)(VkInstance instance, const VkMacOSSurfaceCreateInfoMVK* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateMacOSSurfaceMVK(

				    VkInstance                                  instance,

				    const VkMacOSSurfaceCreateInfoMVK*          pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										65

include/vulkan/vulkan_mir.h
									
										Normal file
									
				@@ -0,0 +1,65 @@

				#ifndef VULKAN_MIR_H_

				#define VULKAN_MIR_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_mir_surface 1

				#define VK_KHR_MIR_SURFACE_SPEC_VERSION   4

				#define VK_KHR_MIR_SURFACE_EXTENSION_NAME "VK_KHR_mir_surface"

				typedef VkFlags VkMirSurfaceCreateFlagsKHR;

				typedef struct VkMirSurfaceCreateInfoKHR {

				    VkStructureType               sType;

				    const void*                   pNext;

				    VkMirSurfaceCreateFlagsKHR    flags;

				    MirConnection*                connection;

				    MirSurface*                   mirSurface;

				} VkMirSurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateMirSurfaceKHR)(VkInstance instance, const VkMirSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceMirPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, MirConnection* connection);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateMirSurfaceKHR(

				    VkInstance                                  instance,

				    const VkMirSurfaceCreateInfoKHR*            pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceMirPresentationSupportKHR(

				    VkPhysicalDevice                            physicalDevice,

				    uint32_t                                    queueFamilyIndex,

				    MirConnection*                              connection);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										58

include/vulkan/vulkan_vi.h
									
										Normal file
									
				@@ -0,0 +1,58 @@

				#ifndef VULKAN_VI_H_

				#define VULKAN_VI_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_NN_vi_surface 1

				#define VK_NN_VI_SURFACE_SPEC_VERSION     1

				#define VK_NN_VI_SURFACE_EXTENSION_NAME   "VK_NN_vi_surface"

				typedef VkFlags VkViSurfaceCreateFlagsNN;

				typedef struct VkViSurfaceCreateInfoNN {

				    VkStructureType             sType;

				    const void*                 pNext;

				    VkViSurfaceCreateFlagsNN    flags;

				    void*                       window;

				} VkViSurfaceCreateInfoNN;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateViSurfaceNN)(VkInstance instance, const VkViSurfaceCreateInfoNN* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateViSurfaceNN(

				    VkInstance                                  instance,

				    const VkViSurfaceCreateInfoNN*              pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										65

include/vulkan/vulkan_wayland.h
									
										Normal file
									
				@@ -0,0 +1,65 @@

				#ifndef VULKAN_WAYLAND_H_

				#define VULKAN_WAYLAND_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_wayland_surface 1

				#define VK_KHR_WAYLAND_SURFACE_SPEC_VERSION 6

				#define VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME "VK_KHR_wayland_surface"

				typedef VkFlags VkWaylandSurfaceCreateFlagsKHR;

				typedef struct VkWaylandSurfaceCreateInfoKHR {

				    VkStructureType                   sType;

				    const void*                       pNext;

				    VkWaylandSurfaceCreateFlagsKHR    flags;

				    struct wl_display*                display;

				    struct wl_surface*                surface;

				} VkWaylandSurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateWaylandSurfaceKHR)(VkInstance instance, const VkWaylandSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWaylandPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, struct wl_display* display);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateWaylandSurfaceKHR(

				    VkInstance                                  instance,

				    const VkWaylandSurfaceCreateInfoKHR*        pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWaylandPresentationSupportKHR(

				    VkPhysicalDevice                            physicalDevice,

				    uint32_t                                    queueFamilyIndex,

				    struct wl_display*                          display);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										276

include/vulkan/vulkan_win32.h
									
										Normal file
									
												
				@@ -0,0 +1,276 @@

				#ifndef VULKAN_WIN32_H_

				#define VULKAN_WIN32_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_win32_surface 1

				#define VK_KHR_WIN32_SURFACE_SPEC_VERSION 6

				#define VK_KHR_WIN32_SURFACE_EXTENSION_NAME "VK_KHR_win32_surface"

				typedef VkFlags VkWin32SurfaceCreateFlagsKHR;

				typedef struct VkWin32SurfaceCreateInfoKHR {

				    VkStructureType                 sType;

				    const void*                     pNext;

				    VkWin32SurfaceCreateFlagsKHR    flags;

				    HINSTANCE                       hinstance;

				    HWND                            hwnd;

				} VkWin32SurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateWin32SurfaceKHR)(VkInstance instance, const VkWin32SurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceWin32PresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateWin32SurfaceKHR(

				    VkInstance                                  instance,

				    const VkWin32SurfaceCreateInfoKHR*          pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceWin32PresentationSupportKHR(

				    VkPhysicalDevice                            physicalDevice,

				    uint32_t                                    queueFamilyIndex);

				#endif

				#define VK_KHR_external_memory_win32 1

				#define VK_KHR_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1

				#define VK_KHR_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_KHR_external_memory_win32"

				typedef struct VkImportMemoryWin32HandleInfoKHR {

				    VkStructureType                       sType;

				    const void*                           pNext;

				    VkExternalMemoryHandleTypeFlagBits    handleType;

				    HANDLE                                handle;

				    LPCWSTR                               name;

				} VkImportMemoryWin32HandleInfoKHR;

				typedef struct VkExportMemoryWin32HandleInfoKHR {

				    VkStructureType               sType;

				    const void*                   pNext;

				    const SECURITY_ATTRIBUTES*    pAttributes;

				    DWORD                         dwAccess;

				    LPCWSTR                       name;

				} VkExportMemoryWin32HandleInfoKHR;

				typedef struct VkMemoryWin32HandlePropertiesKHR {

				    VkStructureType    sType;

				    void*              pNext;

				    uint32_t           memoryTypeBits;

				} VkMemoryWin32HandlePropertiesKHR;

				typedef struct VkMemoryGetWin32HandleInfoKHR {

				    VkStructureType                       sType;

				    const void*                           pNext;

				    VkDeviceMemory                        memory;

				    VkExternalMemoryHandleTypeFlagBits    handleType;

				} VkMemoryGetWin32HandleInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleKHR)(VkDevice device, const VkMemoryGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);

				typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandlePropertiesKHR)(VkDevice device, VkExternalMemoryHandleTypeFlagBits handleType, HANDLE handle, VkMemoryWin32HandlePropertiesKHR* pMemoryWin32HandleProperties);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleKHR(

				    VkDevice                                    device,

				    const VkMemoryGetWin32HandleInfoKHR*        pGetWin32HandleInfo,

				    HANDLE*                                     pHandle);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandlePropertiesKHR(

				    VkDevice                                    device,

				    VkExternalMemoryHandleTypeFlagBits          handleType,

				    HANDLE                                      handle,

				    VkMemoryWin32HandlePropertiesKHR*           pMemoryWin32HandleProperties);

				#endif

				#define VK_KHR_win32_keyed_mutex 1

				#define VK_KHR_WIN32_KEYED_MUTEX_SPEC_VERSION 1

				#define VK_KHR_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_KHR_win32_keyed_mutex"

				typedef struct VkWin32KeyedMutexAcquireReleaseInfoKHR {

				    VkStructureType          sType;

				    const void*              pNext;

				    uint32_t                 acquireCount;

				    const VkDeviceMemory*    pAcquireSyncs;

				    const uint64_t*          pAcquireKeys;

				    const uint32_t*          pAcquireTimeouts;

				    uint32_t                 releaseCount;

				    const VkDeviceMemory*    pReleaseSyncs;

				    const uint64_t*          pReleaseKeys;

				} VkWin32KeyedMutexAcquireReleaseInfoKHR;

				#define VK_KHR_external_semaphore_win32 1

				#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_SPEC_VERSION 1

				#define VK_KHR_EXTERNAL_SEMAPHORE_WIN32_EXTENSION_NAME "VK_KHR_external_semaphore_win32"

				typedef struct VkImportSemaphoreWin32HandleInfoKHR {

				    VkStructureType                          sType;

				    const void*                              pNext;

				    VkSemaphore                              semaphore;

				    VkSemaphoreImportFlags                   flags;

				    VkExternalSemaphoreHandleTypeFlagBits    handleType;

				    HANDLE                                   handle;

				    LPCWSTR                                  name;

				} VkImportSemaphoreWin32HandleInfoKHR;

				typedef struct VkExportSemaphoreWin32HandleInfoKHR {

				    VkStructureType               sType;

				    const void*                   pNext;

				    const SECURITY_ATTRIBUTES*    pAttributes;

				    DWORD                         dwAccess;

				    LPCWSTR                       name;

				} VkExportSemaphoreWin32HandleInfoKHR;

				typedef struct VkD3D12FenceSubmitInfoKHR {

				    VkStructureType    sType;

				    const void*        pNext;

				    uint32_t           waitSemaphoreValuesCount;

				    const uint64_t*    pWaitSemaphoreValues;

				    uint32_t           signalSemaphoreValuesCount;

				    const uint64_t*    pSignalSemaphoreValues;

				} VkD3D12FenceSubmitInfoKHR;

				typedef struct VkSemaphoreGetWin32HandleInfoKHR {

				    VkStructureType                          sType;

				    const void*                              pNext;

				    VkSemaphore                              semaphore;

				    VkExternalSemaphoreHandleTypeFlagBits    handleType;

				} VkSemaphoreGetWin32HandleInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkImportSemaphoreWin32HandleKHR)(VkDevice device, const VkImportSemaphoreWin32HandleInfoKHR* pImportSemaphoreWin32HandleInfo);

				typedef VkResult (VKAPI_PTR *PFN_vkGetSemaphoreWin32HandleKHR)(VkDevice device, const VkSemaphoreGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkImportSemaphoreWin32HandleKHR(

				    VkDevice                                    device,

				    const VkImportSemaphoreWin32HandleInfoKHR*  pImportSemaphoreWin32HandleInfo);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetSemaphoreWin32HandleKHR(

				    VkDevice                                    device,

				    const VkSemaphoreGetWin32HandleInfoKHR*     pGetWin32HandleInfo,

				    HANDLE*                                     pHandle);

				#endif

				#define VK_KHR_external_fence_win32 1

				#define VK_KHR_EXTERNAL_FENCE_WIN32_SPEC_VERSION 1

				#define VK_KHR_EXTERNAL_FENCE_WIN32_EXTENSION_NAME "VK_KHR_external_fence_win32"

				typedef struct VkImportFenceWin32HandleInfoKHR {

				    VkStructureType                      sType;

				    const void*                          pNext;

				    VkFence                              fence;

				    VkFenceImportFlags                   flags;

				    VkExternalFenceHandleTypeFlagBits    handleType;

				    HANDLE                               handle;

				    LPCWSTR                              name;

				} VkImportFenceWin32HandleInfoKHR;

				typedef struct VkExportFenceWin32HandleInfoKHR {

				    VkStructureType               sType;

				    const void*                   pNext;

				    const SECURITY_ATTRIBUTES*    pAttributes;

				    DWORD                         dwAccess;

				    LPCWSTR                       name;

				} VkExportFenceWin32HandleInfoKHR;

				typedef struct VkFenceGetWin32HandleInfoKHR {

				    VkStructureType                      sType;

				    const void*                          pNext;

				    VkFence                              fence;

				    VkExternalFenceHandleTypeFlagBits    handleType;

				} VkFenceGetWin32HandleInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkImportFenceWin32HandleKHR)(VkDevice device, const VkImportFenceWin32HandleInfoKHR* pImportFenceWin32HandleInfo);

				typedef VkResult (VKAPI_PTR *PFN_vkGetFenceWin32HandleKHR)(VkDevice device, const VkFenceGetWin32HandleInfoKHR* pGetWin32HandleInfo, HANDLE* pHandle);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkImportFenceWin32HandleKHR(

				    VkDevice                                    device,

				    const VkImportFenceWin32HandleInfoKHR*      pImportFenceWin32HandleInfo);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetFenceWin32HandleKHR(

				    VkDevice                                    device,

				    const VkFenceGetWin32HandleInfoKHR*         pGetWin32HandleInfo,

				    HANDLE*                                     pHandle);

				#endif

				#define VK_NV_external_memory_win32 1

				#define VK_NV_EXTERNAL_MEMORY_WIN32_SPEC_VERSION 1

				#define VK_NV_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME "VK_NV_external_memory_win32"

				typedef struct VkImportMemoryWin32HandleInfoNV {

				    VkStructureType                      sType;

				    const void*                          pNext;

				    VkExternalMemoryHandleTypeFlagsNV    handleType;

				    HANDLE                               handle;

				} VkImportMemoryWin32HandleInfoNV;

				typedef struct VkExportMemoryWin32HandleInfoNV {

				    VkStructureType               sType;

				    const void*                   pNext;

				    const SECURITY_ATTRIBUTES*    pAttributes;

				    DWORD                         dwAccess;

				} VkExportMemoryWin32HandleInfoNV;

				typedef VkResult (VKAPI_PTR *PFN_vkGetMemoryWin32HandleNV)(VkDevice device, VkDeviceMemory memory, VkExternalMemoryHandleTypeFlagsNV handleType, HANDLE* pHandle);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkGetMemoryWin32HandleNV(

				    VkDevice                                    device,

				    VkDeviceMemory                              memory,

				    VkExternalMemoryHandleTypeFlagsNV           handleType,

				    HANDLE*                                     pHandle);

				#endif

				#define VK_NV_win32_keyed_mutex 1

				#define VK_NV_WIN32_KEYED_MUTEX_SPEC_VERSION 1

				#define VK_NV_WIN32_KEYED_MUTEX_EXTENSION_NAME "VK_NV_win32_keyed_mutex"

				typedef struct VkWin32KeyedMutexAcquireReleaseInfoNV {

				    VkStructureType          sType;

				    const void*              pNext;

				    uint32_t                 acquireCount;

				    const VkDeviceMemory*    pAcquireSyncs;

				    const uint64_t*          pAcquireKeys;

				    const uint32_t*          pAcquireTimeoutMilliseconds;

				    uint32_t                 releaseCount;

				    const VkDeviceMemory*    pReleaseSyncs;

				    const uint64_t*          pReleaseKeys;

				} VkWin32KeyedMutexAcquireReleaseInfoNV;

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										66

include/vulkan/vulkan_xcb.h
									
										Normal file
									
												
				@@ -0,0 +1,66 @@

				#ifndef VULKAN_XCB_H_

				#define VULKAN_XCB_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_xcb_surface 1

				#define VK_KHR_XCB_SURFACE_SPEC_VERSION   6

				#define VK_KHR_XCB_SURFACE_EXTENSION_NAME "VK_KHR_xcb_surface"

				typedef VkFlags VkXcbSurfaceCreateFlagsKHR;

				typedef struct VkXcbSurfaceCreateInfoKHR {

				    VkStructureType               sType;

				    const void*                   pNext;

				    VkXcbSurfaceCreateFlagsKHR    flags;

				    xcb_connection_t*             connection;

				    xcb_window_t                  window;

				} VkXcbSurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateXcbSurfaceKHR)(VkInstance instance, const VkXcbSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXcbPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, xcb_connection_t* connection, xcb_visualid_t visual_id);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateXcbSurfaceKHR(

				    VkInstance                                  instance,

				    const VkXcbSurfaceCreateInfoKHR*            pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXcbPresentationSupportKHR(

				    VkPhysicalDevice                            physicalDevice,

				    uint32_t                                    queueFamilyIndex,

				    xcb_connection_t*                           connection,

				    xcb_visualid_t                              visual_id);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										66

include/vulkan/vulkan_xlib.h
									
										Normal file
									
												
				@@ -0,0 +1,66 @@

				#ifndef VULKAN_XLIB_H_

				#define VULKAN_XLIB_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_KHR_xlib_surface 1

				#define VK_KHR_XLIB_SURFACE_SPEC_VERSION  6

				#define VK_KHR_XLIB_SURFACE_EXTENSION_NAME "VK_KHR_xlib_surface"

				typedef VkFlags VkXlibSurfaceCreateFlagsKHR;

				typedef struct VkXlibSurfaceCreateInfoKHR {

				    VkStructureType                sType;

				    const void*                    pNext;

				    VkXlibSurfaceCreateFlagsKHR    flags;

				    Display*                       dpy;

				    Window                         window;

				} VkXlibSurfaceCreateInfoKHR;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateXlibSurfaceKHR)(VkInstance instance, const VkXlibSurfaceCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSurfaceKHR* pSurface);

				typedef VkBool32 (VKAPI_PTR *PFN_vkGetPhysicalDeviceXlibPresentationSupportKHR)(VkPhysicalDevice physicalDevice, uint32_t queueFamilyIndex, Display* dpy, VisualID visualID);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateXlibSurfaceKHR(

				    VkInstance                                  instance,

				    const VkXlibSurfaceCreateInfoKHR*           pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkSurfaceKHR*                               pSurface);

				VKAPI_ATTR VkBool32 VKAPI_CALL vkGetPhysicalDeviceXlibPresentationSupportKHR(

				    VkPhysicalDevice                            physicalDevice,

				    uint32_t                                    queueFamilyIndex,

				    Display*                                    dpy,

				    VisualID                                    visualID);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										54

include/vulkan/vulkan_xlib_randr.h
									
										Normal file
									
												
				@@ -0,0 +1,54 @@

				#ifndef VULKAN_XLIB_RANDR_H_

				#define VULKAN_XLIB_RANDR_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2017 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_EXT_acquire_xlib_display 1

				#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1

				#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"

				typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);

				typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(

				    VkPhysicalDevice                            physicalDevice,

				    Display*                                    dpy,

				    VkDisplayKHR                                display);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(

				    VkPhysicalDevice                            physicalDevice,

				    Display*                                    dpy,

				    RROutput                                    rrOutput,

				    VkDisplayKHR*                               pDisplay);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

									
										54

include/vulkan/vulkan_xlib_xrandr.h
									
										Normal file
									
												
				@@ -0,0 +1,54 @@

				#ifndef VULKAN_XLIB_XRANDR_H_

				#define VULKAN_XLIB_XRANDR_H_ 1

				#ifdef __cplusplus

				extern "C" {

				#endif

				/*

				** Copyright (c) 2015-2018 The Khronos Group Inc.

				**

				** Licensed under the Apache License, Version 2.0 (the "License");

				** you may not use this file except in compliance with the License.

				** You may obtain a copy of the License at

				**

				**     http://www.apache.org/licenses/LICENSE-2.0

				**

				** Unless required by applicable law or agreed to in writing, software

				** distributed under the License is distributed on an "AS IS" BASIS,

				** WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

				** See the License for the specific language governing permissions and

				** limitations under the License.

				*/

				/*

				** This header is generated from the Khronos Vulkan XML API Registry.

				**

				*/

				#define VK_EXT_acquire_xlib_display 1

				#define VK_EXT_ACQUIRE_XLIB_DISPLAY_SPEC_VERSION 1

				#define VK_EXT_ACQUIRE_XLIB_DISPLAY_EXTENSION_NAME "VK_EXT_acquire_xlib_display"

				typedef VkResult (VKAPI_PTR *PFN_vkAcquireXlibDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, VkDisplayKHR display);

				typedef VkResult (VKAPI_PTR *PFN_vkGetRandROutputDisplayEXT)(VkPhysicalDevice physicalDevice, Display* dpy, RROutput rrOutput, VkDisplayKHR* pDisplay);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkAcquireXlibDisplayEXT(

				    VkPhysicalDevice                            physicalDevice,

				    Display*                                    dpy,

				    VkDisplayKHR                                display);

				VKAPI_ATTR VkResult VKAPI_CALL vkGetRandROutputDisplayEXT(

				    VkPhysicalDevice                            physicalDevice,

				    Display*                                    dpy,

				    RROutput                                    rrOutput,

				    VkDisplayKHR*                               pDisplay);

				#endif

				#ifdef __cplusplus

				}

				#endif

				#endif

768

meson.build


File diff suppressed because it is too large.

									
										78

meson_options.txt
									
												
				@@ -1,4 +1,4 @@

				# Copyright © 2017 Intel Corporation

				# Copyright © 2017-2018 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				@@ -20,8 +20,11 @@

				option(

				  'platforms',

				  type : 'string',

				  value : 'auto',

				  type : 'array',

				  value : ['auto'],

				  choices : [

				    '', 'auto', 'x11', 'wayland', 'drm', 'surfaceless', 'haiku', 'android',

				  ],

				  description : 'comma separated list of window systems to support. If this is set to auto all platforms applicable to the OS will be enabled.'

				)

				option(

				@@ -33,9 +36,10 @@ option(

				)

				option(

				  'dri-drivers',

				  type : 'string',

				  value : 'auto',

				  description : 'comma separated list of dri drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				  type : 'array',

				  value : ['auto'],

				  choices : ['', 'auto', 'i915', 'i965', 'r100', 'r200', 'nouveau', 'swrast'],

				  description : 'List of dri drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				)

				option(

				  'dri-drivers-path',

				@@ -51,9 +55,14 @@ option(

				)

				option(

				  'gallium-drivers',

				  type : 'string',

				  value : 'auto',

				  description : 'comma separated list of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				  type : 'array',

				  value : ['auto'],

				  choices : [

				    '', 'auto', 'pl111', 'radeonsi', 'r300', 'r600', 'nouveau', 'freedreno',

				    'swrast', 'v3d', 'vc4', 'etnaviv', 'imx', 'tegra', 'i915', 'svga', 'virgl',

				    'swr',

				  ],

				  description : 'List of gallium drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				)

				option(

				  'gallium-extra-hud',

				@@ -91,8 +100,8 @@ option(

				  'gallium-omx',

				  type : 'combo',

				  value : 'auto',

				  choices : ['auto', 'true', 'false'],

				  description : 'enable gallium omx bellagio state tracker.',

				  choices : ['auto', 'disabled', 'bellagio', 'tizonia'],

				  description : 'enable gallium omx state tracker.',

				)

				option(

				  'omx-libs-path',

				@@ -141,9 +150,10 @@ option(

				)

				option(

				  'vulkan-drivers',

				  type : 'string',

				  value : 'auto',

				  description : 'comma separated list of vulkan drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				  type : 'array',

				  value : ['auto'],

				  choices : ['', 'auto', 'amd', 'intel'],

				  description : 'List of vulkan drivers to build. If this is set to auto all drivers applicable to the target OS/architecture will be built'

				)

				option(

				  'shader-cache',

				@@ -214,6 +224,12 @@ option(

				  value : true,

				  description : 'Build assembly code if possible'

				)

				option(

				   'glx-read-only-text',

				   type : 'boolean',

				   value : false,

				   description : 'Disable writable .text section on x86 (decreases performance)'

				)

				option(

				  'llvm',

				  type : 'combo',

				@@ -248,12 +264,6 @@ option(

				  value : false,

				  description : 'Build unit tests. Currently this will build *all* unit tests, which may build more than expected.'

				)

				option(

				  'texture-float',

				  type : 'boolean',

				  value : false,

				  description : 'Enable floating point textures and renderbuffers. This option may be patent encumbered, please read docs/patents.txt and consult with your lawyer before turning this on.'

				)

				option(

				  'selinux',

				  type : 'boolean',

				@@ -276,7 +286,29 @@ option(

				)

				option(

				  'swr-arches',

				  type : 'string',

				  value : 'avx,avx2',

				  description : 'Comma delemited swr architectures. choices : avx,avx2,knl,skx'

				  type : 'array',

				  value : ['avx', 'avx2'],

				  choices : ['avx', 'avx2', 'knl', 'skx'],

				  description : 'Architectures to build SWR support for.',

				)

				option(

				  'tools',

				  type : 'array',

				  value : [],

				  choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'xvmc', 'all'],

				  description : 'List of tools to build.',

				)

				option(

				  'power8',

				  type : 'combo',

				  value : 'auto',

				  choices : ['auto', 'true', 'false'],

				  description : 'Enable power8 optimizations.',

				)

				option(

				  'xlib-lease',

				  type : 'combo',

				  value : 'auto',

				  choices : ['auto', 'true', 'false'],

				  description : 'Enable VK_EXT_acquire_xlib_display.'

				)
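Note on the hunks above: several options (`platforms`, `dri-drivers`, `gallium-drivers`, `vulkan-drivers`, `swr-arches`) move from a free-form `string` to an `array` with an explicit `choices` list. As a rough illustration of the effect, the Python sketch below splits a comma-separated value and rejects entries outside the allowed set. It is not Meson's implementation; `parse_array_option` and `VULKAN_DRIVER_CHOICES` are hypothetical names, with the choices copied from the `vulkan-drivers` hunk.

```python
# Illustrative only: mimics the effect of an 'array' option with 'choices'.
# The list is copied from the 'vulkan-drivers' hunk; the names are made up.
VULKAN_DRIVER_CHOICES = ['', 'auto', 'amd', 'intel']


def parse_array_option(raw, choices):
    """Split a comma-separated value and validate each entry (hypothetical helper)."""
    values = [v.strip() for v in raw.split(',')]
    bad = [v for v in values if v not in choices]
    if bad:
        # With a plain string option, a typo like this would only surface
        # later, wherever the build files compare the string by hand.
        raise ValueError('invalid choice(s): %s' % ', '.join(bad))
    return values


if __name__ == '__main__':
    print(parse_array_option('amd,intel', VULKAN_DRIVER_CHOICES))
```

With a typed array, a misspelled driver name can be rejected when the option is parsed instead of being silently ignored by string matching further down the build.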

									
										11

scons/gallium.py
									
												
				@@ -134,7 +134,9 @@ def check_cc(env, cc, expr, cpp_opt = '-E'):

				    source.write('#if !(%s)\n#error\n#endif\n' % expr)

				    source.close()

				    pipe = SCons.Action._subproc(env, [env['CC'], cpp_opt, source.name],

				    # sys.stderr.write('%r %s %s\n' % (env['CC'], cpp_opt, source.name));

				    pipe = SCons.Action._subproc(env, env.Split(env['CC']) + [cpp_opt, source.name],

				                                 stdin = 'devnull',

				                                 stderr = 'devnull',

				                                 stdout = 'devnull')

				@@ -352,6 +354,9 @@ def generate(env):

				        if check_header(env, 'xlocale.h'):

				            cppdefines += ['HAVE_XLOCALE_H']

				        if check_header(env, 'endian.h'):

				            cppdefines += ['HAVE_ENDIAN_H']

				        if check_functions(env, ['strtod_l', 'strtof_l']):

				            cppdefines += ['HAVE_STRTOD_L']

				@@ -387,10 +392,6 @@ def generate(env):

				        cppdefines += ['PIPE_SUBSYSTEM_WINDOWS_USER']

				    if env['embedded']:

				        cppdefines += ['PIPE_SUBSYSTEM_EMBEDDED']

				    if env['texture_float']:

				        print('warning: Floating-point textures enabled.')

				        print('warning: Please consult docs/patents.txt with your lawyer before building Mesa.')

				        cppdefines += ['TEXTURE_FLOAT_ENABLED']

				    env.Append(CPPDEFINES = cppdefines)

				    # C compiler options
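Note on the `check_cc` hunk above: the call now passes `env.Split(env['CC']) + [cpp_opt, source.name]` instead of `[env['CC'], cpp_opt, source.name]`. The likely motivation is a `CC` value that carries more than one word, such as a compiler wrapper followed by the compiler. The sketch below shows the difference in the resulting argument vector. It assumes SCons' `Split` behaves like whitespace splitting for plain strings; `split_command` and `build_argv` are hypothetical stand-ins, not SCons APIs, and the `ccache gcc` value is an invented example.

```python
def split_command(value):
    """Stand-in for env.Split(): lists pass through, strings split on whitespace.

    Assumption: this mirrors SCons' behavior for simple string values.
    """
    if isinstance(value, (list, tuple)):
        return list(value)
    return value.split()


def build_argv(cc, cpp_opt, source_name):
    """Build the argument vector the way the patched check_cc() does."""
    return split_command(cc) + [cpp_opt, source_name]


if __name__ == '__main__':
    cc = 'ccache gcc'  # invented example of a multi-word CC
    # Old form: the whole string becomes argv[0], which no executable matches.
    print([cc, '-E', 'check.c'])
    # New form: each word becomes its own argv entry.
    print(build_argv(cc, '-E', 'check.c'))
```

For a single-word `CC` both forms produce the same argument vector, so the change should be safe for existing configurations.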

									
										9

scons/llvm.py
									
												
				@@ -123,6 +123,10 @@ def generate(env):

				                'LLVMDemangle', 'LLVMGlobalISel', 'LLVMDebugInfoMSF',

				                'LLVMBinaryFormat',

				            ])

				            if env['platform'] == 'windows' and env['crosscompile']:

				                # LLVM 5.0 requires MinGW w/ pthreads due to use of std::thread and friends.

				                assert env['gcc']

				                env['CXX'] = env['CXX'] + '-posix'

				        elif llvm_version >= distutils.version.LooseVersion('4.0'):

				            env.Prepend(LIBS = [

				                'LLVMX86Disassembler', 'LLVMX86AsmParser',

				@@ -211,8 +215,11 @@ def generate(env):

				            'imagehlp',

				            'psapi',

				            'shell32',

				            'advapi32'

				            'advapi32',

				            'ole32',

				            'uuid',

				        ])

				        if env['msvc']:

				            # Some of the LLVM C headers use the inline keyword without

				            # defining it.
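Note on the `'advapi32'` line in the second hunk above: the old last entry had no trailing comma, so the new entries could only be appended after adding one. Python joins adjacent string literals, which is why the comma matters here. The short sketch below shows what would happen if `'ole32'` were added without it; the list contents are a shortened excerpt from the hunk, and the variable names are illustrative.

```python
# Adjacent string literals are concatenated by the Python parser, so a
# missing comma silently merges two library names into one bogus entry.
without_comma = [
    'shell32',
    'advapi32'
    'ole32',
]

with_comma = [
    'shell32',
    'advapi32',
    'ole32',
]

if __name__ == '__main__':
    print(without_comma)
    print(with_comma)
```

Keeping a trailing comma on the last entry, as the patched list does for `'uuid'`, avoids this class of mistake the next time a library is appended.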

									
										6

src/Makefile.am
									
												
				@@ -67,7 +67,6 @@ SUBDIRS += vulkan

				endif

				EXTRA_DIST += vulkan/registry/vk.xml

				EXTRA_DIST += vulkan/registry/vk_android_native_buffer.xml

				if HAVE_AMD_DRIVERS

				SUBDIRS += amd

				@@ -96,11 +95,6 @@ if HAVE_GBM

				SUBDIRS += gbm

				endif

				## Optionally required by EGL

				if HAVE_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-egl

				endif

				if HAVE_EGL

				SUBDIRS += egl

				endif

									
										1

src/amd/Android.mk
									
												
				@@ -27,3 +27,4 @@ include $(LOCAL_PATH)/Makefile.sources

				include $(LOCAL_PATH)/Android.addrlib.mk

				include $(LOCAL_PATH)/Android.common.mk

				include $(LOCAL_PATH)/vulkan/Android.mk

									
										1

src/amd/Makefile.addrlib.am
									
												
				@@ -22,6 +22,7 @@

				ADDRLIB_LIBS = addrlib/libamdgpu_addrlib.la

				addrlib_libamdgpu_addrlib_la_CPPFLAGS = \

					$(DEFINES) \

					-I$(top_srcdir)/src/ \

					-I$(srcdir)/common \

					-I$(srcdir)/addrlib \

									
										2

src/amd/Makefile.sources
									
												
				@@ -45,8 +45,6 @@ AMD_COMPILER_FILES = \

					common/ac_llvm_util.c \

					common/ac_llvm_util.h \

					common/ac_shader_abi.h \

					common/ac_shader_info.c \

					common/ac_shader_info.h \

					common/ac_shader_util.c \

					common/ac_shader_util.h

									
										32

src/amd/addrlib/addrinterface.cpp
									
												
				@@ -1054,7 +1054,7 @@ ADDR_E_RETURNCODE ADDR_API AddrComputePrtInfo(

				*/

				ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(

				    ADDR_HANDLE                     hLib, ///< address lib handle

				    ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) ///< [out] output structure

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) ///< [out] output structure

				{

				    Addr::Lib* pLib = Lib::GetLib(hLib);

				@@ -1072,6 +1072,36 @@ ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(

				    return returnCode;

				}

				/**

				****************************************************************************************************

				*   AddrGetMaxMetaAlignments

				*

				*   @brief

				*       Convert maximum alignments for metadata

				*

				*   @return

				*       ADDR_OK if successful, otherwise an error code of ADDR_E_RETURNCODE

				****************************************************************************************************

				*/

				ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(

				    ADDR_HANDLE                     hLib, ///< address lib handle

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) ///< [out] output structure

				{

				    Addr::Lib* pLib = Lib::GetLib(hLib);

				    ADDR_E_RETURNCODE returnCode = ADDR_OK;

				    if (pLib != NULL)

				    {

				        returnCode = pLib->GetMaxMetaAlignments(pOut);

				    }

				    else

				    {

				        returnCode = ADDR_ERROR;

				    }

				    return returnCode;

				}

				////////////////////////////////////////////////////////////////////////////////////////////////////

									
										62

src/amd/addrlib/addrinterface.h
									
												
				@@ -528,7 +528,8 @@ typedef union _ADDR_SURFACE_FLAGS

				        UINT_32 preferEquation       : 1; ///< Return equation index without adjusting tile mode

				        UINT_32 matchStencilTileCfg  : 1; ///< Select tile index of stencil as well as depth surface

				                                          ///  to make sure they share same tile config parameters

				        UINT_32 reserved             : 2; ///< Reserved bits

				        UINT_32 disallowLargeThickDegrade   : 1;    ///< Disallow large thick tile degrade

				        UINT_32 reserved             : 1; ///< Reserved bits

				    };

				    UINT_32 value;

				@@ -2273,7 +2274,7 @@ typedef struct _ADDR_COMPUTE_DCCINFO_INPUT

				typedef struct _ADDR_COMPUTE_DCCINFO_OUTPUT

				{

				    UINT_32 size;                 ///< Size of this structure in bytes

				    UINT_64 dccRamBaseAlign;      ///< Base alignment of dcc key

				    UINT_32 dccRamBaseAlign;      ///< Base alignment of dcc key

				    UINT_64 dccRamSize;           ///< Size of dcc key

				    UINT_64 dccFastClearSize;     ///< Size of dcc key portion that can be fast cleared

				    BOOL_32 subLvlCompressible;   ///< Whether sub resource is compressiable

				@@ -2298,17 +2299,17 @@ ADDR_E_RETURNCODE ADDR_API AddrComputeDccInfo(

				/**

				****************************************************************************************************

				*   ADDR_GET_MAX_ALIGNMENTS_OUTPUT

				*   ADDR_GET_MAX_ALINGMENTS_OUTPUT

				*

				*   @brief

				*       Output structure of AddrGetMaxAlignments

				****************************************************************************************************

				*/

				typedef struct _ADDR_GET_MAX_ALIGNMENTS_OUTPUT

				typedef struct _ADDR_GET_MAX_ALINGMENTS_OUTPUT

				{

				    UINT_32 size;                   ///< Size of this structure in bytes

				    UINT_64 baseAlign;              ///< Maximum base alignment in bytes

				} ADDR_GET_MAX_ALIGNMENTS_OUTPUT;

				    UINT_32 baseAlign;              ///< Maximum base alignment in bytes

				} ADDR_GET_MAX_ALINGMENTS_OUTPUT;

				/**

				****************************************************************************************************

				@@ -2320,9 +2321,19 @@ typedef struct _ADDR_GET_MAX_ALIGNMENTS_OUTPUT

				*/

				ADDR_E_RETURNCODE ADDR_API AddrGetMaxAlignments(

				    ADDR_HANDLE                     hLib,

				    ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut);

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut);

				/**

				****************************************************************************************************

				*   AddrGetMaxMetaAlignments

				*

				*   @brief

				*       Gets maximnum alignments for metadata

				****************************************************************************************************

				*/

				ADDR_E_RETURNCODE ADDR_API AddrGetMaxMetaAlignments(

				    ADDR_HANDLE                     hLib,

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut);

				/**

				****************************************************************************************************

				@@ -2366,22 +2377,25 @@ typedef union _ADDR2_SURFACE_FLAGS

				{

				    struct

				    {

				        UINT_32 color         :  1; ///< This resource is a color buffer, can be used with RTV

				        UINT_32 depth         :  1; ///< Thie resource is a depth buffer, can be used with DSV

				        UINT_32 stencil       :  1; ///< Thie resource is a stencil buffer, can be used with DSV

				        UINT_32 fmask         :  1; ///< This is an fmask surface

				        UINT_32 overlay       :  1; ///< This is an overlay surface

				        UINT_32 display       :  1; ///< This resource is displable, can be used with DRV

				        UINT_32 prt           :  1; ///< This is a partially resident texture

				        UINT_32 qbStereo      :  1; ///< This is a quad buffer stereo surface

				        UINT_32 interleaved   :  1; ///< Special flag for interleaved YUV surface padding

				        UINT_32 texture       :  1; ///< This resource can be used with SRV

				        UINT_32 unordered     :  1; ///< This resource can be used with UAV

				        UINT_32 rotated       :  1; ///< This resource is rotated and displable

				        UINT_32 needEquation  :  1; ///< This resource needs equation to be generated if possible

				        UINT_32 opt4space     :  1; ///< This resource should be optimized for space

				        UINT_32 minimizeAlign :  1; ///< This resource should use minimum alignment

				        UINT_32 reserved      : 17; ///< Reserved bits

				        UINT_32 color             :  1; ///< This resource is a color buffer, can be used with RTV

				        UINT_32 depth             :  1; ///< Thie resource is a depth buffer, can be used with DSV

				        UINT_32 stencil           :  1; ///< Thie resource is a stencil buffer, can be used with DSV

				        UINT_32 fmask             :  1; ///< This is an fmask surface

				        UINT_32 overlay           :  1; ///< This is an overlay surface

				        UINT_32 display           :  1; ///< This resource is displable, can be used with DRV

				        UINT_32 prt               :  1; ///< This is a partially resident texture

				        UINT_32 qbStereo          :  1; ///< This is a quad buffer stereo surface

				        UINT_32 interleaved       :  1; ///< Special flag for interleaved YUV surface padding

				        UINT_32 texture           :  1; ///< This resource can be used with SRV

				        UINT_32 unordered         :  1; ///< This resource can be used with UAV

				        UINT_32 rotated           :  1; ///< This resource is rotated and displable

				        UINT_32 needEquation      :  1; ///< This resource needs equation to be generated if possible

				        UINT_32 opt4space         :  1; ///< This resource should be optimized for space

				        UINT_32 minimizeAlign     :  1; ///< This resource should use minimum alignment

				        UINT_32 noMetadata        :  1; ///< This resource has no metadata

				        UINT_32 metaRbUnaligned   :  1; ///< This resource has rb unaligned metadata

				        UINT_32 metaPipeUnaligned :  1; ///< This resource has pipe unaligned metadata

				        UINT_32 reserved          : 14; ///< Reserved bits

				    };

				    UINT_32 value;

									
										6

src/amd/addrlib/addrtypes.h
									
												
				@@ -76,7 +76,7 @@ typedef int            INT;

				#ifndef ADDR_STDCALL

				    #if defined(__GNUC__)

				        #if defined(__AMD64__)

				        #if defined(__amd64__) || defined(__x86_64__)

				            #define ADDR_STDCALL

				        #else

				            #define ADDR_STDCALL __attribute__((stdcall))

				@@ -87,7 +87,9 @@ typedef int            INT;

				#endif

				#ifndef ADDR_FASTCALL

				    #if defined(__GNUC__)

				    #if defined(BRAHMA_ARM)

				        #define ADDR_FASTCALL

				    #elif defined(__GNUC__)

				        #if defined(__i386__)

				            #define ADDR_FASTCALL __attribute__((regparm(0)))

				        #else

									
										7

src/amd/addrlib/amdgpu_asic_addr.h
									
												
				@@ -79,12 +79,15 @@

				#define AMDGPU_POLARIS10_RANGE  0x50, 0x5A

				#define AMDGPU_POLARIS11_RANGE  0x5A, 0x64

				#define AMDGPU_POLARIS12_RANGE  0x64, 0x6E

				#define AMDGPU_VEGAM_RANGE      0x6E, 0xFF

				#define AMDGPU_CARRIZO_RANGE    0x01, 0x21

				#define AMDGPU_BRISTOL_RANGE    0x10, 0x21

				#define AMDGPU_STONEY_RANGE     0x61, 0xFF

				#define AMDGPU_VEGA10_RANGE     0x01, 0x14

				#define AMDGPU_VEGA12_RANGE     0x14, 0x28

				#define AMDGPU_VEGA20_RANGE     0x28, 0xFF

				#define AMDGPU_RAVEN_RANGE      0x01, 0x81

				@@ -116,6 +119,7 @@

				#define ASICREV_IS_POLARIS10_P(r)      ASICREV_IS(r, POLARIS10)

				#define ASICREV_IS_POLARIS11_M(r)      ASICREV_IS(r, POLARIS11)

				#define ASICREV_IS_POLARIS12_V(r)      ASICREV_IS(r, POLARIS12)

				#define ASICREV_IS_VEGAM_P(r)          ASICREV_IS(r, VEGAM)

				#define ASICREV_IS_CARRIZO(r)          ASICREV_IS(r, CARRIZO)

				#define ASICREV_IS_CARRIZO_BRISTOL(r)  ASICREV_IS(r, BRISTOL)

				@@ -123,6 +127,9 @@

				#define ASICREV_IS_VEGA10_M(r)         ASICREV_IS(r, VEGA10)

				#define ASICREV_IS_VEGA10_P(r)         ASICREV_IS(r, VEGA10)

				#define ASICREV_IS_VEGA12_P(r)         ASICREV_IS(r, VEGA12)

				#define ASICREV_IS_VEGA12_p(r)         ASICREV_IS(r, VEGA12)

				#define ASICREV_IS_VEGA20_P(r)         ASICREV_IS(r, VEGA20)

				#define ASICREV_IS_RAVEN(r)            ASICREV_IS(r, RAVEN)

									
										80

src/amd/addrlib/core/addrlib.cpp
									
												
				@@ -285,10 +285,12 @@ ADDR_E_RETURNCODE Lib::Create(

				    {

				        pCreateOut->numEquations =

				            pLib->HwlGetEquationTableInfo(&pCreateOut->pEquationTable);

				    }

				    if ((pLib == NULL) &&

				        (returnCode == ADDR_OK))

				        pLib->SetMaxAlignments();

				    }

				    else if ((pLib == NULL) &&

				             (returnCode == ADDR_OK))

				    {

				        // Unknown failures, we return the general error code

				        returnCode = ADDR_ERROR;

				@@ -336,6 +338,23 @@ VOID Lib::SetMinPitchAlignPixels(

				    m_minPitchAlignPixels = (minPitchAlignPixels == 0) ? 1 : minPitchAlignPixels;

				}

				/**

				****************************************************************************************************

				*   Lib::SetMaxAlignments

				*

				*   @brief

				*       Set max alignments

				*

				*   @return

				*      N/A

				****************************************************************************************************

				*/

				VOID Lib::SetMaxAlignments()

				{

				    m_maxBaseAlign     = HwlComputeMaxBaseAlignments();

				    m_maxMetaBaseAlign = HwlComputeMaxMetaBaseAlignments();

				}

				/**

				****************************************************************************************************

				*   Lib::GetLib

				@@ -358,21 +377,21 @@ Lib* Lib::GetLib(

				*   Lib::GetMaxAlignments

				*

				*   @brief

				*       Gets maximum alignments

				*       Gets maximum alignments for data surface (include FMask)

				*

				*   @return

				*       ADDR_E_RETURNCODE

				****************************************************************************************************

				*/

				ADDR_E_RETURNCODE Lib::GetMaxAlignments(

				    ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut    ///< [out] output structure

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut    ///< [out] output structure

				    ) const

				{

				    ADDR_E_RETURNCODE returnCode = ADDR_OK;

				    if (GetFillSizeFieldsFlags() == TRUE)

				    {

				        if (pOut->size != sizeof(ADDR_GET_MAX_ALIGNMENTS_OUTPUT))

				        if (pOut->size != sizeof(ADDR_GET_MAX_ALINGMENTS_OUTPUT))

				        {

				            returnCode = ADDR_PARAMSIZEMISMATCH;

				        }

				@@ -380,7 +399,54 @@ ADDR_E_RETURNCODE Lib::GetMaxAlignments(

				    if (returnCode == ADDR_OK)

				    {

				        returnCode = HwlGetMaxAlignments(pOut);

				        if (m_maxBaseAlign != 0)

				        {

				            pOut->baseAlign = m_maxBaseAlign;

				        }

				        else

				        {

				            returnCode = ADDR_NOTIMPLEMENTED;

				        }

				    }

				    return returnCode;

				}

				/**

				****************************************************************************************************

				*   Lib::GetMaxMetaAlignments

				*

				*   @brief

				*       Gets maximum alignments for metadata (CMask, DCC and HTile)

				*

				*   @return

				*       ADDR_E_RETURNCODE

				****************************************************************************************************

				*/

				ADDR_E_RETURNCODE Lib::GetMaxMetaAlignments(

				    ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut    ///< [out] output structure

				    ) const

				{

				    ADDR_E_RETURNCODE returnCode = ADDR_OK;

				    if (GetFillSizeFieldsFlags() == TRUE)

				    {

				        if (pOut->size != sizeof(ADDR_GET_MAX_ALINGMENTS_OUTPUT))

				        {

				            returnCode = ADDR_PARAMSIZEMISMATCH;

				        }

				    }

				    if (returnCode == ADDR_OK)

				    {

				        if (m_maxMetaBaseAlign != 0)

				        {

				            pOut->baseAlign = m_maxMetaBaseAlign;

				        }

				        else

				        {

				            returnCode = ADDR_NOTIMPLEMENTED;

				        }

				    }

				    return returnCode;

									
										36

src/amd/addrlib/core/addrlib.h
									
												
				@@ -282,14 +282,38 @@ public:

				    BOOL_32 GetExportNorm(const ELEM_GETEXPORTNORM_INPUT* pIn) const;

				    ADDR_E_RETURNCODE GetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const;

				    ADDR_E_RETURNCODE GetMaxAlignments(ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) const;

				    ADDR_E_RETURNCODE GetMaxMetaAlignments(ADDR_GET_MAX_ALINGMENTS_OUTPUT* pOut) const;

				protected:

				    Lib();  // Constructor is protected

				    Lib(const Client* pClient);

				    /// Pure virtual function to get max alignments

				    virtual ADDR_E_RETURNCODE HwlGetMaxAlignments(ADDR_GET_MAX_ALIGNMENTS_OUTPUT* pOut) const = 0;

				    /// Pure virtual function to get max base alignments

				    virtual UINT_32 HwlComputeMaxBaseAlignments() const = 0;

				    /// Gets maximum alignements for metadata

				    virtual UINT_32 HwlComputeMaxMetaBaseAlignments() const

				    {

				        ADDR_NOT_IMPLEMENTED();

				        return 0;

				    }

				    VOID ValidBaseAlignments(UINT_32 alignment) const

				    {

				#if DEBUG

				        ADDR_ASSERT(alignment <= m_maxBaseAlign);

				#endif

				    }

				    VOID ValidMetaBaseAlignments(UINT_32 metaAlignment) const

				    {

				#if DEBUG

				        ADDR_ASSERT(metaAlignment <= m_maxMetaBaseAlign);

				#endif

				    }

				    //

				    // Initialization

				@@ -341,6 +365,8 @@ private:

				    VOID SetMinPitchAlignPixels(UINT_32 minPitchAlignPixels);

				    VOID SetMaxAlignments();

				protected:

				    LibClass    m_class;        ///< Store class type (HWL type)

				@@ -370,6 +396,10 @@ protected:

				    UINT_32     m_minPitchAlignPixels;  ///< Minimum pitch alignment in pixels

				    UINT_32     m_maxSamples;           ///< Max numSamples

				    UINT_32     m_maxBaseAlign;         ///< Max base alignment for data surface

				    UINT_32     m_maxMetaBaseAlign;     ///< Max base alignment for metadata

				private:

				    ElemLib*    m_pElemLib;             ///< Element Lib pointer

				};

									
										14

src/amd/addrlib/core/addrlib1.cpp
									
												
				@@ -428,6 +428,8 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo(

				        }

				    }

				    ValidBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -895,6 +897,8 @@ ADDR_E_RETURNCODE Lib::ComputeFmaskInfo(

				        }

				    }

				    ValidBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -1333,6 +1337,8 @@ ADDR_E_RETURNCODE Lib::ComputeHtileInfo(

				        }

				    }

				    ValidMetaBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -1399,6 +1405,8 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo(

				        }

				    }

				    ValidMetaBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -1443,9 +1451,11 @@ ADDR_E_RETURNCODE Lib::ComputeDccInfo(

				            pIn = &input;

				        }

				        if (ADDR_OK == ret)

				        if (ret == ADDR_OK)

				        {

				            ret = HwlComputeDccInfo(pIn, pOut);

				            ValidMetaBaseAlignments(pOut->dccRamBaseAlign);

				        }

				    }

				@@ -3652,7 +3662,7 @@ VOID Lib::OptimizeTileMode(

				                        tileMode = (thickness == 1) ?

				                                   ADDR_TM_1D_TILED_THIN1 : ADDR_TM_1D_TILED_THICK;

				                    }

				                    else if (thickness > 1)

				                    else if ((thickness > 1) && (pInOut->flags.disallowLargeThickDegrade == 0))

				                    {

				                        // As in the following HwlComputeSurfaceInfo, thick modes may be degraded to

				                        // thinner modes, we should re-evaluate whether the corresponding

									
										10

src/amd/addrlib/core/addrlib2.cpp
									
												
				@@ -295,6 +295,8 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo(

				    ADDR_ASSERT(pOut->surfSize != 0);

				    ValidBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -447,6 +449,8 @@ ADDR_E_RETURNCODE Lib::ComputeHtileInfo(

				    else

				    {

				        returnCode = HwlComputeHtileInfo(pIn, pOut);

				        ValidMetaBaseAlignments(pOut->baseAlign);

				    }

				    return returnCode;

				@@ -545,6 +549,8 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo(

				    else

				    {

				        returnCode = HwlComputeCmaskInfo(pIn, pOut);

				        ValidMetaBaseAlignments(pOut->baseAlign);

				    }

				    return returnCode;

				@@ -688,6 +694,8 @@ ADDR_E_RETURNCODE Lib::ComputeFmaskInfo(

				        }

				    }

				    ValidBaseAlignments(pOut->baseAlign);

				    return returnCode;

				}

				@@ -764,6 +772,8 @@ ADDR_E_RETURNCODE Lib::ComputeDccInfo(

				    else

				    {

				        returnCode = HwlComputeDccInfo(pIn, pOut);

				        ValidMetaBaseAlignments(pOut->dccRamBaseAlign);

				    }

				    return returnCode;

Compare commits

4562 Commits mesa-18.0. ... 18.2-branc

2 .mailmap
164 .travis.yml
3 Android.common.mk
10 Makefile.am
79 README.rst Normal file
1 REVIEWERS
6 SConstruct
2 VERSION
10 appveyor.yml
6 bin/.cherry-ignore
2 bin/bugzilla_mesa.sh
4 bin/get-fixes-pick-list.sh
2 bin/get-pick-list.sh
22 bin/install_megadrivers.py
292 configure.ac
4 docs/codingstyle.html
1 docs/egl.html
68 docs/envvars.html
BIN docs/favicon.ico Normal file
BIN docs/favicon.png Normal file
171 docs/features.txt
118 docs/index.html
79 docs/meson.html
31 docs/patents.txt
2 docs/precompiled.html
66 docs/release-calendar.html
4 docs/releasing.html
20 docs/relnotes.html
275 docs/relnotes/17.3.4.html Normal file
66 docs/relnotes/17.3.5.html Normal file
85 docs/relnotes/17.3.6.html Normal file
312 docs/relnotes/17.3.7.html Normal file
147 docs/relnotes/17.3.8.html Normal file
162 docs/relnotes/17.3.9.html Normal file
76 docs/relnotes/17.4.0.html
321 docs/relnotes/18.0.0.html Normal file
225 docs/relnotes/18.0.1.html Normal file
144 docs/relnotes/18.0.2.html Normal file
107 docs/relnotes/18.0.3.html Normal file
157 docs/relnotes/18.0.4.html Normal file
162 docs/relnotes/18.0.5.html Normal file
268 docs/relnotes/18.1.0.html Normal file
168 docs/relnotes/18.1.1.html Normal file
170 docs/relnotes/18.1.2.html Normal file
167 docs/relnotes/18.1.3.html Normal file
150 docs/relnotes/18.1.4.html Normal file
183 docs/relnotes/18.1.5.html Normal file
75 docs/relnotes/18.2.0.html Normal file
81 docs/specs/MESA_framebuffer_flip_y.txt Normal file
3 docs/specs/enums.txt
12 docs/submittingpatches.html
2 docs/utilities.html
21 docs/viewperf.html
1 include/EGL/eglext.h
7 include/EGL/eglmesaext.h
16 include/EGL/eglplatform.h
4 include/GL/gl.h
50 include/GL/internal/dri_interface.h
5 include/GLES2/gl2ext.h
8 include/drm-uapi/README
118 include/drm-uapi/drm_fourcc.h
43 include/drm-uapi/drm_mode.h
152 include/drm-uapi/i915_drm.h
209 include/drm-uapi/tegra_drm.h Normal file
110 src/gallium/drivers/vc5/vc5_drm.h → include/drm-uapi/v3d_drm.h
83 include/drm-uapi/vc4_drm.h
8 include/meson.build
19 include/pci_ids/i965_pci_ids.h
16 include/pci_ids/radeonsi_pci_ids.h
65 include/vulkan/vk_icd.h
28 include/vulkan/vk_platform.h
6969 include/vulkan/vulkan.h
126 include/vulkan/vulkan_android.h Normal file
7767 include/vulkan/vulkan_core.h Normal file
58 include/vulkan/vulkan_ios.h Normal file
58 include/vulkan/vulkan_macos.h Normal file
65 include/vulkan/vulkan_mir.h Normal file
58 include/vulkan/vulkan_vi.h Normal file
65 include/vulkan/vulkan_wayland.h Normal file