Comparing 8bc7d0f088..b255fda627 - mesa

fran/mesa

Author	SHA1	Message	Date
Emil Velikov	3ef8d4288a	docs: rename release notes to 13.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-19 19:10:16 +01:00
Marek Olšák	a2ea653a49	radeonsi: remove cb0_is_integer handling st/mesa does this for us. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	54f8efeb02	st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs v2: rebased Reviewed-by: Brian Paul <brianp@vmware.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	c64da9d499	mesa: remove gl_shader_compiler_options::EmitNoNoise it's always true Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	2897cb3dba	glsl_to_tgsi: remove code for fixing up TGSI labels I don't know what this was supposed to do, but all TGSI labels were always 0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	ec35ff4e2b	glsl_to_tgsi: remove subroutine support Never used. The GLSL compiler doesn't even look at EmitNoFunctions. v2: add back "return" support in "main" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	eacda2c080	mesa_to_tgsi: remove remnants of flow control and subroutine support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	82f4c0126d	mesa_to_tgsi: drop support for instructions that can't occur here Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	4e42898d9d	glsl_to_tgsi: allocate glsl_to_tgsi_instruction::tex_offsets on demand sizeof(glsl_to_tgsi_instruction): 384 -> 264 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	4d3d620f26	glsl_to_tgsi: merge buffer and sampler fields in glsl_to_tgsi_instruction sizeof(glsl_to_tgsi_instruction): 416 -> 384 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	dbf64ea28b	glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields sizeof(glsl_to_tgsi_instruction): 464 -> 416 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	9015cbb3a3	glsl_to_tgsi: reduce the size of st_dst_reg and st_src_reg I noticed that glsl_to_tgsi_instruction is too huge. sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	222c599b61	glsl_to_tgsi: remove unused st_translate::tex_offsets Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	0d95eeb79c	glsl_to_tgsi: remove unused parameters from calc_deref_offsets Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	6980480052	glsl_to_tgsi: use array_id for temp arrays instead of hacking high bits Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Adam Jackson	4276b5c16a	reviewers: Throw myself on the GLX grenade Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 12:37:22 -04:00
Eric Engestrom	8acb79dfac	egl: bring back the default glapi.so name Earlier commit replaced the default platform specific libglapi.so name with an #error. This may have been overzealous since the name is correct for the BSD platforms, at least. Reinstate the hunk - bringing back OpenBSD, et al. to a successful build state. Fixes: `7a9c92d071` ("egl/dri2: non-shared glapi cleanups") [Emil Velikov: format the patch from Eric, add commit message and tag.] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-10-19 15:09:26 +01:00
Iago Toral Quiroga	66d8bd3b7e	i965: fix subnr overflow in suboffset() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-19 11:48:21 +02:00
Dave Airlie	86c4575a81	radv: decompress fmask before reading using texture unit Before we can read the fmask using the compute shader, we need to decompress the fmask in place. This fixes a bunch of remaining failures and hopefully multisampling in Talos.	2016-10-19 17:39:47 +10:00
Dave Airlie	67c91ef2a2	radv: fix samples_identical return value. This was returning an inversion, so not doing as it should have. We need to compare the fmask value with 0, and return the result from that.	2016-10-19 17:39:01 +10:00
Dave Airlie	93ba86c307	radv: fix wsi porting regression in swapchain destroy. The code in anv is right, there's a pending patch to fix this up different, but I'll sync the code for now.	2016-10-19 13:54:49 +10:00
Dave Airlie	63406b669e	radv: fix fmask ptr issue We were using the wrong descriptor in the fmask picking code.	2016-10-19 13:16:25 +10:00
Dave Airlie	db7ae14b60	radv: simplify fast clear shaders There is no need for anything but a noop shader here.	2016-10-19 13:16:14 +10:00
Dave Airlie	1ec5e6e702	vulkan/wsi: fix out of tree build.	2016-10-19 10:54:42 +10:00
Dave Airlie	b0e11a153c	radv: start using defines for the user sgpr offsets This adds some comments and adds defines for the user sgprs, so that we can move them around easier later and not have to change/revalidate every one of these. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 10:17:48 +10:00
Dave Airlie	6c3bd1cdb3	radv: port to common wsi codebase This drops all the radv WSI code in favour of using the new shared code that was ported from anv This regresses Talos for now, Jason has pointed out the bug is in Talos and we should wait for them to fix it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	3f7ef24889	anv: move to using shared wsi code This moves the shared code to a common subdirectory and makes anv linked to that code instead of the copy it was using. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	ec0bc14a70	anv/wsi: remove all anv references from WSI common code the WSI code should now be clean for sharing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	971523410f	anv: move common wsi code to x11/wayland common files. Next task is to rename all the anv_ out of this, and move to a common location Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	e0d15fbe1d	anv/wsi/wayland: add callback to get device format properties. This avoids having to know the toplevel API name. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	4392de6771	anv/wsi/wl: stop using device in more places Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	507722b882	anv/wsi: split out surface creation to avoid instance API Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	954cd09e66	anv/wsi: move further away from passing anv displays around Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	1720bbd353	anv/wsi: split image alloc/free out to separate fns. This moves these outside the wsi platform code, so we can reuse that code Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:43 +10:00
Dave Airlie	828b8dbce4	anv/wsi: switch to using VkDevice in swapchain Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	6542001345	anv/wsi/x11: more refactoring to use generic handles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	340e72f056	anv/wsi/x11: start refactoring out the image allocation/free functionality Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	c264c272a5	anv/wsi: drop device from get format Just use the wsi_device instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	467d161e6a	anv/wsi: remove device from get_support interface replace with wsi_device and allocator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	b8e7460563	anv/wsi/x11: abstract WSI interface from internals. This allows the API and the internals to be split, and the internals shared. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	36e6be2e0d	anv/wsi/x11: push anv_device out of the init/finish routines Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	7c10258567	anv/wsi: abstract wsi interfaces away from device a bit more. This is a step towards separating out the wsi code for sharing Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	be61fff6da	anv/wsi/x11: push device out of x11 connection fns. just pass the allocator/wsi_interface instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	e9cf7c4460	anv/wsi: drop device from get caps Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	0e4abc3e10	anv/wsi: drop get present modes device arg Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Dave Airlie	32d70c0d66	radv/anv/wsi: drop unneeded parameter Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-19 10:15:42 +10:00
Roland Scheidegger	aeceec54a8	draw: improve vertex fetch (v2) The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	0942fe548e	draw: improved handling of undefined inputs Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	d1b4a3451e	gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf Compilation to actual machine code can easily take as much time as the optimization passes on the IR if not more, so print this out too. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Roland Scheidegger	6f2f0daeb4	gallivm: Use native packs and unpacks for the lerps For the texturing packs, things looked pretty terrible. For every lerp, we were repacking the values, and while those look sort of cheap with 128bit, with 256bit we end up with 2 of them instead of just 1 but worse, plus 2 extracts too (the unpack, however, works fine with a single instruction, albeit only with llvm 3.8 - the vpmovzxbw). Ideally we'd use more clever pack for llvmpipe backend conversion too since we actually use the "wrong" shuffle (which is more work) when doing the fs twiddle just so we end up with the wrong order for being able to do native pack when converting from 2x8f -> 1x16b. But this requires some refactoring, since the untwiddle is separate from conversion. This is only used for avx2 256bit pack/unpack for now. Improves openarena scores by 8% or so, though overall it's still pretty disappointing how much faster 256bit vectors are even with avx2 (or rather, aren't...). And, of course, eliminating the needless packs/unpacks in the first place would eliminate most of that advantage (not quite all) from this patch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-19 01:44:59 +02:00
Dave Airlie	7e1e06bc75	anv: drop pointless struct decl. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	e4df1830e4	radv: drop pointless struct decl. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	4450f40519	radv: move to using shared vk_alloc inlines. This moves to the shared vk_alloc inlines for vulkan memory allocations. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	1ae6ece980	anv: move to using vk_alloc helpers. This moves all the alloc/free in anv to the generic helpers. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	0cfd428aef	vulkan: add vk_alloc.h shared allocation inlines. vulkan allocation allows for overriding the allocator used, add some macros for anv/radv to share for this. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	2c6d8bff03	anv: drop local MIN/MAX macros. Use the ones from mesa, most places already did. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:26 +10:00
Dave Airlie	c6f1077e0d	radv: drop local MIN/MAX macros. Use the ones in macros.h instead. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	78bce52f9a	util: move min/max/clamp macros to util macros.h Although the vulkan drivers include mesa macros.h, for radv I'd like to move away from that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	f5daaba0fd	radv: make use of shared vector helper. This removes the vector code from radv in favour of sharing code with anv. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	8df014c01a	anv: port to using new u_vector shared helper. This just removes the anv vector code and uses the new helper. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Dave Airlie	008f54f63a	util: add vector util code. This is ported from anv, both anv and radv can share this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-19 09:05:25 +10:00
Brian Paul	8b731b8b03	svga: minor code improvements in svga_validate_pipe_sampler_view() Use the 'texture' local var in more places. Rename 'pFormat' to 'viewFormat'. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-10-18 16:16:26 -06:00
Lionel Landwerlin	0ca134aa9f	intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-18 22:43:41 +01:00
Boyuan Zhang	5567145d59	st/va: force to flush the last p frame in idr period During dual instance encoding submission, if the second encode task and first encode task have no reference dependency, e.g. p following with idr-frame, there is a chance the second task will use for its reconstructed picture buffer the same buffer used by first task for its reference/reconstructed picture. In this case, buffer corruption may occur depending on encoding speed. Fix is to force flush these two tasks separately to avoid race condition Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-18 15:16:34 -04:00
Chad Versace	52a6483e8a	egl/surfaceless: Fix segfault in eglSwapBuffers Since commit `63c5d5c6c4`, the surfaceless platform has allowed creation of pbuffer surfaces. But the vtable entry for eglSwapBuffers has remained NULL. Discovered by running a little pbuffer test. Cc: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-18 11:12:22 -07:00
Marek Olšák	21af69e753	radeonsi: rename prefixes from radeon to si Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:08 +02:00
Marek Olšák	6e475fefa1	radeonsi: merge radeon_llvm_context and si_shader_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:06 +02:00
Marek Olšák	5ab25bb4ba	radeonsi: import all TGSI->LLVM code from gallium/radeon Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:04 +02:00
Marek Olšák	4967cacdfa	gallium/radeon: simplify initialization of 64-bit gallivm builders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:03 +02:00
Marek Olšák	502dad4dca	gallium/radeon: remove unused radeon_llvm_reg_index_soa Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:41:01 +02:00
Marek Olšák	4e5d076fcf	radeonsi: move LLVM ALU codegen into radeonsi Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-18 18:40:59 +02:00
Jonathan Gray	41754f743f	genxml: add generated headers to EXTRA_DIST Building the Mesa 12.0.3 distfile failed on a system without python as generated files were not included in the distfile. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	23392abf50	mesa: automake: include mesa_glinterop.h in distfile Add mesa_glinterop.h to the list of headers that will get included in the distfile as it is required to build Mesa itself. Corrects a regression introduced in `a89faa2022`. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	2fc1374be6	egl: remove docs directory from EXTRA_DIST The egl docs directory no longer exists as of `88b5c36fe1`. Remove it from EXTRA_DIST to unbreak 'make dist' Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Jonathan Gray	27572db46d	genxml: avoid using a GNU make pattern rule % pattern rules are a GNU extension. Convert the use of one to an inference rule to allow this to build on OpenBSD. This is a related change to the one made in `e3d43dc5ea` Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-18 17:06:42 +01:00
Emil Velikov	9898c60745	configure.ac: use a single require_libdrm helper Rather than having 4-5 places which do the explicit check/message just polish the gallium helper and use it everywhere. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:38 +01:00
Emil Velikov	3e079c3f86	configure.ac: remove no longer needed *_pci_id logic Previously it was used to differentiate between the different codepaths in the loader. Although strictly speaking the (core) of the loader is only used when a hardware device is available. The latter of which in itself requires libdrm (one of the codepaths available). That said, all the configure toggles which relate to enabling/using hw device should attribute and require libdrm, so there's no need to keep this code around. With this gallium_require_drm_loader becomes an empty stub, so nuke that one as well. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:35 +01:00
Emil Velikov	47b5925d9b	loader: cleanup copyright section With previous patches nearly all the original code (as seen in the various loaders) is gone. Update the copyright/license section to reflect that. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:32 +01:00
Emil Velikov	af7abc512c	loader: remove loader_get_driver_for_fd() driver_type Reminiscent from the pre-loader days, where we had multiple instances of the loader logic in separate places and one could build a "GALLIUM_ONLY" version. Since that is no longer the case and the loaders (glx/egl/gbm) do not (and should not) require to know any classic/gallium specifics we can drop the argument and the related code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:29 +01:00
Emil Velikov	f9f7e44c94	loader: remove final sysfs codepath in loader_get_device_name_for_fd() Effectively everyone with actual hardware and/or requesting the "device_name" requires a working libdrm. Thus they could/should already be using the (now only) codepath. Apart from the code simplification, we can slim down our configure.ac even further. But that will be done in separate patch(es). Cc: Gary Wong <gtw@gnu.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:26 +01:00
Emil Velikov	4f1c33fd9d	travis: remove no longer needed libudev-dev dependency Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:24 +01:00
Emil Velikov	cb23fba3f3	scons: remove all libudev references Analogous to previous automake/autoconf commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:21 +01:00
Emil Velikov	4a183f4d06	scons: loader: use libdrm when available Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:18 +01:00
Emil Velikov	0607c5b1b0	gbm: remove superfluous/incorrect udev comment The gbm_device_get_backend_name() provides a (somewhat) internal name of the implementation/backend used. It has nothing to do with udev, one cannot and should not attempt to derive the name from it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:15 +01:00
Emil Velikov	6b21fdaa8f	automake: remove all the libudev references As of last commit nothing in mesa depends on libudev. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:12 +01:00
Emil Velikov	1e2e625e30	loader: remove libudev_get_device_name_for_fd and related code With this all the libudev related code is now gone. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:10 +01:00
Emil Velikov	fcdc02f512	loader: reimplement loader_get_user_preferred_fd via libdrm Currently not everyone has libudev and with follow-up patches we'll completely remove the divergent codepaths. Use the libdrm drm device API to construct the required ID_PATH_TAG-like string, to preserve the current functionality for libudev users and allow others to benefit from it as well. v2: Drop ranty comments, pick the correct device v3: \n -> \0 in PCI_ID_PATH_TAG_LENGTH comment (Axel). v4: Use snprintf (Nicolai) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-18 17:06:10 +01:00
Emil Velikov	8222100631	loader: annotate __driConfigOptionsLoader as static Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:07 +01:00
Emil Velikov	d561e064a8	loader: separate USE_DRICONF code into separate function Improves readability and allows us to do further cleanups a lot easier. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:04 +01:00
Emil Velikov	be239326aa	loader: slim down loader_get_pci_id_for_fd implementation(s) Currently mesa has three code paths in the loader - libudev, manual sysfs and drm ioctl one. Considering the issues we had with libudev - strip those down in favour of the libdrm drm device API. The latter can be implemented in any way depending on the platform and can be reused by others. v2: Use correct message on drmGetDevice failure. (Nicolai) Cc: Jonathan Gray <jsg@jsg.id.au> Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-18 17:06:04 +01:00
Emil Velikov	fd00aba5f4	configure.ac: mark libdrm as have_pci_id provider With follow on work, we'll untangle and simplify all the different codepaths in loader. Then again, we forget to set have_pci_id when libdrm is present (one of the codepaths available). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 17:06:01 +01:00
Ilia Mirkin	8c78fdb328	gm107/ir: fix bit offset of tex lod setting for indirect texturing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-18 09:56:14 -04:00
Ilia Mirkin	ecea2f69ef	gm107/ir: fix texturing with indirect samplers The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-18 09:56:14 -04:00
Marek Olšák	34099894c3	gallium/tgsi: add missing #include Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-18 11:20:57 +02:00
Julien Isorce	dbc8e18116	st/va: set default rt formats when calling vaCreateConfig As specified in va.h, default value should be set on attributes not present in the input list. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Mark Thompson <sw@jkqxz.net>	2016-10-18 08:44:14 +01:00
Kenneth Graunke	9f677d6541	i965: Fix gl_InvocationID in dual object GS where invocations == 1. dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations draws using a geometry shader that specifies layout(points, invocations = 1) in; and then uses gl_InvocationID. According to the Haswell PRM, the "GS Instance ID 0" (and 1) thread payload fields are undefined in dual object mode: "If 'dispatch mode' is DUAL_OBJECT this field is not valid." But there's no point in using them - if there's only one invocation, the ID will be 0. So just load a constant. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 20:22:02 -07:00
Jason Ekstrand	52904ba85c	anv: Get rid of anv_cmd_buffer_emit_state_base_address All code that would have once called this can now call the gen-specific version. The switching version is no longer needed. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	7998e37774	anv/cmd_buffer: Move descriptor flushing into genX_cmd_buffer.c It really should have gone here all along. We were trying a bit too hard to make it gen-agnostic just because it didn't have any #if's. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	eddaa237c0	anv/cmd_buffer: Expose ensure_push_constant_* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	1f3e6468d2	anv/cmd_buffer: Unify flush_compute_state across gens With one small genxml change, the two versions were basically identical. The only differences were one #define for HSW+ and a field that is missing on Haswell but exists everywhere else. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	2314c9ed2e	anv/cmd_buffer: Move Begin/End/Execute to genX_cmd_buffer.c vkBeginCommandBuffer and vkCmdExecuteCommands both call into the gen-specific emit_state_base_address function and vkEndCommandBuffer belongs with begin. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	ac0ca066de	anv/cmd_buffer: Move state base address re-emit into ExecuteCommands This has two primary advantages. First, it means that the batch_chain code knows less about the actual command buffer contents which is good because improves separation. Second, it means that it only gets re-emitted once after all of the secondaries instead of once after each secondary which is just wasteful. It also has the advantage of cleaning the code up a bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Edward O'Callaghan	1c05f92590	doc/features.txt: factor out radeonsi as GL45 complete V2. add i965/hsw+ to list V3. rebased on master. V4. 'DONE' -> 'DONE ()'. V5. remove i965/hsw+ from list :/ Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-18 09:55:45 +11:00
Ian Romanick	89e1436e2d	i965: Silence unused parameter warnings brw_link.cpp:76:44: warning: unused parameter ‘shader_type’ [-Wunused-parameter] gl_shader_stage shader_type, ^ brw_nir.c: In function ‘brw_nir_lower_vs_inputs’: brw_nir.c:194:55: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo, ^ brw_vec4_visitor.cpp:914:37: warning: unused parameter ‘sampler’ [-Wunused-parameter] uint32_t sampler, ^ brw_vec4_visitor.cpp:1146:34: warning: unused parameter ‘stream_id’ [-Wunused-parameter] vec4_visitor::gs_emit_vertex(int stream_id) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	7c0c3740f0	glsl: Remove unused function import_prototypes Once upon a time, this was used to extract prototypes from the shader containing GLSL built-in functions. This was removed by `f5692f45` in November 2010 for Mesa 7.10. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	5c025ea6fc	glsl: Remove prototypes for nonexistent functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Ian Romanick	fde48c1262	glsl: Replace assert with unreachable Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-17 11:32:03 -07:00
Lionel Landwerlin	696f5c1853	anv: replace , with ; in anv_batch_emit() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-17 18:16:38 +01:00
Lionel Landwerlin	6b17e3a6da	intel: aubinator: use different colors to signal batch start/end This makes the stream of commands a bit easier to read. v2 (Ken): Use bold text on green headers for easier readability; swap the green and blue headers so the majority stay blue. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-17 18:16:38 +01:00
Nicolai Hähnle	c3ce0d22b4	st/glsl_to_tgsi: fix [ui]vec[34] conversion to double The corresponding opcodes for integers need to be treated the same as F2D. Fixes GL45-CTS.gpu_shader_fp64.conversions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:45 +02:00
Nicolai Hähnle	1dd99a15a4	st/glsl_to_tgsi: fix atomic counter addressing When more than one atomic counter buffer is in use, UniformStorage[n].opaque is set up to contain indices that are contiguous across all used buffers. This appears to be used by i965 via NIR, but for TGSI we do not treat atomic counter buffers as opaque, so using the data in the opaque array is incorrect. Fixes GL45-CTS.compute_shader.resource-atomic-counter. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:42 +02:00
Nicolai Hähnle	9d6f82320c	st/glsl_to_tgsi: fix a corner case of std140 layout in uniform buffers See the comment in the code for an explanation. This fixes GL45-CTS.buffer_storage.map_persistent_draw. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:39 +02:00
Nicolai Hähnle	57a1514203	st/mesa: fix fragment shader output mapping Properly handle the case where there is a gap in the assigned output locations, e.g. a fragment shader writes to color buffer 2 but not to color buffers 0 & 1. Fixes GL45-CTS.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:37 +02:00
Nicolai Hähnle	e0213f36bb	glsl: print non-zero bindings of variables Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:09:33 +02:00
Nicolai Hähnle	9160b4d981	radeonsi: unify the constant load paths Remove the split between direct and indirect. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:08:45 +02:00
Nicolai Hähnle	51f9b38ce8	radeonsi: fix indirect loads of 64 bit constants This fixes GL45-CTS.compute_shader.fp64-case3. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-17 19:08:36 +02:00
Eric Engestrom	e9864f93c6	gbm: add a couple missing includes Needed for memset() and drmIoctl(). Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-17 08:47:38 -07:00
Iago Toral Quiroga	8785a8ff89	glsl: fail compilation of compute shaders when unsupported Generally, we only check for the presence of compute shaders during parsing when we find any language (like layout qualifiers) that is specific to compute shaders. However, it is possible to define an empty compute shader that does not use any language specific to compute shaders at all, and we should fail the compilation anyway. dEQP checks this. This patch adds a check for compute shader availability after we have parsed the source code. At this point we know the effective GLSL version and also extensions enabled in the shader. Fixes a subcase of the following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader The tests still fail because there is one more subcase that fails that needs another fix. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-17 15:14:12 +02:00
Tapani Pälli	3d48353e29	egl/android: fix error in droid_add_configs_for_visuals() This was some kind of leftover in commit `acd35c8` and format_count array variable (declared in outer scope) should be used instead. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `acd35c8758` ("egl/android: tweak droid_add_configs_for_visuals()") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-17 11:51:15 +01:00
Marek Olšák	74d145f4a8	radeonsi: shorten "shader->selector" to "sel" in si_shader_create Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-17 12:13:00 +02:00
Marek Olšák	2e74e8ead9	radeonsi: clear DB_RENDER_OVERRIDE Vulkan doesn't set these fields even though it doesn't use HiS. HiS is disabled by programming DB_SRESULTS_COMPARE_STATEn to 0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-17 12:13:00 +02:00
Kenneth Graunke	f30f48476f	glsl: Disable textureOffset(sampler2DArrayShadow, ...) in GLSL ES. This has apparently never existed in GLSL ES. Fixes dEQP-GLES3.functional.shaders.texture_functions.invalid .textureoffset_sampler2darrayshadow_vec4_ivec2_vertex and .textureoffset_sampler2darrayshadow_vec4_ivec2_fragment Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-16 15:05:00 -07:00
Axel Davy	9baf4505fb	st/nine: Fix multisample limit check Fixes regression introduced by `b560305687`. The regression prevents some apps from starting. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-17 00:02:52 +02:00
Eric Anholt	c61eb3c91c	vc4: Fix fast clear color packing for 565. Piglit didn't manage to cover this because fbo-clear-formats uses scissors, so we don't get fast clearing.	2016-10-16 11:22:50 -07:00
Eric Anholt	46cd3bab93	state_tracker: Fix check for scissor enabled when < 0. DEQP's clear tests like to give us x + w < 0 or y + h < 0. Since we were comparing to an unsigned, it would get promoted to unsigned and come out as bignum >= width or height and we would clear the whole fb instead of none of the fb. Fixes 10 tests under deqp-gles2/functional/color_clear. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-16 11:22:50 -07:00
Chad Versace	07422bf32b	egl/surfaceless: Fix comparison between pointer and integer Fixes GCC warning: drivers/dri2/platform_surfaceless.c:196:18: warning: comparison between pointer and integer Fixes: `4b8a55809e` ("egl/surfaceless: tweak surfaceless_add_configs_for_visuals()") Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-10-16 09:03:31 -07:00
Emil Velikov	d19b014b77	egl/surfaceless: use correct index when accessing the visual i is used for the driver_configs, while j is for the visuals. Fixes: `4b8a55809e` ("egl/surfaceless: tweak surfaceless_add_configs_for_visuals()") Reported-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-16 09:03:27 -07:00
Gustaw Smolarczyk	36cb5508e8	radv/winsys: Fail early on overgrown cs. When !use_ib_bos, we can't easily chain ibs one to another. If the required cs size grows over 1Mi - 8 dwords just fail the cs so that we won't assert-fail in radv_amdgpu_winsys_cs_submit later on. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-16 12:38:53 +02:00
Kenneth Graunke	493237d4ee	glsl: Drop the ES requirement that VS outputs must be flat qualified. Several conformance tests violate this requirement: ES31-CTS.core.tessellation_shader.max_patch_vertices ES31-CTS.core.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through I submitted a merge request to fix the conformance tests, but Khronos opted to drop this GLSL ES specific requirement in favor of making flat qualification of VS outputs optional, matching modern desktop GL. Note that there were 7 Piglit tests which enforce this rule: tests/spec/glsl-es-3.00/compiler/interpolation/qualifiers/nonflat but these were deleted in Piglit commit acc0a2fabbd714bc704c16f1675e7c0. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15465#c7 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-15 13:47:47 -07:00
Jason Ekstrand	6ef5a44a43	intel/genxml: Make some PIPE_CONTROL fields booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:50 -07:00
Jason Ekstrand	f34de3e8b0	intel/genxml: Make "Predication enable" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:46 -07:00
Jason Ekstrand	468e1042cb	intel/genxml: Make "Use Global GTT" a boolean We also remove the redundant zero defaults since everything without an explicit default gets zeroed automatically. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:43 -07:00
Jason Ekstrand	ce86227175	intel/genxml: Make "Tiled Surface" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:39 -07:00
Jason Ekstrand	e6f9637d8a	intel/genxml: Make "SO Buffer Enable" fields boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:36 -07:00
Jason Ekstrand	fa0285eaac	intel/genxml: Make "Stencil Buffer Enable" a boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:30 -07:00
Jason Ekstrand	34826078f6	intel/genxml: Make a couple of STREAMOUT fields booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:26 -07:00
Jason Ekstrand	6a064ad01d	intel/genxml: Make "Include Vertex Handles" and "Include Primitive ID" booleans Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:23 -07:00
Jason Ekstrand	f21d3b4d01	intel/genxml: Make "Vector Mask Enable" a boolean We also get rid of the "(VME)" a few places Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:19 -07:00
Jason Ekstrand	aee501c87e	intel/genxml: Make "Single Program Flow" a boolean We also get rid of the "(SPF)" a few places. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-15 12:20:14 -07:00
Tobias Klausmann	b7d9677de8	nv50/ir: constant fold OP_SPLIT Split the source immediate value into new values and move them into the original defs set by the split. Since we can only have up to 64-bit immediates, this is largely beneficial for F64 (and, in the future, U64) operations. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: always use U32, set newi for foldCount tracking] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-14 23:23:57 -04:00
Kenneth Graunke	75128d6ffd	i965: Enable OpenGL 4.5. Everything is in place. There are still conformance issues to sort out, but we may as well turn it on in master. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 17:35:13 -07:00
Jason Ekstrand	9d65595c06	anv/pipeline: Remove a meta hack from emit_ds_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	69b2e931d4	anv/image: Create views directly in VkCreate*View Without meta, we no longer need the _init helpers and the ability to back an image view with surface states allocated out of the command buffer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	0a2c375af9	anv/image: Get rid of the usage hacks for meta Now that meta is gone and we're using blorp, we don't need all of the usage hacks. Instead, the usage provided by the app is exactly the usage that we want because the app is the only thing creating image views. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	8e1a8dd47e	anv: Move CreatePipelines into genX_cmd_buffer.c Now that we don't have meta, we have no need for a gen-agnostic pipeline create path. We can, instead, just generate one CreatePipelines function per gen and be done with it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	7df46b7533	anv/pipeline: Remove support for direct-from-nir shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	6d557ae403	anv: Make entrypoint resolution take a gen_device_info In order for things such as the ANV_CALL and the ifuncs to work, we used to have a singleton gen_device_info structure that got assigned the first time you create a device. Given that the driver will never be used simultaneously on two different generations of hardware, this was fairly safe to do. However, it has caused a few hiccups and isn't, in general, a good plan. Now that the two primary reasons for this singleton are gone, we can get rid of it and make things quite a bit safer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	4c9dec80ed	anv: Get rid of the ANV_CALL macro This macro was needed by meta in order to make gen-specific calls from gen-agnostic code. Now that we don't have meta, the remaining two uses are fairly trivial to get rid of. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	ac77528f7d	anv: Get rid of graphics_pipeline_create_info_extra Now that we no longer have meta, all pipelines get created via the normal Vulkan pipeline creation mechanics. There is no more need for this bit of extra magic data that we've been passing around. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	dedc406ec8	anv: Get rid of meta Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:40:39 -07:00
Jason Ekstrand	d823f92970	anv: Use blorp for subpass clears Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	51faab487f	anv: Use blorp for ClearAttachments Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	c9eaf12de2	anv/hiz: Perform HiZ resolves for all partial renders If we don't, we can end up with corruption in the portion of the depth buffer that lies outside the render area when we do a HiZ resolve at the end. The only reason we weren't seeing this before was that all of the meta-based clears such as VkCmdClearDepthStencilImage were internally using HiZ so the HiZ buffer never truly got out-of-sync. If the CTS ever tested a depth upload (which doesn't care about HiZ) and then a partial render we would have seen problems. Soon, we will be using blorp to do depth clears and it won't bother with HiZ so we would get CTS regressions without this. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	58f2315c38	anv: Use blorp for ClearDepthStencilImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	29e289fa65	anv/image: Add an isl_view to anv_image_view Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0340548c8e	anv/image: Rework our handling of 3-D image array ranges Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	146ee31159	anv/blorp: Don't hand-roll flush_pipeline_select_3d When I initially brought up Vulkan blorp, I completely missed that this was already factored out. There's no good reason for us to hand-roll it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	d80c0307ea	intel/blorp: Add a flag to make blorp not re-emit depth/stencil buffers In Vulkan, we want to be able to use blorp to perform clears inside of a render pass. If blorp stomps the depth/stencil buffers packets then we'll have to re-emit them. This gets tricky when secondary command buffers get involved. Instead, we'll simply guarantee that the depth and stencil buffers we pass to blorp (if any) match those already set in the hardware. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0cabf93b80	intel/blorp: Add an entrypoint for clearing depth and stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	82a2c49c5f	intel/blorp: Emit a NULL render target for depth/stencil-only operations This never mattered before because the only time we used blorp depth/stencil only was to do HiZ operations on gen6-7. It may have worked in that case (and maybe it didn't) but slow depth clears actually do depth rendering so they need a valid render target. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	b324c38ae3	intel/blorp: Allow for running without a PS on gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	81be7be119	intel/blorp: Add an "enabled" bit to surface_info This gives a slightly smarter way to check whether or not a particular surface exists than looking at the address. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	bc4bb5a7e3	intel/blorp: Emit more complete DEPTH_STENCIL state This should now set the pipeline up properly for doing depth and/or stencil clears by plumbing through depth/stencil test values. We are now also emitting color calculator state for blorp operations without an actual shader because that is where the stencil reference value goes pre-SKL. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	7017742ad7	intel/blorp: Unify the DEPTH_STENCIL emit code across gens Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	cf2e3c3163	intel/blorp: Simplify depth/stencil config The newly reworked depth/stencil config code can properly handle having depth, stencil, both, or neither. We no longer need to predicate it on having depth or stencil. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	0414aaa133	intel/blorp: Set QPitch for depth and HiZ on gen8+	2016-10-14 15:39:41 -07:00
Jason Ekstrand	563fa63bf2	intel/blorp: Add support for binding an actual stencil buffer While we're here, we also make depth without HiZ work. v2: - Use the correct surface type for 1-D on SKL+ - Set QPitch on BDW+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	f180faab79	intel/blorp: Move CLEAR_PARAMS setup into emit_depth_stencil_config Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	c1fcf1a957	intel/genxml: Add a uint MOCS field to 3DSTATE_STENCIL_BUFFER Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-14 15:39:41 -07:00
Jason Ekstrand	5dacd3caee	intel/blorp: Make the Z component of the primitive adjustable We want to be able to start doing slow depth clears with blorp. This allows us to adjust the depth we're clearing to. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Emil Velikov	7cb197c3a8	i915: workaround multiple intelFenceExtension definitions Due to conflicting symbol names (between i915 and i965) in the megadriver, we use a set of defines in i915/intel_screen.h. With a recent commit we've introduced a symbol intelFenceExtension which has different implementation for each driver, yet we forgot to add the define. Fixes: `d11515ff1b` ("i915/sync: Implement DRI2_Fence extension") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98264 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 19:22:16 +01:00
Chad Versace	cb836b673c	docs/specs: Update allocated EGL enum values Document the EGL enum ranges for Mesa and those values allocated by the following extensions: EGL_MESA_drm_image EGL_MESA_platform_gbm EGL_MESA_platform_surfaceless EGL_WL_bind_wayland_display Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:41 -07:00
Chad Versace	0cfa34c102	doc/specs: Reference the Khronos registry XML Years ago Khronos replaced the registry's spec files with newfangled XML files. Update the reference in doc/specs/enum.txt accordingly. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	88b5c36fe1	egl: Move old EGL_MESA_screen_surface spec It was the lone file in src/egl/docs. Move it to where the other specs live, in $MESA_TOP/docs/specs. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	a597c8ad5b	egl: Implement EGL_MESA_platform_surfaceless Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:40 -07:00
Chad Versace	c177ef9d47	egl: Don't advertise unsupported platform extensions Mesa's set of supported platform extensions depends on the autoconf option --with-egl-platforms=foo,bar,baz. If --with-egl-platforms lacks foo, then eglGetPlatformDisplay(EGL_PLATFORM_FOO, ...) unconditionally fails. So, if --with-egl-platforms lacks foo, then remove EGL_VENDOR_platform_foo from the EGL client extension string. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:27 -07:00
Chad Versace	27f4e38173	docs: Add EGL_MESA_platform_surfaceless.txt (v2) v2: - Assign enum values. - Define interactions with EGL_EXT_platform_base and EGL 1.4. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 11:19:13 -07:00
Ian Romanick	4246986dec	i965: Sort some extension names Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-10-14 11:16:59 -07:00
Jose Fonseca	b12606b693	scons: Fix the Python dependency scanner. modulefinder wasn't searching for dependencies in the script dir. It's not capable of detecting the sys.path manipulations scripts do internally either. This change fixes the first issue, and hacks around the second. Honestly, I've come to the conclusion that automatic Python dependency detection will always be too brittle. I think we should start manually typing the dependencies like we do in automake. At the very least it will enable any person to eyeball and spot/fix missing dependencies, without digging into SCons internals.	2016-10-14 16:52:13 +01:00
Jose Fonseca	c6d17701c8	pipe_loader_sw: Don't invoke Unix close() on Windows. Trivial.	2016-10-14 16:29:04 +01:00
Emil Velikov	ebffa7b6af	Revert "egl/dri2: rework dri2_make_current code flow" This reverts commit `675719817e`.	2016-10-14 16:07:33 +01:00
Mauro Rossi	6eacd69b6f	i915: store reference to the context within struct intel_fence (v2) Porting of the corresponding patch for i965. Here follows the original commit message by Tomasz Figa: "As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad." NOTE: in i915 all references to 'brw' are replaced by 'intel' Marshmallow-x86 boots ok with the following results of Android CTS. Android CTS 6.0_r7 build:2906653 Session Pass Fail Not Executed 0(EGL) 1410 24 0 1(GLES2) 13832 82 0 I get the same results as per i965GM. [Emil Velikov: Include Mauro's test results] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:57 +01:00
Mauro Rossi	d11515ff1b	i915/sync: Implement DRI2_Fence extension Here is the porting of corresponding patch for i965, i.e. commit `c636284` i965/sync: Implement DRI2_Fence extension Here follows part of original commit message by Chad Versace: "This enables EGL_KHR_fence_sync and EGL_KHR_wait_sync." Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:53 +01:00
Mauro Rossi	19fa29a592	i915/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync' This is the porting of corresponding patch for i965, i.e. commit `2516d83` i965/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync' The only difference compared to i965 one is that intel_check_sync() was renamed to intel_gl_check_sync() here, as it is more appropriate. Here follows original commit message by Chad Versace: "I'm about to implement DRI2_Fence in intel_syncobj.c. To prevent madness, we need to prefix functions for GL_ARB_sync with 'gl' and functions for DRI2_Fence with 'dri'. Otherwise, the file will become a jumble of similarly named functions. For example: old-name: intel_client_wait_sync() new-name: intel_gl_client_wait_sync() soon-to-come: intel_dri_client_wait_sync() I wrote this renaming commit separately from the commit that implements DRI2_Fence because I wanted the latter diff to be reviewable." [Emil Velikov: rename the outstanding intel_sync instances] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 15:43:22 +01:00
Emil Velikov	284795616a	egl/drm: set eglError and provide an error message on failure v2: Remove gratuitous newline/semicolon (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	d81ba763e3	egl/x11: attribute for dri2_add_config failure ... in dri2_x11_add_configs_for_visuals(). Currently the latter does not consider that, thus in such cases it adds "empty" configs in the list. Properly account for things and as we do that we can reuse count, instead of calling _eglGetArraySize to determine if we've added any configs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	0b2b719121	egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper Analogous to previous commits - with an extra bonus. Current code, apart from not attributing the lack of 'per visual' and overall configs also overwrites the newly added config. Namely if the dpy supports two or more of the supported formats (XRGB8888, ARGB8888 and RGB565) earlier configs will be overwritten and the final one will be stored, since we use the same index for all three in our dri2_add_config call. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	4b8a55809e	egl/surfaceless: tweak surfaceless_add_configs_for_visuals() Analogous to previous commit. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	acd35c8758	egl/android: tweak droid_add_configs_for_visuals() Iterate over the driver_configs first in order to cut down the number of getConfigAttrib() calls by a factor of 5. While we're here, also drop the sentinel of the visuals array. We already know its size so we can use that and save a few bytes. v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	36fe5900a4	egl/drm: introduce drm_add_configs_for_visuals() helper Factor out and rework the existing code so that it prints a debug message if we have zero configs for any visual. As a nice side effect we now provide a correct (sequential ID) when creating a config (via dri2_add_config). v2: Use correct comparison in loop conditional (Eric) Use valid C initializer (Gurchetan) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	23ed073aa4	egl/surfaceless: print out a message on zero configs for given format Currently we print a debug message if the total configs is non-zero only to do the same (at an error level) as we return from the function. Rework the message to print if we're missing a config for the given format. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	98f5d0106a	egl/dri2: set WL_bind_wayland_display in a consistent way Introduce a helper and use it throughout the platform code. This allows us to reduce the amount of ifdef(s) and (potentially) use kms_swrast_dri.so for !drm platforms (namely wayland and x11). Note: in the future as other platforms (android, surfaceless) support the extension they can reuse the helper. v2: Rebase, check for device_name. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:39 +01:00
Emil Velikov	637d001a97	egl/android: remove duplicate KHR_image_base set The core egl/dri2 already sets the extension bit _only_ when possible - which in Android's case is always. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:53:38 +01:00
Emil Velikov	9caacb39b9	loader/dri3: constify the loader_dri3_vtable Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:53:35 +01:00
Emil Velikov	fdd373acca	egl/dri2: micro optimise dri2_bind_extensions() Do not loop over all matches if we've already found one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:09 +01:00
Emil Velikov	665cad1658	egl/dri2: annotate dri2_extension_match instances as const data v2: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:05 +01:00
Emil Velikov	3948ad82ce	egl/dri2: use dri2_bind_extensions to manage the optional extensions v2: dri2_bind_extensions() now takes optional as an argument. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:46:03 +01:00
Emil Velikov	d5342c6ff2	gbm: rename gbm_dri_device::{,loader_}extensions To align with the name used in the EGL and GLX loaders. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:45:54 +01:00
Emil Velikov	38526bd468	egl/dri2: add support for optional extensions in dri2_bind_extensions() Will allow us to reuse the function for optional extensions and fold a bit of code. v2: Make dri2_bind_extensions::optional flag an argument to dri2_bind_extensions (Kristian). Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:45:24 +01:00
Emil Velikov	ebc68e3849	egl/dri2: coding style cleanup Consistently indent with space rather than a mix of tab and spaces. v2: Keep the structs properly aligned (Eric). Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:43:57 +01:00
Emil Velikov	b10c05d4ff	egl/x11: don't crash if dri2_dpy->conn is NULL The dri3 version of commits `60e9c35b3a` and `6de9a03bed`. While using xcb_connect() guarantees that we always get a non NULL return value, XGetXCBConnection() does/can not. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:37 +01:00
Emil Velikov	f871946594	egl/dri2: rework dri2_egl_display::extensions storage Remove the error prone fixed size array. While we're here also rename to loader_extensions like in the GLX code. v2: Rebase. Keep image_loader_extension within the wayland_drm dri2_loader_extensions list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:22 +01:00
Emil Velikov	f7b8108289	egl/dri2: remove unused dri2_egl_display::{dri2,swrast}_loader_extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:18 +01:00
Emil Velikov	e7fcf1b09b	egl/x11: don't populate dri2_dpy->swrast_loader_extension Analogous to earlier commits. Note: the actual version of the extension is 1, since it does not implement .putImage2. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:02 +01:00
Emil Velikov	2dbe14af1e	egl/wayland: don't populate dri2_dpy->swrast_loader_extension Similar to the dri2 one - the extension stored in struct dri2_egl_display is unused. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:42:00 +01:00
Emil Velikov	3963a5fc94	egl/x11: don't populate dri2_dpy->dri2_loader_extension Analogous to the earlier android and wayland patches. As we're here we can drop exposing the old version of the extension. Any dri loader/driver interface use lower bound checking thus exposing dri2 loader v3 to a v2 capable driver is perfectly normal. v2: Preserve compat with dri2_minor < 1. The driver does not know if there is a protocol to manage getBuffersWithFormat(). It's up-to the loader to expose the vfunc if there is one. (Kristian) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:56 +01:00
Emil Velikov	d2d579da7e	egl/wayland: don't populate dri2_dpy->dri2_loader_extension Analogous to the earlier android patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:51 +01:00
Emil Velikov	31ef5d4452	egl/surfaceless: trivial coding style fixes Remove a few gratuitous blank lines and use the correct level of indentation. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:48 +01:00
Emil Velikov	d0155bcbe8	egl/surfaceless: don't check the mask(s) prior to calling dri2_add_config The latter already does it for us. As we're here annotate the masks as const and use unsigned for the index(es). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2016-10-14 12:41:43 +01:00
Emil Velikov	ff700f8c22	egl/surfaceless: remove unused dri2_loader_extension implementation Earlier commit introduced support for image_loader and left the dri2_loader code dangling/unused. Let's remove it. Fixes: `63c5d5c6c4` ("Added pbuffer hooks for surfaceless platform") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2016-10-14 12:17:18 +01:00
Emil Velikov	6a8fe32430	egl/android: don't populate dri2_dpy->dri2_loader_extension The extension stored in struct dri2_egl_display isn't used, thus we can create a static const instance of the extension and point extensions[] to it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	675719817e	egl/dri2: rework dri2_make_current code flow Fold duplicate conditional blocks and add a few extra comments ;-) v2: Bring back the explicit "unbind" logic (Eric), remove NULL derefs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	07690a289a	egl/dri2: drop NULL checks prior to dri2_destroy_surface The function already has the respective check within. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 12:17:18 +01:00
Emil Velikov	8cf83f9c08	egl/dri2: call static functions directly, not via _EGLDriver::API The indirection is meant to be used by the core EGL implementation in main. Not in the drivers themselves. Move the dri2_destroy_surface definition to avoid forward declaration of the static function. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:08 +01:00
Emil Velikov	532ec2edd8	egl/dri2: use dri2_egl_display inline wrapper where possible This way the only places that reference DriverData are the ones that manipulate it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:07 +01:00
Emil Velikov	d6dcf3b4ca	egl/dri2: bail out on NULL dpy in dri2_display_release() Currently all callers are careful enough not to do that, yet that will not be the case in the future. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:06 +01:00
Emil Velikov	8fb9ea413d	egl/dri2: move surface refcounting out of the platform code All the platforms are duplicating what should be a driver/dri2 thing - refcounting. Just fold it accordingly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:05 +01:00
Emil Velikov	02f1158746	egl/dri2: coding style fix Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:04 +01:00
Emil Velikov	7a9c92d071	egl/dri2: non-shared glapi cleanups For a while now we require shared glapi for EGL, thus we can drop a few bits from the olden days. Namely - dlopen(NULL...) is not possible, error out at build stage if so and drop the guard around dlclose(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:03 +01:00
Emil Velikov	b349c11098	egl/dri2: glFlush is not optional, treat it as such The documentation is clear - one must glFlush the old context on eglMakeCurrent. Thus keeping it optional is not something we should be doing. Furthermore if we cannot get the entry point we're likely having a broken setup/stack. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-14 12:16:00 +01:00
Emil Velikov	13bf390657	aubinator: replace pragma once with ifndef guard Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:45 +01:00
Emil Velikov	ae6fb9c922	anv: error out if anv_genX.h is included by !anv_private.h Update the comment to reflect the correct filename and add a guard to catch incorrect inclusion of the header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:43 +01:00
Emil Velikov	08efa6a19f	anv: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:41 +01:00
Emil Velikov	76ae842366	intel/genxml: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:39 +01:00
Emil Velikov	72e70c00f3	intel/common: use correct header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:37 +01:00
Emil Velikov	0d86c92dcb	intel/blorp: use correct header guards Avoid the discouraged use of pragma once and a missing guard for blorp_genX_exec.h. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:34 +01:00
Emil Velikov	3a98bffa59	isl: use ifndef header guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:32 +01:00
Emil Velikov	4c1c9d62a9	isl: make locally used functions static Signed-off-by: Emil Velikov <emil.velikov@collabra.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:30 +01:00
Emil Velikov	4fe6e7f2bd	isl: trivial include-what-you-want cleanups Noticed while skimming through the files. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:28 +01:00
Emil Velikov	eac752e54b	isl/gen7: remove unneeded ISL_DEV_GEN check The function gen7_format_needs_valign2 has two callers - the gen7 only gen7_choose_valign_el() and isl_gen6_filter_tiling(). The latter already guards the invocation appropriately. To be extra cautious add a couple of asserts alongside the removal of the runtime check. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:25 +01:00
Emil Velikov	5b1efb65ce	isl: prefix non-static API with isl_ The rest of ISL already follows this approach. Be consistent and resolve the final references. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:22 +01:00
Emil Velikov	84f9ef1de4	isl/gen6: correctly check msaa layout samples count Samples == 1 is a valid value, so returning false is plain wrong. Seeming copy/paste typo introduced since day 1. Fixes: `afdadec77f` ("isl: Implement isl_surf_init() for gen4-gen9") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-14 11:53:15 +01:00
Emil Velikov	c572360c30	automake: add radv to the `make distcheck' hooks Will allow us to catch issues (as fixed with previous patches) rather than release a broken tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	3fd0cafc1c	radv: move AMDGPU_LIBS later in the link chain At the moment (albeit unlikely) one could get link-time issues, since libdrm_amdgpu.so is before its users in the link chain. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	a8a5f0a025	radv: correct variable name VISIBILITY_{, C}FLAGS The letter C was missing, thus in turn all the internal symbols were exported. As a result we hide ~150 symbols and cut ~36K from libvulkan_radeon.so. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-10-14 11:09:00 +01:00
Emil Velikov	753a9c989f	amd/addrlib: hide private symbols via VISIBILITY_CXXFLAGS Private/internal symbols should not be exported. Using the CXXFLAGS cuts ~300 exported symbols and ~23K from libvulkan_radeon.so. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	72fa5ca06d	intel: automake: replace direct basename $@ invocation with $(@F) Use the shorthand make variable(s) as elsewhere in the build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	48267b730c	gallium: annotate sw_driver_descriptor instance as const data Already treated and handled as such. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	792148f16a	gallium: annotate drm_driver_descriptor instance as const data Already treated and handled as such. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	c079a206ad	gallium: rename drm_driver_descriptor::{, driver_}name Historically we use "device name" for the name of the kernel module and "driver name" for the dri/other driver. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	9837cf13b1	gallium: remove unused drm_driver_descriptor::driver_name Likely unused since day 1, although I've only checked back until the st/dri unification with commit `29ca7d2c94` ("st/dri: merge dri/drm and dri/sw backends") Based on the comment, referencing drmOpenByName it's not something we want to bring back. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	0f031dcf11	gallium: fix drm_driver_descriptor::name comment Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-14 11:09:00 +01:00
Emil Velikov	c85b34ffd0	mesa_glinterop: allow building without X and related headers This commit effectively reverts `c10dcb2ce8` and fixes the typedef redefinition which inspired it. In order to prevent requiring X packages at build time earlier commit forward declared the required X/GLX typedefs. Since that approach introduced typedef redefinition (a C11 feature) it was reverted. To avoid the redefinition while _not_ mandating X and related headers forward declare the structs and use those through the header. As anyone uses the mesa interop header they ensure that the X (or others in terms of EGL) headers are included, which ensures that everything is resolved within the compilation unit. Cc: Vinson Lee <vlee@freedesktop.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Chih-Wei Huang <cwhuang@android-x86.org> Fixes: `c10dcb2ce8` ("Revert "mesa_glinterop: remove inclusion of GLX header"") Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-10-14 11:08:59 +01:00
Mark Thompson	0b241b7717	st/va: Fix H.264 PicOrderCnt value TopFieldPicOrderCnt is exactly the PicOrderCnt value for a frame - see H.264 section 8.2.1. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:52 +02:00
Mark Thompson	1edaa33135	st/va: Baseline profile is not supported Constrained baseline profile is supported, so use that instead. This matches what the encoder already does (constraint_set1_flag is always set in the output bitstream). Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:48 +02:00
Mark Thompson	e0604eed9f	st/va: Return surface formats depending on config chroma format This makes the supported format actually match the configuration, and allows the user to observe that NV12 is supported for video processing where previously they couldn't (though it did always work if they blindly tried to use it anyway). Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:44 +02:00
Mark Thompson	e7c7ef3625	st/va: Save surface chroma format in config Both YUV420 and RGB32 configurations are supported, so we need to be able to distinguish which is being used. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:40 +02:00
Mark Thompson	8a931c83ba	st/va: Return more useful config attributes The encoder attributes are needed for a user of the encoder to be able to configure it sensibly without internal knowledge. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-14 11:57:25 +02:00
Mario Kleiner	0c94ed0987	glx: Perform check for valid fbconfig against proper X-Screen. Commit `cf804b4455` ('glx: fix crash with bad fbconfig') introduced a check in glXCreateNewContext() if the given config is a valid fbconfig. Unfortunately the check always checks the given config against the fbconfigs of the DefaultScreen(dpy), instead of the actual X-Screen specified in the config config->screen. This leads to failure whenever a GL context is created on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of a multi-x-screen setup, where the default screen is typically 0. Fix this by using config->screen instead of DefaultScreen(dpy). Tested to fix context creation failure on a dual-x-screen setup. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-14 10:11:25 +01:00
Tim Rowley	a42c22fdbf	swr: [rasterizer core] don't construct pArContext on non-ar builds Stops debug directory being created on non-ar builds. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	29d07480b8	swr: [rasterizer core] remove WorkerWaitForThreadEvent bucket Cause of bucket stop capture hang, as threads get stuck in level 1. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	ada27b503e	swr: [rasterizer core] move binner functionality to separate file Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	f0a66c1da2	swr: [rasterizer scripts] add DEBUG_OUTPUT_DIR knob Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	ffd0224303	swr: [rasterizer core] fix comment typo Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	4889922210	swr: [rasterizer core/sim] 8x2 backend + 16-wide tile clear/load/store Work in progress (disabled). USE_8x2_TILE_BACKEND define in knobs.h enables AVX512 code paths (emulated on non-AVX512 HW). Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:14 -05:00
Tim Rowley	bf1f46216c	swr: [rasterizer archrast] fix event file issue with saving data Also, tagging stats with draw id to correlate these events with draw/dispatch events. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 23:39:13 -05:00
Eric Engestrom	827e038062	swr: [rasterizer common] fix assert index Fixes: `b3bd8bb611` ("swr: [rasterizer core] add support for "RAW" surface format") CovID: 1373647 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-13 21:37:20 -05:00
Ilia Mirkin	5f885225cf	docs: mark GL 4.4/4.5 extension groups as DONE for nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:45:21 -04:00
Ilia Mirkin	afb6dc53bf	nv50: enable ARB_enhanced_layouts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:45:21 -04:00
Ilia Mirkin	a6d6eff2e6	nvc0/ir: be more careful about preserving modifiers in SHLADD creation src2 was being given the wrong modifier, and we were not properly managing the modifier on the SHL source either. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-13 21:44:03 -04:00
Brian Paul	3a2869aaca	mesa: fix indentation in vertex_attrib_binding() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	743a526372	mesa: add sanity check assertion in update_array_format At most, one of the normalized, integer, doubles bools can be true. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	d6b0002195	mesa: remove needless cast in update_array() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	74745dcfa4	mesa: simplify update_array() with a vao local var Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	0de9265b1f	vbo: simplify some code in check_draw_elements_data() Use the 'vao' local var in more places. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	15fb88e912	mesa: rename gl_vertex_attrib_array gl_array_attributes The structure contains the attributes of a vertex array. The old name was kind of confusing. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	c89802aeea	mesa: rename gl_vertex_attrib_array::VertexBinding Rename to gl_vertex_attrib_array::BufferBindingIndex because this field is an index into the array of buffer binding points. This makes some code a little easier to follow since there's also a "VertexBinding" field in gl_vertex_array_object. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	c328268b92	mesa: rename some vars in arrayobj.c Use 'vao' instead of 'obj' to be consistent with other code. Plus, add a comment. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2016-10-13 17:38:49 -06:00
Brian Paul	b81546d43c	tgsi: fix comment typo in tgsi_ureg.c Trivial.	2016-10-13 17:38:49 -06:00
Brian Paul	ff00ab745c	mesa: replace gl_framebuffer::_IntegerColor with _IntegerBuffers Use a bitmask to indicate which color buffers are integer-valued, rather than a bool. Also, the old field was mis-computed. If an integer buffer was followed by a non-integer buffer, the _IntegerColor field was wrongly set to false. This fixes the new piglit gl-3.1-mixed-int-float-fbo test. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 17:38:49 -06:00
Brian Paul	a710c21ac2	mesa: remove 'params' parameter from ctx->Driver.TexParameter() None of the drivers which implement this hook do anything with the texture parameter value. Drivers just look at the pname and set a dirty flag if needed. We were doing some ugly casting and type conversion to setup the argument so that all goes away. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 17:38:49 -06:00
Eric Anholt	99d790538d	vc4: Avoid loading from the texture during non-utile-aligned glTexImage(). Previously, the plan was "if the width/height we have to load/store isn't the size the user is planning on writing, then we need to load the old contents out beforehand to prevent writing back undefined". However, when we're doing glTexImage() we often end up aligning the width/height into the padding of the texture, and we don't actually need to read out that padding. Improves x11perf -aatrapezoid100 performance from ~460/sec to ~700/sec.	2016-10-13 14:27:30 -07:00
Axel Davy	0717cd975d	st/nine: Fix possible segfault in surface ctor Regression introduced by `ba0274c7d6` Check the resource exists before assigning it a flag (and use This->base.resource instead of pResource, since the former may have a newly allocated resource, while the latter would be NULL). This should reintroduce the behaviour of previous code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-13 21:16:35 +02:00
Axel Davy	98b8ad61c6	st/nine: Remove useless code in nine_shader Since `1604efa6fd`, lconsti and lconstb don't need to be initialized. Remove some leftovers from the previous code (which now has an invalid use of ARRAY_SIZE on a pointer instead of an array). Reported by Coverity. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-13 21:16:35 +02:00
Axel Davy	197cdd1bbd	gallium/os: Use unsigned integers for size computation Use uint64_t instead of int64_t in the calculation, as the result is uint64_t. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 21:16:35 +02:00
Samuel Pitoiset	4527222169	nvc0: enable ARB_enhanced_layouts All ARB_enhanced_layouts piglit tests pass without any changes in our compiler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-13 21:13:34 +02:00
Dave Airlie	47a7d86fe9	radv: fix the wayland wsi busy bit Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 05:10:02 +10:00
Dave Airlie	a3834ebaf9	anv: fix the wayland wsi busy flag setting Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 05:10:02 +10:00
Tom Stellard	5c66d46d6a	radv: Use new image load/store intrinsic signatures v2 These were changed in LLVM r284024. v2: - Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't support integer types for this intrinsic. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Tom Stellard	30e63fb0e4	radv: Fix incorrect comment Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:48:11 +10:00
Dave Airlie	060e6f468a	radv: fix identity swizzle handling The identity swizzle should operate exactly like an .r = R, .g = G, .b = B, .a = A swizzle. This fixes a bunch of the 16-bit BGRA blit tests dEQP-VK.api.copy_and_blit.blit_image.all_formats.b4g4r4a4* Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:57 +10:00
Dave Airlie	8980ac0411	anv/wsi: fix apps that acquire multiple images up front This fix was found in the radv codebase when running dota2, no idea if anyone has reported it on anv, but the same problem occurs. Once an image is acquired we need to mark it busy. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:11 +10:00
Dave Airlie	8bdac874e6	radv/wsi: fix apps that acquire multiple images up front dota2 does multiple acquires followed by multiple queues, this bug manifested itself as a hang in the xshmfence code randomly when dota2 was doing its menus. It also occurred when running dota2 under phoronix-test-suite. The fix is to mark the image busy once it is acquired, so nobody else can acquire it. We have to trust vulkan apps that they will eventually submit it. Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:11 +10:00
Dave Airlie	dfe74fd1a9	anv: initialise and increment send_sbc At least set this to not be uninitialised memory. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-14 04:45:00 +10:00
Marek Olšák	7dddf0b7ab	radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settings The table was copied from the Vulkan driver. The comment lines are as long as the table for cosmetic reasons. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	e12c1cab5d	radeonsi: disable ReZ This is a serious performance fix. Discovered by luck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	d4d9ec55c5	radeonsi: implement TC-compatible HTILE so that decompress blits aren't needed and depth texturing needs less memory bandwidth. Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16. The format promotion is not visible to state trackers. This is part of TC-compatible renderbuffer compression, which has 3 parts: DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now. I don't see a measurable increase in performance though. (I tested Talos Principle and DiRT: Showdown, the latter is improved by 0.5%, which is almost noise, and it originally used layered Z16, so at least we know that Z16 promoted to Z32F isn't slower now) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	a077185ea9	gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY For performance tuning in drivers. It filters out window system framebuffers and OpenGL renderbuffers. radeonsi will use this to guess whether a depth buffer will be read by a shader. There is no guarantee about what will actually happen. This is a departure from PIPE_BIND flags which are defined to be strict but they are useless in practice. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Nicolai Hähnle	761388a0eb	radeonsi: fix regression in image atomics Caused by a bad rebase when pushing commit `76a940893`.	2016-10-13 16:04:16 +02:00
Nicolai Hähnle	d413fbb159	st/mesa: fix vertex elements setup for doubles Whether one or two slots are taken up by one API array depends on the vertex shader, not on how the array is configured. When an array is set up with fewer components than the shader expects, the high components are undefined. Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:36 +02:00
Nicolai Hähnle	15fc74905b	st/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:33 +02:00
Nicolai Hähnle	1d7685e52c	st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:29 +02:00
Nicolai Hähnle	b234e37765	st/glsl_to_tgsi: simplify translate_tex_offset This fixes a bug with offsets from uniforms which seems to have only been noticed as a crash in piglit's arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag on radeonsi. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-13 15:41:11 +02:00
Nicolai Hähnle	76a940893d	radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.* Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic* Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-13 10:17:42 +02:00
Nicolas Koch	35e2bfa6d9	radv: Return correct result in EnumeratePhysicalDevices If pPhysicalDevices is too small for all physical devices, the driver must return VK_INCOMPLETE. Since only a single physical device is supported, this is only the case when pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-13 09:11:13 +10:00
Ilia Mirkin	e6a693c447	st/mesa: only flip stipple pattern for winsys fbo's Gallium is completely oblivious to whether the fbo is flipped or not. Only flip the stipple pattern when the fbo is flipped as well. Otherwise the driver has no idea when to unflip the pattern. Fixes bin/gl-2.1-polygon-stipple-fs -fbo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-12 17:04:16 -04:00
Emil Velikov	a4622305e6	swr: automake: add ar_eventhandlerfile_h.template to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 18:55:22 +01:00
Emil Velikov	3c419a941a	radv: add all headers to the sources list Otherwise they'll be missing from the tarball and the build will fail. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 18:55:20 +01:00
Ilia Mirkin	a48a343c29	nvc0/ir: fix textureGather with a single offset Recent fix for non-const offsets broke the case of a single offset (vs 4 offsets). The later code relies on the offs array to contain null values to tell whether they should be added onto the srcs list. Fixes: `5239bd592` ("nvc0/ir: fix overwriting of value backing non-constant gather offset") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-12 13:18:14 -04:00
Ilia Mirkin	300b5ad023	nv50/ir: copy over value's register id when resolving merge of a phi The offset needs to be properly copied over to the phi value, otherwise it will get assigned to the base of the merge instead of the proper location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-12 13:18:14 -04:00
Nicolai Hähnle	789119d212	st/mesa: enable ARB_enhanced_layouts and turn the cap on v2: mark llvmpipe & softpipe properly as well (Jason Wood) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	b5b4aa42ba	st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	777dcf81b9	st/glsl_to_tgsi: explicitly track all input and output declaration In order to be able to emit overlapping input and output array declarations, we flip the logic of emitting those declarations on its head: rather than iterating over slots and emitting the corresponding declarations, we iterate over the declarations from GLSL and emit those. v2: fix some regressions related to structs v3: fix a regression in geometry and tessellation shader array handling Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v2) Reviewed-by: Dave Airlie <airlied@redhat.com> (v2)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	2299a9940c	st/glsl_to_tgsi: mark "gaps" in input/output arrays as used In some cases, a shader may have an input/output array but not use some entries in the middle. This happens with eON games, for example. We emit declarations that cover the entire array range even if there are some unused gaps. This patch now reflects that in the InputsRead etc. fields to ensure the various input/outputMapping arrays are actually correct, which will be important when we re-jiggle the way declarations are emitted. v2: fix a typo (Edward O'Callaghan) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	63193b9cde	st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations This optimization is incorrect with 64-bit operations, because the channel-splitting logic in emit_asm ends up being applied twice to the source operands. A lucky coincidence of how the writemask test works resulted in this optimization basically never being applied anyway. As far as I can tell, the only case where it would (incorrectly) have been applied is something like dvec2 d; float x = (float)d.y; which nobody seems to have ever done. But the moral equivalent does occur in one of the component layout piglit tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	f5f3cadca3	st/glsl_to_tgsi: simpler fixup of empty writemasks Empty writemasks mean "copy everything", so we can always just use the number of vector elements (which uses the GLSL meaning here, i.e. each double is a single element/writemask bit). Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	957d541089	st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	14aaaa1b4b	glsl: dump explicit location when printing IR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	2b460c750a	tgsi/ureg: add ureg_DECL_output_layout For specifying an exact location/component. v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	047a7c7a0b	tgsi/ureg: add layout/component input declarations v2: change the order of parameters (Dave) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	f9a01f3872	tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays v2: remove a tautological left-over assert (Marek) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1) Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2016-10-12 18:50:10 +02:00
Nicolai Hähnle	700a571f89	gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS This is a screen cap because drivers are expected to support it either for all shader types or for none of them. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-10-12 18:50:10 +02:00
Tom Stellard	b33cb709fd	radeonsi: Use the new image load/store intrinsic signatures This patch requires LLVM r284024 or newer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:43 +00:00
Tom Stellard	ff0df66e10	radeonsi: Add function for converting LLVM type to intrinsic string The existing function only worked for integer types. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:07 +00:00
Tom Stellard	a96a7eae04	radeonsi: Refactor image store/load intrinsic name creation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 16:42:07 +00:00
Marek Olšák	d7e74b52bb	winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	e4bbab9022	radeonsi: fix R600_DEBUG=precompile for shader-db radeonsi no longer supports pixel shaders without interpolation optimizations, which led to assertion failures in si_shader_ps when running shader-db. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	40e1f7e09b	radeonsi: use TC write-back instead of full cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	8cdce30cc2	radeonsi: implement TC L2 write-back (flush) without cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	65a4d55a9f	radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Samuel Pitoiset	87b06cab14	nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c) total instructions in shared programs :2286901 -> 2284473 (-0.11%) total gprs used in shared programs :335256 -> 335273 (0.01%) total local used in shared programs :31968 -> 31968 (0.00%) local gpr inst bytes helped 0 41 852 852 hurt 0 44 23 23 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-12 17:46:03 +02:00
Nicolai Hähnle	85ba409967	mapi: fix out-of-tree build dependencies We shouldn't be using wildcard here in the first place, but changing that is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only contains mapi_abi.py when building outside the Mesa tree. As a result, only some of the tables were updated when XML files change, but not the tables for shared glapi. This change ensures that we pick up the XML files and scripts from the source tree as dependencies also for shared glapi. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-12 17:36:35 +02:00
Roland Scheidegger	7e86b2ddae	draw: initialize shader inputs This should make the code more robust if a shader tries to use inputs which aren't defined by the vertex element layout (which usually shouldn't happen). No piglit change. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-10-12 15:05:44 +02:00
Edward O'Callaghan	cfbf956dfd	radv: trivial case stmt style fixups Relocate a 'default:' to the end of a case stmt and fix an indent issue. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-10-12 20:12:43 +11:00
Nicolas Koch	fd27d5fd92	anv: Return correct result in EnumeratePhysicalDevices If pPhysicalDevices is too small for all physical devices, the driver must return VK_INCOMPLETE. Since only a single physical device is supported, this is only the case when pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:58:27 -07:00
Kenneth Graunke	2871d4d687	anv: Allow vp_info to be NULL in 3DSTATE_CLIP code. pViewportState may be NULL if rasterization is disabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:50:19 -07:00
Kenneth Graunke	ba38a9d380	anv: Fix anv_pipeline_validate_create_info assertions. Many of these can be "NULL if the pipeline has rasterization disabled." Also, we should assert that pMultisampleState exists. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 22:50:09 -07:00
Ilia Mirkin	389d6dedbe	trace: add invalidate_resource callback Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-11 20:47:54 -04:00
Gustaw Smolarczyk	c3f3c6b0e8	radv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2) It's supposed to be how much at least we want to grow the cs, not the minimum size of the cs after growth. v2: Unbreak use_ib_bos. Don't mask the ib_size when !use_ib_bos, since it's not needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:06:30 +10:00
Grigori Goronzy	a22b5f28fb	radv: fix strict aliasing violation Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Grigori Goronzy	0b539abcf4	radv: fix uninitialized variables This gets rid of "may be used uninitialized" compiler warnings. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Grigori Goronzy	7ca44f8a33	radv: add missing unreachable Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 09:00:22 +10:00
Dave Airlie	8cc9f89d26	radv: remove the validation layer and some related bits. As pointed out by Emil this isn't used in anv anymore, and it was totally unused in radv anyways. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:57:09 +10:00
Dave Airlie	014ec78fb2	radv: drop entrypoint split out. radv really doesn't need different dispatch per gen yet, there really isn't that many differences yet. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	12301c5418	radv: drop the RADV_CALL macro. This is leftover from anv, and we really never needed it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	fc28f89157	radv: check driver name before calling amdgpu. This checks the kernel driver name is amdgpu before calling libdrm_amdgpu. This avoids the following error: amdgpu_device_initialize: DRM version is 1.6.0 but this driver is only compatible with 3.x.x when run on a machine with i915 graphics as well as amdgpu. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:56:41 +10:00
Dave Airlie	6215b47648	radv: fix memory leak from physical device if wsi fails Inspired by patch from Edward O'Callaghan <funfunctor@folklore1984.net> which didn't do it right. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:53:44 +10:00
Edward O'Callaghan	e0641c61ca	radv/winsys: Fix mem leak at failed do_winsys_init() call site Probably unlikely however ensure we don't leak a heap allocation on the fail path. V.2: also fix missing 'amdgpu_device_deinitialize()' calls (Emil Velikov). Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:46:10 +10:00
Edward O'Callaghan	4a0db58f14	radv/winsys: Trivial style and readability fixups Drop/add a few newlines where appropriate and drop a couple of unnecessary braces. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-12 08:24:50 +10:00
Marek Olšák	b425b57d1e	radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-11 20:04:57 +02:00
Tim Rowley	9db9c61d26	swr: [rasterizer archrast] update proto file Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	3805e40f32	swr: [rasterizer archrast] add support for stats files Only stat and counter events are saved to the event files. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	f4684cdb5f	swr: [rasterizer jitter] remove architecture override Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:23 -05:00
Tim Rowley	185a531206	swr: [rasterizer jitter] adjust jitmanager assert Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:48:17 -05:00
Tim Rowley	eaec263427	swr: [rasterizer] eliminate unused label warnings on gcc Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	12e6f4c879	swr: [rasterizer core] implement depth bounds test Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	1b86c050ad	swr: [rasterizer core] update/add formats Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	a907b7a5f7	swr: [rasterizer core] SwrStoreTiles api change SwrStoreTiles now takes a mask of surfaces to store. Reduces overhead when storing multiple render targets. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	5d5179a6c2	swr: [rasterizer scripts] add ENABLE_ASSERT_DIALOGS knob for windows Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	07326d4006	swr: [rasterizer archrast] add mako template Add template for generating code to save events to a file. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	e845eeb0be	swr: [rasterizer core] disable cull for rect_list Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	b3bd8bb611	swr: [rasterizer core] add support for "RAW" surface format Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	2966d9c691	swr: [rasterizer core] align Macrotile FIFO memory to SIMD size Align and use streaming store instructions for BE fifo queues. Provides slightly faster enqueue and doesn't pollute the caches. Add appropriate memory fences to ensure streaming writes are globally visible. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	6b3691c876	swr: [rasterizer common] remove threadviz code Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Tim Rowley	2550b04179	swr: [rasterizer memory] split load/store for compile speed Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-11 11:22:04 -05:00
Eric Engestrom	0a606a400f	egl: add eglSwapBuffersWithDamageKHR EGL_KHR_swap_buffers_with_damage is actually already supported, as it is technically nothing but a rename of EGL_EXT_swap_buffers_with_damage. To that effect, both extension are advertised depending on the same condition, and the new entrypoint simply redirects to the previous one. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 14:04:26 +01:00
Mauro Rossi	b9e639589d	intel/genxml: fix building rules for aubinator required headers New generated headers were introduced by commit `63a366a` "intel: aubinator: generate a standalone binary" Android does not need aubinator yet, so in order to avoid building error, aubinator required new genxml headers are defined in a separate list. If required, building rules for Android will be added later. [Emil Velikov: don't use a _HEADERS variable name (causes warnings)] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:53:19 +01:00
Emil Velikov	0b54c022a8	radv: automake: move libamdgpu_addrlib.la to VULKAN_LIB_DEPS The static library is analogous to the intel ISL, which is required for both hardware and (to be added) testing library. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:09 +01:00
Emil Velikov	4882476eca	radv: automake: remove unused variables Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:08 +01:00
Emil Velikov	e2cb253346	radv: automake: include the python scripts/formats table in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-11 13:51:06 +01:00
Tapani Pälli	fc8b358bd6	mesa: fix error handling in _mesa_TransformFeedbackVaryings Patch changes function to use _mesa_lookup_shader_program_err both in TransformFeedbackVaryings and GetTransformFeedbackVarying that handles errors correctly for invalid values of shader program. Fixes following dEQP test: dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98135 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-11 07:44:33 +03:00
Xu,Randy	d11a63d6e6	i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP. Add the miptree level/slice x/y_offset when computing the surface offset in brw_emit_surface_state. The surface offset has two parts: one is from mt->offset, which should be 32 aligned in width/height for tiled buffers; the other is from mt->level[current_level].slice[current_slice].x/y_offset. This fix solves 12 dEQP failures: dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture Signed-off-by: Xu,Randy <randy.xu@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-11 07:44:18 +03:00
Nicholas Bishop	64435fd888	i915g: fix incorrect gl_FragCoord value On Intel Pineview M hardware, the i915 gallium driver doesn't output the correct gl_FragCoord. It seems to always have an X coord of 0.0 and a Y coord of the window's height in pixels, e.g. 600.0f or such. I believe this is a regression caused in part by this commit: `afa035031f` The old behavior used the output at index zero, while the new behavior uses actual zeroes. In the case of gl_FragCoord the output at index zero happened to be the correct one, so the behavior appeared correct although the code already had a bug. Fixed by checking for I915_SEMANTIC_POS when setting up texCoords. If the generic_mapping is I915_SEMANTIC_POS, look for the TGSI_SEMANTIC_POSITION instead of a TGSI_SEMANTIC_GENERIC output. https://bugs.freedesktop.org/show_bug.cgi?id=97477 Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Tested-by: Stéphane Marchesin <marcheu@chromium.org>	2016-10-10 18:32:36 -07:00
Vinson Lee	c10dcb2ce8	Revert "mesa_glinterop: remove inclusion of GLX header" This reverts commit `8472045b16`. Conflicts: include/GL/mesa_glinterop.h This patch fixes this build error with GCC 4.4. Compiling src/glx/dri_common_interop.c ... In file included from src/glx/dri_common_interop.c:33: include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’ include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here Fixes: `8472045b16` ("mesa_glinterop: remove inclusion of GLX header") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2016-10-10 15:09:44 -07:00
Axel Davy	eef0744d43	st/nine: More checks for GetRenderTargetData Fixes a wine test crash Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	a52e700169	st/nine: Add debug output for lost devices Add debug output to ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	5d85253dc3	st/nine: Prevent crash in GetRenderTargetData Return error instead of crashing on source surfaces with format D3DFMT_NULL. Fix for issue #236. Tested on Windows 7. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	09edc0555f	st/nine: Set CLAMP_TO_EDGE on cubetextures Wine tests show that cubetextures always use PIPE_TEX_WRAP_CLAMP_TO_EDGE regardless of set sampler states. Fixes failing d3d9 wine test test_cube_wrap. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	fa2574497b	st/nine: handle possible failure of D3DWindowBuffer_create Check for errors and pass them to the callers. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f04fa0a62c	st/nine: Assert on buffer creation failure Add an assert to make sure buffer creation doesn't fail. Add error handling in calling functions. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f8c01e7a96	st/nine: Use NineDevice9_CreateDepthStencilSurface in swapchain9 Replace custom code with NineDevice9_CreateDepthStencilSurface. All functionality is given now.	2016-10-10 23:43:51 +02:00
Axel Davy	63367e6c95	st/nine: Fix check and remove useless code in swapchain9 The removed code was there for two reasons: 1) Allow DF16, DF24, INTZ to be used as depth buffer for swapchain, if the driver doesn't support PIPE_BIND_SAMPLER_VIEW for the underlying format 2) Set PIPE_BIND_SAMPLER_VIEW if possible, such that if StretchRect is called on the depth texture, it is happy. 1) The reason these formats needed a workaround is because the check flags for them in CheckDeviceFormat were incorrect, which led applications to think the formats were valid for swapchains, even if they weren't supported. 2) StretchRect limitations for depth buffers force the resource_copy_region path, which should be fine without PIPE_BIND_SAMPLER_VIEW. Thus fix the check for 1), and remove the code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	60624be203	st/nine: Implement MSAA quality levels Advertise quality levels: Each supported multisample count maps to one quality level. The application doesn't know how many samples each quality level has. For that reason it's not possible to set the multisample mask. Return errors on quality level mismatch. Fixes several old games not having multisample support until now. Fix for issue #73. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	8a50b1244f	st/nine: Prepare update_framebuffer for MS quality levels Compare resource's nr_samples instead of D3D multisample level. Required for multisample quality levels to work correctly. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	b560305687	st/nine: Add additional error handling in CheckDeviceMultiSampleType Return one supported quality level in error cases. Return error on invalid multisample count. Fixes failing wine tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	7afab4ad39	st/nine: Fix compiler warning Use strict aliasing in SetPrivateData and struct pheader. Casting char[1] to IUnknown** isn't allowed in strict aliasing. Compute pointer to body by adding size of header to header pointer. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	b9f31111ac	st/nine: Remove resource9 {Set/Get/Free}PrivateData functions Remove {Set/Get/Free}PrivateData in resource9. Functionality has been implemented in IUnknown interface. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	03888e8a46	st/nine: Remove volume9 {Set/Get/Free}PrivateData functions Remove {Set/Get/Free}PrivateData in volume9. Functionality has been implemented in IUnknown interface. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	485cba7eb4	st/nine: Switch {Set/Get/Free}PrivateData functions Switch {Set/Get/Free}PrivateData function to introduced IUnknown functions. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	4117f5e1ab	st/nine: Implement {Set/Get/Free}PrivateData in iunknown Implement {Set/Get/Free}PrivateData in iunknown to get rid of duplicated code in resource9 and volume9. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	c1c8e852c1	st/nine: Return device in NineSurface9_GetContainer According to MSDN the device is returned for surfaces that do not have a regular container. Such surfaces are: OffscreenPlainSurface, DepthStencilSurface and RenderTarget Tested and verified on Windows. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	ba0274c7d6	st/nine: Allocate surface resources in surface ctor Allocate resources in surface ctor. Allows to use statetracker internal memory accounting. Fix for issue #231. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Axel Davy	1f65f67b21	st/nine: Fix D3DFMT_NULL size D3DFMT_NULL is mapped to PIPE_FORMAT_NONE. Instead of relying on PIPE_FORMAT_NONE to return a size, pick one. The one picked is the same as in Wine. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	9dc792b95b	st/nine: Add debugging output Add DBG calls to NineTexture9_GetLevelDesc and NineTexture9_GetSurfaceLevel to ease debugging. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	8ceb2264c5	st/nine: Fix assert in NineUnknown_QueryInterface Tests showed that it is allowed to call this method on objects that have a zero refcount. Required for issue #230. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:51 +02:00
Patrick Rudolph	f2eacef33d	st/nine: Print interface id in NineVolume9_GetContainer To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	489dbc51ae	st/nine: Print interface id in NineSurface9_GetContainer To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	e63a38832b	st/nine: Print interface id in NineUnknown_QueryInterface To ease debugging print interface id. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Patrick Rudolph	6a1cce20b6	st/nine: Move assert in NineSurface9_ctor Move assert to function entry. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	851e4b8d8a	st/nine: Properly declare sampler states for ff Fixes a softpipe assertion failure with wine tests Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	5ce23c1689	st/nine: Handle user clipping planes properly for ff Found reading msdn and checking Wine. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	d2fd296648	st/nine: Fix the calculation of the number of vs inputs Fixes hangs on radeonsi, and assert on llvmpipe. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-10 23:43:50 +02:00
Axel Davy	71e7292a85	st/nine: Fix specular w coordinate Found looking at Wine formulas. Fixes a few visual issues. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	732cea09cd	st/nine: Disable parts of lighting calculation if no normal provided Behaviour found in Wine sources, and checked with some test apps. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	fc9bb19dce	st/nine: Fix condition for specular lighting Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	c56c7c1fc8	st/nine: Do always accumulate diffuse According to spec. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	c5bce80f50	st/nine: Initialize ps ff registers Found with wine tests for the rTmp register. Not sure for the other ones. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	4ed3d5ee57	st/nine: Do not pollute rTmp in ff ps Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	d9b8b3196e	st/nine: Allocate temporaries on demand for ps ff Same change as for vs ff. This makes it easier to avoid mistakes when reusing temporaries whose result shouldn't be erased. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	f7dd27aed3	st/nine: Fix texbem Error found with wine tests. nine_shader was expecting another order than the one device9 was using. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	7afcbb49ba	st/nine: Fix ff computation for inverse Thanks to wine tests. Apparently 4x4 inverse is to be used, and if the inverse can't be calculated, the input matrix is to be used. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	36399f9a7f	st/nine: Used normed Vtx for reflectionvector Fix deduced from the spec. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	eda1e6ece7	st/nine: Implement SPHEREMAP Behaviour checked with a test app. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	a3ddc80ec8	st/nine: Enable passthrough only if positiont is used Wine tests for the passthrough feature are for positiont. Nothing seems to indicate passthrough happens when positiont is not used. However having passthrough with positiont makes sense (to be used with ProcessVertices outputs). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	0b5bed774b	st/nine: Fix wrong mask in ff vs Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	028dab95f6	st/nine: Fix tweening factor computation The computation was reversed. Deduced by tests on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1fe055338d	st/nine: Disable ff vertex blending if required inputs are missing This behaviour has been partially tested on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	aa69bb6848	st/nine: Use materials if source is not given. Deduced by test on windows. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	ab068a78d3	st/nine: Fix ff SPECULARENABLE We were (wrongly) adding specular to diffuse in vertex shaders when SPECULARENABLE was set. However the spec says specular has to be added after texture processing (which is in ps). Besides SPECULARENABLE is flagged as a pixel state. There was unused support for SPECULARENABLE in the ps ff code. Remove the vs code, and use the ps code. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1d7890a441	st/nine: Undefined specular should be full of zeros Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	d9330f9348	st/nine: Implement normal transformation with vertex blending The formula is different from the one of the spec, but otherwise nothing particular. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	305e8106ab	st/nine: Increase MaxVertexBlendMatrixIndex Modern cards do advertise 8. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	567be40de9	st/nine: Compact ff vs constants a bit There are several holes. This patch reduces the holes a bit, which reduces the size of the constant buffer uploaded. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	07d1f32e0f	st/nine: Fix vertex blending aVtx computation There was a multiplication by the world matrix 0 which had nothing to do there. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	d9d8cb9f19	st/nine: Reorganize ff vtx processing The new order simplified the code a bit for next patches. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	cde74cba71	st/nine: Small simplification for position_t and fog position_t disables fog computation. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	5d2a8e8a36	st/nine: Cleaning code for vs temporaries This has been a real mess up to now: the temporaries were allocated once, and shared after that between the different parts of the code. To help maintaining the code, the temporaries are now allocated and released on need. As surprising as it could be, this patch, which was supposed to introduce no behaviour change, actually solved a visual bug observed on a sample program. This was due to ureg_normalize3 polluting a temporary variable. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	1f18b6f351	st/nine: No need for the local flag for temporaries in ff Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	eb9ad8f969	st/nine: Handle D3DRS_NORMALIZENORMALS When this state is set, the normals computed in the vs ff shader should be normalized. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:50 +02:00
Axel Davy	b9639c661f	st/nine: Initial ProcessVertices support For now only VS 3 support is implemented. This enables The Sims 2 to work. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:50 +02:00
Axel Davy	3bf02d383f	st/nine: Partial software vertex processing support Software Vertex Processing allows: . Less limitations for shaders (more loops, etc) . Less limitations for ff (more enabled lights, 255 matrices for VertexBlend) In particular shaders can get more constants. This patch implements support for this (not using software rendering, but hardware rendering, as llvmpipe and dx10+ hw have the same limits...) This is considered a second class path. Even apps asking for "Mixed Vertex processing" (ie the ability to switch to swvp on demand) do not use the feature much. Some just initialize more constants than the normal limit at the start of the application, but never use more than the normal limit. When the apps do not need the software vertex processing features, they do not seem to turn it on. This means it is ok if that path is slow. Thus no care has been made to make the path optimized. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f8c8f44244	st/nine: Rework vs int and bool constants buffer This will help to support swvp constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	a83dce0128	st/nine: Change dirty tracking for vs int and bool constants This change makes it easier to introduce tracking for swvp constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f78089b962	st/nine: Drop unused constant upload path This path has been disabled for some time because of some bugs with it. It hasn't been updated to the new features, and is not faster. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:49 +02:00
Axel Davy	1604efa6fd	st/nine: Add support for swvp constants in shaders swvp has relaxed limits (more nested loops, etc). In particular it enables more constants. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	56ea3df7d4	st/nine: Initial mixed vertex processing support In mixed vertex processing, the user can enable or disable software vertex processing. It is on hardware by default. This feature is not a state, and thus the setting doesn't need to be recorded by stateblocks. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	747f1ef8b6	st/nine: Implement SetNPatchMode Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	ded7a73eb3	st/nine: Implement D3DUSAGE_SOFTWAREPROCESSING Buffers with this flag must be usable with both software and hardware vertex processing. Use Staging for fast cpu access. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	19703f2a36	st/nine: Allocate more space for ATI1 ATIx are "unknown" formats that do not follow block format conventions. Tests showed that pitch*height bytes are allocated. apitrace used to depend on this behaviour. It used to copy more bytes than it has to for the ATI1 block format, but it didn't crash on Windows. Increase buffersize for ATI1 to fix this crash. The same issue was present in WINE but a patch has been sent by me. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	ec6c636722	st/nine: Add missing break Add missing break instruction. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	03f60a3357	st/nine: Implement relative addressing for ps inputs To implement the feature we copy the ps inputs to a temp array. This is not optimal for performance, but it is the simplest solution. This is a feature that is very very rarely used. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	a5d308e51a	st/nine: Wait for pending tasks to execute in swapchain Fixes crash after Reset() when using thread_submit=true Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f090705075	st/nine: Use fixed size arrays for swapchain buffers Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Patrick Rudolph	a719800cb8	st/nine: Fix buffer count check for Ex devices Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	9ff0dc3129	st/nine: Disable seamless cubemap for d3d d3d9 doesn't have seamless cubemap. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	f0ec54ee32	st/nine: Fix some check flags Uses the new defines introduced in previous commit. See comment in the commit for more explanation. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:49 +02:00
Axel Davy	39e98d351f	st/nine: Unify some check flags The new defines will be reused in a later patch. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-10-10 23:43:48 +02:00
Axel Davy	2290eac84e	gallium/util: Really allow aliasing of dst for u_box_union_* Gallium nine relies on aliasing to work with this function. Without this patch, dirty region tracking was incorrect, which could lead to incorrect textures or vertex buffers. Fixes several game bugs with nine. Fixes https://github.com/iXit/Mesa-3D/issues/234 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-10 23:43:48 +02:00
Axel Davy	5e7f0ebe29	softpipe: Cap to 2 GB on 32 bits On 32 bits system, application memory is quite limited. softpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2GB. Some gallium nine apps do check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-10 23:43:48 +02:00
Axel Davy	814ca96d0d	llvmpipe: Cap to 2 GB on 32 bits On 32 bits system, application memory is quite limited. llvmpipe uses application memory. To help prevent memory exhaustion, limit reported memory availability to 2GB. Some gallium nine apps do check reported memory by allocating resources until memory is full. Gallium nine refuses allocations when 80% of the reported memory limit is used. This change helps some apps to start. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-10 23:43:48 +02:00
Axel Davy	218459771a	gallium/os: Fix overflow on 32 bits On systems with more than 4GB of ram, os_get_total_physical_memory was triggering an integer overflow for the linux and haiku path, when on 32 bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94561 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 23:43:48 +02:00
Axel Davy	9904581dc6	st/nine: Memset pipe_resource templates Fixes regression introduced by `ecd6fce261` and is more future proof than just clearing the next field. Other nine usages did already zero out the templates. Signed-off-by: Axel Davy <axel.davy@ens.fr> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-10 23:43:48 +02:00
Samuel Pitoiset	d43151318a	nvc0: fix valid range for shader buffers When offset != 0, the valid range was wrong because the second argument of util_range_add() is end, not size. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-10 21:32:16 +02:00
Ilia Mirkin	5239bd5920	nvc0/ir: fix overwriting of value backing non-constant gather offset Normally the value is an immediate, which is moved to some temporary, so there's no problem. In the case of a non-constant offset (as allowed by ARB_gpu_shader5), we have to take care to copy it first before using it to build up the bits. This fixes a compilation error observed in F1 2015. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-10 14:28:32 -04:00
Vinson Lee	0a898ec28b	glsl: Add missing cache_destroy stub function. CC glsl/tests/cache_test.o glsl/tests/cache_test.c: In function ‘test_cache_create’: glsl/tests/cache_test.c:160:4: error: implicit declaration of function ‘cache_destroy’ [-Werror=implicit-function-declaration] cache_destroy(cache); ^ Fixes: `87ab26b2ab` ("glsl: Add initial functions to implement an on-disk cache") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-10 11:17:31 -07:00
Anuj Phogat	f8f6f60a36	docs: Mark GL_OES_viewport_array done on i965 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2016-10-10 10:48:38 -07:00
Chad Versace	8044885182	egl: Unify the EGLint/EGLAttrib paths in eglCreateSync* (v3) Pre-patch, there were two code paths for parsing EGLSync attribute lists: one path for old-style EGLint lists, used by eglCreateSyncKHR, and another for new-style EGLAttrib lists, used by eglCreateSync (1.5) and eglCreateSync64 (EGL_KHR_cl_event2). There were two attrib_list parsing functions, _eglParseSyncAttribList(_EGLSync sync, const EGLint attrib_list) _eglParseSyncAttribList64(_EGLSync sync, const EGLattrib attrib_list) This patch unifies the two attrib_list parsing functions into one, _eglParseSyncAttribList(_EGLSync sync, const EGLattrib attrib_list) Many internal EGLSync function signatures had two attrib_list parameters to accommodate both code paths: one parameter was an EGLint list and the other an EGLAttrib list. At most one of the parameters was allowed to be non-null. This patch removes the `EGLint attrib_list` parameter, leaving only the `EGLAttrib attrib_list` parameter, for all internal EGLSync functions. v2: - Consistently use condition (sizeof(int_list[0]) == sizeof(attrib_list[0])). [for emil] v3: - Don't double-unlock the display in eglCreateSyncKHR. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2)	2016-10-10 09:54:11 -07:00
Eric Anholt	0f99c0686e	intel: Fix bash-specific redirection. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-10 09:50:05 -07:00
Eric Anholt	ec9ed1c4d8	gallium: Fix install-gallium-links.mk on non-bash /bin/sh Debian uses dash by default, which doesn't do '+='. Fixes servo's osmesa-based headless testing system, which was looking for libOSMesa in the lib/ directory. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-10-10 08:56:12 -07:00
Ilia Mirkin	ec05331a7b	nv50/ir: only stick one preret per function A function with multiple returns would have had multiple preret settings at the top of the function. While this is unlikely to have caused issues since we don't use functions in earnest, it could have in some cases overflowed the call stack, in case a function had a lot of early returns. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-10 10:45:06 -04:00
Nicolai Hähnle	1f95121626	radeonsi: make more use of si_have_tgsi_compute Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:38:33 +02:00
Nicolai Hähnle	38cfd5160a	gallium/radeon: assign a name to LLVM output variables in debug builds This can be helpful with R600_DEBUG=preoptir. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:38:30 +02:00
Nicolai Hähnle	39a29c2431	gallium/radeon: avoid redundant work with overlapping in/out arrays Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:37:50 +02:00
Nicolai Hähnle	77c81164bc	radeonsi: support ARB_compute_variable_group_size Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-10 10:36:42 +02:00
Lionel Landwerlin	014bd4acb8	anv: turn on samplerAnisotropy in VkPhysicalDeviceFeatures According to the Vulkan spec 5.63.4 : samplerAnisotropy indicates whether anisotropic filtering is supported. If this feature is not enabled, the maxAnisotropy member of the VkSamplerCreateInfo structure must be 1.0. Since we already set maxAnisotropy to 16 and program the hardware according to the VkSamplerCreateInfo.maxAnisotropy, it seems we can turn this on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-10 09:25:38 +01:00
Edward O'Callaghan	ba43768a1e	radv: Use proper header guards over 'pragma once' directives Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-10 16:10:56 +11:00
Tapani Pälli	2d7e0f35c5	mesa: throw error if bufSize negative in GetSynciv on OpenGL ES Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.state.get_synciv dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_synciv dEQP-GLES31.functional.debug.negative_coverage.log.state.get_synciv v2: drop _mesa_is_gles check (Kenneth) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98133 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Tapani Pälli	d997d5c0c9	glsl: prohibit lowp, mediump precision on atomic_uint Fixes following dEQP tests: dEQP-GLES31.functional.debug.negative_coverage.callbacks.atomic_counter.atomic_precision dEQP-GLES31.functional.debug.negative_coverage.get_error.atomic_counter.atomic_precision dEQP-GLES31.functional.debug.negative_coverage.log.atomic_counter.atomic_precision Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98131 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Tapani Pälli	c64093e7d5	glsl: optimize copy_propagation_elements pass Changes make copy_propagation_elements pass faster, reducing link time spent in test case of bug 94477. Does not fix the actual issue but brings down the total time. No regressions seen in CI. v2 (idr): Formatting / whitespace fixes. Embed the acp_ref in the acp_entry. v3 (idr): Delete unused copy constructor. Use while(pop_head) instead of foreach() { remove }. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-10 07:29:31 +03:00
Dave Airlie	db5d278541	radv: don't build without SHA1. Just copy the section from anv above this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98167 Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-10 10:08:47 +10:00
Edward O'Callaghan	185be15d9d	docs/features.txt: Add GL_KHR_robustness supported on ES 3.2 Both radeonsi and nvc0 should also support ES so fixup doc. Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-09 01:06:38 +11:00
Lionel Landwerlin	4682abdaa8	intel: aubinator: enable loading dumps from standard input In conjunction with an intel_aubdump change, you can now look at your application's output like this : $ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app v2: Add print_help() comment about standard input handling (Eero) Remove shrunk gtt space debug workaround (Eero) v3: Use realloc rather than memcpy/free (Ben) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>	2016-10-08 02:18:47 +01:00
Lionel Landwerlin	619c8de522	intel: aubinator: enable loading xml files from a given directory This might be useful for people who debug with out of tree descriptions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>	2016-10-08 02:17:35 +01:00
Lionel Landwerlin	63a366a881	intel: aubinator: generate a standalone binary Embed the xml files into the binary, so aubinator can be used from any location. v2: Split generation packing into another patch (Jason) Check for xxd (Jason) v3: Fix out of tree builds (Jason) Generate custom variable name rather than names generated by xxd (Lionel) v4: Move generated _xml.h files to genxml/ (Sirisha) v5: Remove newline from makefile (Jason) v6: Add comment on gen*_xml.h creation (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-08 02:17:03 +01:00
Nanley Chery	4d7d9825f3	anv/TODO: Update the HiZ task Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Nanley Chery	d8aacc24cc	anv: Enable fast depth clears Provides an FPS increase of ~30% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Chad Versace	78d074b87a	anv/cmd_buffer: Enable rendering to HiZ Nanley Chery: (rebase) - Resolve conflicts with new anv_batch_emit macro (amend) - Handle a QPitch TODO - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems - Only use HiZ for single-subpass renderpasses - Emit the HiZ instruction before the stencil instruction to follow the optimized clear sequence specified in the PRMs - Don't modify clear params - Enable resolves when a HiZ buffer is used to ensure depth buffer validity Provides an FPS increase of ~15% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Nanley Chery	134d181be1	anv/cmd_buffer: Add code for performing HZ operations Create a function that performs one of three HiZ operations - depth/stencil clears, HiZ resolve, and depth resolves. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Jason Ekstrand	9919a2d34d	anv/image: Memset hiz surfaces to 0 when binding memory Nanley Chery (amend): - Change memset value from 0xff to 0 (a defined value for HiZ). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Jason Ekstrand	b4bbabf21b	anv: Move BindImageMemory to anv_image.c Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Chad Versace	917814dccd	anv: Allocate hiz surface Nanley Chery: (rebase) - Use isl_surf_get_hiz_surf() (amend) - Only add a HiZ surface onto a depth/stencil attachment - Add comment above HiZ surface addition - Hide HiZ behind INTEL_VK_HIZ prior to BDW - Disable HiZ for untested cases - Remove DISABLE_AUX_BIT instead of preventing it from being added Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Chad Versace	3aec432ed3	anv: Add func anv_image_has_hiz() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Chad Versace	fe40d026a1	anv: Add anv_image::hiz_surface Unused. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Nanley Chery	814fa12379	isl: Correct a comment in the isl_format enum HiZ is not a color surface, but an auxiliary depth surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:17 -07:00
Rob Clark	495ba8884a	gallium: add missing zero-init for resource templates Mostly test code, plus one spot I noticed in r600. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 15:50:46 -04:00
Rob Clark	3ebfc44b42	freedreno: don't try to shadow layered textures We will only hit this with multi-planar YUV external images, so we would probably never hit this code path in the first place. But if we did, it wouldn't do the right thing so just bail. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-10-07 15:50:46 -04:00
Rob Clark	f88f025e8c	freedreno/a3xx+a4xx: fix clip-plane lowering state If enabled clip-planes have changed, we need to mark program state dirty. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-10-07 15:50:46 -04:00
Ian Romanick	f546b41f6a	glsl: Let cache_test build when the shader cache is not enabled Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Aaron Watry <awatry@gmail.com>	2016-10-07 11:19:37 -07:00
Lionel Landwerlin	eb23de6116	anv: pipeline cache: fix return value of vkGetPipelineCacheData According to the spec - 9.6. Pipeline Cache : If pDataSize is less than the maximum size that can be retrieved by the pipeline cache, at most pDataSize bytes will be written to pData, and vkGetPipelineCacheData will return VK_INCOMPLETE. Fixes the following test from Vulkan CTS : dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_fragment_stage dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_geometry_stage_fragment_stage dEQP-VK.pipeline.cache.misc_tests.invalid_size_test Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 18:46:12 +01:00
Timothy Arceri	965ebc8b28	util: remove unused variable Also initialise page at declaration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 21:24:50 +11:00
Martin Peres	a599b1c203	loader/dri3: import prime buffers in the currently-bound screen This tries to mirror the codepath taken by DRI2 in IntelSetTexBuffer2() and fixes many applications when using DRI3: - Totem with libva on hw-accelerated decoding - obs-studio, using Window Capture (Xcomposite) as a Source - gstreamer with VAAPI v2: - introduce get_dri_screen() in the dri3 loader's vtable (krh) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Tested-by: Ionut Biru <biru.ionut@gmail.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2016-10-07 11:11:55 +03:00
Martin Peres	0247e5ee3e	loader/dri3: add get_dri_screen() to the vtable This allows querying the current active screen from the loader's common code. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2016-10-07 11:11:44 +03:00
Jason Ekstrand	82b4f1c47b	anv/entrypoints: Save off the entire devinfo rather than a pointer Since the gen_device_info structs are no longer just constant memory, a pointer to one is not a pointer to something in the .data section so we shouldn't be storing it in a static variable. Instead, we should just store the entire device_info structure. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 21:13:52 -07:00
Dave Airlie	85a47f647e	radv: drop all uint for unsigned. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 12:09:13 +10:00
Eric Anholt	20d91e5ce9	vc4: Don't worry about partial Z/S clear if the other is already cleared. We have to be careful to not smash the value they're clearing to, but other than that we're fine. Avoids quad clears in Processing, which likes to do glClear(Z\|S); glClear(Z). Improves performance of Processing's QuadRendering demo at 5000 quads by 5.46507% +/- 1.35576% (n=15 before, 32 after)	2016-10-06 18:29:16 -07:00
Eric Anholt	cb328123fe	vc4: Try to fix the HW-2116 workaround. We were incrementing the count at the end of vc4_start_draw(), except that that function returns immediately if we've already started drawing on this batch. It also failed to count the statechanges from the GFXH-515 workaround. This incidentally allows repeated glClear() to be coalesced, because the fast clears aren't counted in draw_calls_queued any more. Fixes most of the extra flushes in Processing, which emits glClear(Z\|S); glClear(Z); glClear(C) during its frame setup. Improves performance of Processing's QuadRendering demo at 5000 quads by 3.33538% +/- 2.05846% (n=21 before, 15 after)	2016-10-06 18:29:12 -07:00
Eric Anholt	bca9a58d04	vc4: Drop dead argument from vc4_start_draw().	2016-10-06 18:09:24 -07:00
Eric Anholt	9421a6065c	vc4: Fix fallback to quad clears of depth in GLX. The fix in the vc4-jobs series ended up triggering the fallback path on GLX apps that use depth but not stencil.	2016-10-06 18:09:24 -07:00
Eric Anholt	8810270d06	vc4: Add the format name in miptree_debug. I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format of "0" didn't tell me much.	2016-10-06 18:09:24 -07:00
Eric Anholt	ee577e7fa7	vc4: Fix perf debug formatting on partial Z/S clear.	2016-10-06 18:09:24 -07:00
Eric Anholt	7c7bcbbc7d	vc4: Drop destination register when it's unused. This slightly reduces instructions on shader-db, but I think it's just perturbing register allocation -- the allocator should have always trivially colored these nodes, before. This commit is just to make QIR code failing more intelligible when register allocation fails.	2016-10-06 18:09:24 -07:00
Eric Anholt	d4ae5ca823	vc4: Fix live intervals analysis for screening defs in if statements. If a conditional assignment is only conditioned on the exec mask, that's still screening off the value in the executed channels (and, since we're not storing to the unexecuted channels, we don't care what's in there). Fixes a bunch of extra register pressure on Processing's Ribbons demo, which is failing to allocate.	2016-10-06 18:09:24 -07:00
Eric Anholt	06cc3dfda4	vc4: Fix simulator when more than one vc4_screen is opened. We would assertion fail in setting up the simulator the second time around. This at least postpones the assertion failure until we've closed all of the first set of screens and started opening a new set.	2016-10-06 18:09:24 -07:00
Eric Anholt	b30205b112	vc4: Fix assertion fails from trying to cast non-ALU instrs to ALU. Fixes 100 piglit tests since the assertions were added to nir.h. What's amazing is that these tests used to pass, even when casting garbage.	2016-10-06 18:09:24 -07:00
Jason Ekstrand	c81ec84c1e	anv/cmd_buffer: Move the clear_subpasses calls to set_subpass Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	b548fdbed5	anv/cmd_buffer: Don't call set_subpass in a secondary Initially, we had intended set_subpass to be an interesting function that did whatever (presumably a lot) setup we needed for a subpass. In reality, it just sets a pointer and a dirty bit and then emits depth and stencil state. When we call BeginCommandBuffer on a secondary, there's no point in setting depth and stencil state since it will already be set by the primary. Instead, the only thing we need to do at the start of a secondary is set the subpass pointer and the dirty bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	fe4e276b02	anv/cmd_buffer: Rework descriptor dirtying in set_subpass We have a DIRTY_RENDER_TARGETS flag and that makes a lot more sense than just dirtying fragment descriptors. We're checking for it in some of the gen7 code but unfortunately, nothing was setting it and it didn't do what it was supposed to do in cmd_buffer_flush_state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	a1db0e87ff	anv/wsi: Advertise UNORM formats as well as sRGB Because WSI images are created with VkImageCreateInfo::flags explicitly set to 0, they don't ever have the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set. This means that you can't create an image view of it with a different format so applications can't render directly in sRGB (without automatic encoding) unless we actually advertise UNORM formats. There are a lot of applications that want to do their own sRGB conversion, so we should allow for that. We do, however, make UNORM come after sRGB in the list so that the default for dumb apps that just grab the first thing is to render in linear and let the sRGB conversion happen automatically. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 16:52:31 -07:00
Dave Airlie	5267124648	radv: fix configure.ac check This should be positive test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:28:03 +10:00
Gustaw Smolarczyk	24815bd7b3	radv: Skip already signalled fences. If the user created a fence with VK_FENCE_CREATE_SIGNALED_BIT set, we shouldn't fail to wait for a fence if it was not submitted since that is not necessary. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:24:09 +10:00
Dave Airlie	f4e499ec79	radv: add initial non-conformant radv vulkan driver This squashes all the radv development up until now into one for merging. History can be found: https://github.com/airlied/mesa/tree/semi-interesting This requires llvm 3.9 and is in no way considered a conformant vulkan implementation. It can run a number of vulkan applications, and supports all GPUs using the amdgpu kernel driver. Thanks to Intel for providing anv and spirv->nir, and Emil Velikov for reviewing build integration. Parts of this are: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Authors: Bas Nieuwenhuizen and Dave Airlie Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-07 09:16:09 +10:00
Samuel Pitoiset	28ecd3eac2	nv50/ir: fix wrong check when optimizing MAD to SHLADD Checking if MAD is supported is definitely wrong, and it's more likely a typo I introduced few days ago which breaks NV50 because SHLADD is not supported there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-07 01:13:06 +02:00
Lionel Landwerlin	0b10152b80	intel: aubinator: use getopt to parse arguments Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>	2016-10-07 00:05:56 +01:00
Samuel Pitoiset	a198883bf7	nvc0: dump program binary only when NV50_PROG_DEBUG is set When the chipset is forced with NV50_PROG_CHIPSET, we actually only want to output the binary if NV50_PROG_DEBUG is also enabled. Otherwise, this pollutes the shader-db output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 01:01:17 +02:00
Jason Ekstrand	325b3fd668	nir: Fix the control flow tests for nir_loop_first_block changes Commit `2ed17d46de` changed nir_loop_first_cf_node and friends to return a nir_block instead of a nir_cf_node. This broke one of the NIR control flow tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98128	2016-10-06 15:48:30 -07:00
Samuel Pitoiset	e3f586c98d	docs: mark ARB_compute_variable_group_size as done for nvc0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	56a0bed2c1	nvc0: expose ARB_compute_variable_group_size Only expose 512 threads/block on Fermi to not be limited by 32 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	11e75fffeb	nv50/ir: set number of threads/block for variable local size When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. This allows to use 64 GPRs/thread. v4: - use 512 threads on Fermi, 1024 on Kepler+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	590734fa0d	st/mesa: expose ARB_compute_variable_group_size This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK is set. v3: - initialize max_variable_threads_per_block to 0 v2: - expose the ext based on that new cap Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dfd7734cb7	st/mesa: add support for dispatching a variable local size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	e78bd48b9c	st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE which represents the block size in threads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	07bb4513c6	gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	48de9aaa72	glsl: add gl_LocalGroupSizeARB as a system value v2: - only add it if the ext is enabled (Ilia) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dee627a16e	glsl/linker: handle errors when a variable local size is used Compute shaders can now include a fixed local size as defined by ARB_compute_shader or a variable size as defined by ARB_compute_variable_group_size. v2: - update formatting spec quotations (Ian) - various cosmetic changes (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	008e785f74	glsl: reject compute shaders with fixed and variable local size The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. v2: - update formatting spec quotations (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	dd2bda7002	glsl: process local_size_variable input qualifier This is the new layout qualifier introduced by ARB_compute_variable_group_size which allows to use a variable work group size. v4: - add missing '%s' in the monster format string Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	d5c8481d57	glsl: add enable flags for ARB_compute_variable_group_size This also initializes the default values for the standalone compiler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	45ab63c0cb	mesa/main: add support for ARB_compute_variable_groups_size v5: - replace fixed_local_size by !LocalSizeVariable (Nicolai) v4: - slightly indent spec quotes (Nicolai) - drop useless _mesa_has_compute_shaders() check (Nicolai) - move the fixed local size outside of the loop (Nicolai) - add missing check for invalid use of work group count v2: - update formatting spec quotations (Ian) - move the total_invocations check outside of the loop (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Samuel Pitoiset	a063f3084a	glapi: add entry points for GL_ARB_compute_variable_group_size v2: - correctly sort that new extension (Ian) - fix up the comment (Ian) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-07 00:18:57 +02:00
Karol Herbst	f96945c5b5	nv50/ir: optimize sub(a, 0) to a helped some ue4 demos and divinity OS shaders total instructions in shared programs : 2818674 -> 2818606 (-0.00%) total gprs used in shared programs : 379273 -> 379273 (0.00%) total local used in shared programs : 9505 -> 9505 (0.00%) total bytes used in shared programs : 25837792 -> 25837192 (-0.00%) local gpr inst bytes helped 0 0 33 33 hurt 0 0 0 0 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2016-10-06 19:39:51 +02:00
Brian Paul	6963f94e98	st/mesa: move all sampler view code into new st_sampler_view.[ch] files Previously, the sampler view code was scattered across several different files. Note, the previous REALLOC(), FREE() for st_texture_object::sampler_views are replaced by realloc(), free() to avoid conflicting macros in Mesa vs. Gallium. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	e5cc84dd43	st/mesa: optimize pipe_sampler_view validation Before, st_get_texture_sampler_view_from_stobj() did a lot of work to check if the texture parameters matched the sampler view (format, swizzle, min/max lod, first/last layer, etc). We did this every time we validated the texture state. Now, we use a ctx->Driver.TexParameter() callback and a couple other checks to proactively release texture views when we know that view-related parameters have changed. Then, the validation step is simplified: - Search the texture's list of sampler views (just match the context). - If found, we're done. - Else, create a new sampler view. There will never be old, out-of-date sampler views attached to texture objects that we have to test. Most apps create textures and set the texture parameters once. This makes sampler view validation much cheaper for that case. Note that the old texture/sampler comparison code has been converted into a set of assertions to verify that the sampler view is in fact consistent with the texture parameters. This should help to spot any potential regressions. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	0f3aee888e	mesa: call ctx->Driver.TexParameter() in texture_buffer_range() To inform drivers of texture buffer offset/size changes, as we do for other texture object parameters. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	b3127a96a9	st/mesa: consolidate view format setup code Before, we had code to compute the sampler view's format spread across two different functions: in update_single_texture() and st_get_texture_sampler_view_from_stobj(). Now it's all in one new function. Also, use _mesa_texture_base_format() to simplify the code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	628e651f64	st/mesa: add some const qualifiers in st_atom_texture.c And minor code reformatting. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:32 -06:00
Brian Paul	b3c8935165	st/mesa: simplify some code in get_texture_format_swizzle() There's no need to cast to st_texture_image. Just use gl_texture_image. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-06 11:29:31 -06:00
Brian Paul	9add37b100	mesa: make _mesa_texture_buffer_range() static Not called from any other file. Also, add a comment. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Brian Paul	92188c207e	mesa: add const qualifier, comment on can_avoid_reallocation() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Brian Paul	57279c5454	mesa: add comment/assertion on get_tex_level_parameter_buffer() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-06 11:29:31 -06:00
Jason Ekstrand	ae032e5ea6	nir: Remove some no longer needed asserts Now that the NIR casting functions have type assertions, we have a bunch of assertions that aren't needed anymore. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:39 -07:00
Jason Ekstrand	2ed17d46de	nir: Make nir_foo_first/last_cf_node return a block instead One of NIR's invariants is that control flow lists always start and end with blocks. There's no good reason why we should return a cf_node from these functions since we know that it's always a block. Making it a block lets us remove a bunch of code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:37 -07:00
Jason Ekstrand	7a3bcadf4e	nir: Add asserts to the casting functions This makes calling nir_foo_as_bar a bit safer because we're no longer 100% trusting in the caller to ensure that it's safe. The caller still needs to do the right thing but this ensures that we catch invalid casts with an assert rather than by reading garbage data. The one downside is that we do use the casts a bit in nir_validate and it's not a validate_assert. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-10-06 09:16:24 -07:00
Steven Toth	e00fdd643b	gallium/hud: Remove superfluous debug No longer required. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:37:06 +01:00
Emil Velikov	03350c9708	amd: add amd_kernel_code_t.h to the sources list Otherwise it won't be picked in the tarball and the build will fail. Fixes: `91ec6e5664` ("radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:17:51 +01:00
Emil Velikov	b634be0e69	svga: add svga_mksstats.h to the sources list Otherwise it won't be picked in the tarball and the build will fail. Fixes: `0035f7f136` ("svga: add guest statistic gathering interface") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 16:17:09 +01:00
Emil Velikov	78a7415f0b	glx: rename choose_visual(), drop const argument The function deals with fb (style) configs, thus using "visual" in the name is misleading. Which in itself had led to the use of fbconfig_style_tags argument. Rename the function to reflect what it does and drop the unneeded argument. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:47 +01:00
Emil Velikov	2e9e05dfca	glx: return GL_FALSE from glx_screen_init where applicable. Return GL_FALSE if we fail to find any fb/visual configs, otherwise we end up with all sorts of chaos further down the GLX stack. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:47 +01:00
Emil Velikov	e542ed463d	glx: correctly mask the drawableType for GLX_ARB_fbconfig_float The comment/spec says - only for pbuffer drawables, while the code clears the window/pixmap bit. Practise what you preach and apply the trivial tweak. In practise this should not cause functional change. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-06 15:03:46 +01:00
Chuck Atkins	a89faa2022	autoconf: Make header install distinct for various APIs (v2) This fixes a problem where GL headers would only get installed if glx was enabled. So if osmesa was enabled but not glx, then the GL headers required by osmesa would be missing from the install. v2: Dropped unneeded mesa_glinterop.h redundant osmesa.h install Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0216a16819	mesa: annotate AttribFuncsARB[] as const It's read-only data, so annotate it accordingly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0728e2bb17	mapi/glapi: remove unused _glapi_check_table() Similar to earlier commit - symbol was never part of the public API so we're safe to remove it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	96b9ec1ea3	glapi/hgl: remove the final user of _glapi_check_table() The symbol is a no-op since, the EXTRA_DEBUG macro is not set in the build. Unused by !Haiku people/platforms since 2010 (commit `a73c6540d9`) while the Haiku C++ wrapper has no obvious users. Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	79835565c3	mapi/glapi: remove unused _glapi_check_table_not_null Function was never part of the API/ABI and the final user was removed with commit `a73c6540d9`, back in 2010. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	9b7fd4080a	st/xvmc/tests: force enable assertions Similar to the other 'tests', enable assertions in xvmc_bench. This silences the GCC warnings about unused-variable(s) and makes the program actually useful, as the XvMC API is then called. Atm the function calls are omitted, since they're called within the assert. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	0b6837a643	anv: automake: ship intel_icd.json.in in the tarball Otherwise we'll fail to (re)generate intel_icd.json. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Emil Velikov	a42115d6e2	intel: automake: reference the correct header The header was renamed with earlier commit, so update the Makefile.sources respectively. {vulkan/genX_multisample.h => common/gen_sample_positions.h} Fixes: c779ad3e661("intel: Move Vulkan sample positions to common code") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-06 15:03:46 +01:00
Lionel Landwerlin	b84234fd28	intel: aubinator: add missing return characters Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 10:39:53 +01:00
Kenneth Graunke	f7659e02c3	nir: Delete open coded type printing. glsl_print_type() prints arrays of arrays incorrectly. For example, a type with name float[3][7] would be printed as float[7][3]. (This is an array of length 3 containing arrays of 7 floats.) cdecl says that the type name is correct. glsl_print_type() doesn't really do anything above and beyond printing type->name, and glsl_print_struct() wasn't used at all. So, drop them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-06 02:13:36 -07:00
Philipp Zabel	0408d50f43	anv: fix GetPhysicalDeviceProperties to return timestampPeriod in ns According to chapters 16.5. (Timestamp Queries) and 30.2 (Limits) of the Vulkan Specification 1.0.29, the .limits.timestampPeriod field returned by vkGetPhysicalDeviceProperties is measured in nanoseconds, not in seconds. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 02:02:35 -07:00
Timothy Arceri	88428fbe41	i965: remove remaining tabs in brw_draw.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:16 +11:00
Timothy Arceri	7627fbd9b0	i965: get inputs read from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:09 +11:00
Timothy Arceri	7ef8286487	i965: get outputs written from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:04:03 +11:00
Timothy Arceri	b526a9b708	i965: get outputs read from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:57 +11:00
Timothy Arceri	a38c809f6e	i965: remove remaining tabs in brw_wm.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:52 +11:00
Timothy Arceri	201f940d2e	mesa: remove the UsesDFdy flag Seems the last user of this was removed in `08bc74e69`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:46 +11:00
Timothy Arceri	556335eb99	i965: get uses discard from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:40 +11:00
Timothy Arceri	ee829cba8e	i965: get uses texture gather from nir info This is a step towards dropping the GLSL IR version of do_set_program_inouts() in i965 and moving towards native nir support. This is important because we want to eventually convert to nir and use its optimisations passes before we can call this GLSL IR pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-06 16:03:00 +11:00
Kenneth Graunke	a85a8ecd32	i965: Eliminate brw->cs.prog_data pointer. Just say no to: - brw->cs.base.prog_data = &brw->cs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_cs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:35 -07:00
Kenneth Graunke	16d5536e55	i965: Eliminate brw->wm.prog_data pointer. Just say no to: - brw->wm.base.prog_data = &brw->wm.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_wm_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:35 -07:00
Kenneth Graunke	ff366f3db4	i965: Eliminate brw->gs.prog_data pointer. Just say no to: - brw->gs.base.prog_data = &brw->gs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_gs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:33 -07:00
Kenneth Graunke	e512941537	i965: Eliminate brw->tes.prog_data pointer. Just say no to: - brw->tes.base.prog_data = &brw->tes.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tes_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:09 -07:00
Kenneth Graunke	82c97ac710	i965: Eliminate brw->tcs.prog_data pointer. Just say no to: - brw->tcs.base.prog_data = &brw->tcs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_tcs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:09 -07:00
Kenneth Graunke	40258a13d5	i965: Eliminate brw->vs.prog_data pointer. Just say no to: - brw->vs.base.prog_data = &brw->vs.prog_data->base.base; We'll just use the brw_stage_prog_data pointer in brw_stage_state and downcast it to brw_vs_prog_data as needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:21:06 -07:00
Kenneth Graunke	e51e055fcd	i965: Introduce downcast helpers for prog_data structures. Similar to brw_context(...), intel_texture_object(...), and so on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-05 19:20:42 -07:00
Chad Versace	74b02a7449	i965/sync: Rename awkward variable What is the difference between a 'driver_fence' and a 'fence'? Do the characters 'driver_' add anything helpful? Nope. They do, though, add an extra 7 chars and pull your eyeballs away to ask "huh? what's that?" one microsecond too many. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:25 -07:00
Chad Versace	a99ff82714	i965/sync: Rename intel_syncobj.c -> brw_sync.c Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:25 -07:00
Chad Versace	9ea48fc877	i965/sync: Replace 'intel' prefix with 'brw' This is yet another patch for the great renaming begun long ago. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:24 -07:00
Chad Versace	ce1d67c2e5	i965/sync: Fix uninitialized usage and leak of mutex We locked an uninitialized mutex in the callstack glClientWaitSync -> intel_gl_client_wait_sync -> brw_fence_client_wait_sync because we forgot to initialize it in intel_gl_fence_sync. (The EGLSync codepath didn't have this bug. It initialized the mutex in intel_dri_create_sync). We also forgot to tear down (mtx_destroy) the mutex when destroying the sync object. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-05 17:09:24 -07:00
Jason Ekstrand	28ab2570c8	nir: Use the correct infos structure for copying atomic sources Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: "12.0" <mesa-dev@lists.freedestkop.org>	2016-10-05 13:04:54 -07:00
Samuel Pitoiset	a41cfbbf2b	nvc0: dump program binary when chipset has been forced Currently, program binaries are only dumped at upload time, but when the chipset has been forced via NV50_PROG_CHIPSET we might want to show the generated code, especially with shaderdb. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-05 21:15:44 +02:00
Marek Olšák	cc4a19c4ad	radeonsi: fix texture border colors for compute shaders There are VM faults without this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:54 +02:00
Marek Olšák	844f8268e1	gallium/radeon/winsyses: set reasonable max_alloc_size which is returned for GL_MAX_TEXTURE_BUFFER_SIZE. It doesn't have any other use at the moment. Bigger allocations are not rejected. This fixes GL45-CTS.texture_buffer.texture_buffer_max_size on Bonaire. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:54 +02:00
Marek Olšák	1b37e5541c	radeonsi: fix interpolateAt opcodes for .zw components Not returning garbage in .zw seems pretty important. This fixes: GL45-CTS.shader_multisample_interpolation.render.interpolate_at__check. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	300a8221e9	radeonsi: add assertions to validate interpolation flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	d4a8bf89ce	radeonsi: interpolate colors after interpolation weight shuffling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	faee2d6dda	tgsi/scan: don't set interp flags for inputs only used by INTERP (v2) (v1 pushed, then reverted) This fixes 9 randomly failing tests on radeonsi: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* v2: use input_interpolate[input] (correct) instead of input_interpolate[index] (incorrect) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Marek Olšák	10e5f126dd	ddebug: dump most driver information with GALLIUM_DDEBUG=always Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-05 21:03:23 +02:00
Karol Herbst	d8bcd3ef37	nv50/ra: let simplify return an error and handle that fixes a crash in the case simplify reports an error Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-10-05 19:11:42 +02:00
Nanley Chery	f315c4f189	intel/blorp: Use documented RECTLIST vertex positions Use the vertex positions described in the PRMs. This has no effect on rendering but quiets the simulator warnings seen when the vertices appear out of order. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-10-05 09:41:21 -07:00
Jason Ekstrand	e3a1d33077	anv/meta: Roll clear_image into CmdClearDepthStencilImage It is now the only caller so there's no sense in keeping things split out. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-05 09:33:44 -07:00
Jason Ekstrand	f027609a64	anv: Use blorp for VkCmdFillBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-05 09:33:44 -07:00
Kyle Brenneman	ca9f26ac6f	egl: Implement EGL_KHR_debug (v2) Wire up the debug entrypoints to EGL dispatch, and add the extension string to the client extension list. v2: - Lots of style fixes - Fix missing EGLAPIENTRYs - Factor out valid attribute check - Lock display in eglLabelObjectKHR as needed, and use RETURN_EGL_* - Move "EGL_KHR_debug" into asciibetical order in client extension string Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.veliko@collabora.com>	2016-10-05 11:41:26 -04:00
Kyle Brenneman	6a5545d3ba	egl: Track EGL_KHR_debug state when going through EGL API calls (v3) This decorates every EGL entrypoint with _EGL_FUNC_START, which records the function name and primary dispatch object label in the current thread state. It also adds debug report functions and calls them when appropriate. This would be useful enough for debugging on its own, if the user set a breakpoint when the report function was called. We will also need this state tracked in order to expose EGL_KHR_debug. v2: - Clear the object label in more cases in _eglSetFuncName - Pass draw surface (if any) to _EGL_FUNC_START in eglSwapInterval v3: - Set dummy thread's CurrentAPI to EGL_OPENGL_ES_API not zero - Less ?: in _eglSetFuncName Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.veliko@collabora.com>	2016-10-05 11:40:51 -04:00
Lionel Landwerlin	f8b861a867	intel: aubinator: pack supported generations into an array Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-05 16:23:28 +01:00
Ben Widawsky	2dc06e2324	i965/l3: Add explicit way size calculation for bxt There should be no functional change here because Broxton and CHV are both gt1. Without this code however, it might seem like broxton support is missing. While here, put the gt1 check in front to hopefully short-circuit the condition for the mobile cases. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-05 07:57:58 -07:00
Nicolai Hähnle	11cc59afca	virgl: Fix build regression of commit `8a943564`	2016-10-05 16:27:29 +02:00
Nicolai Hähnle	0cba7b771a	st/mesa: enable GL_KHR_robustness The difference to the virtually identical ARB_robustness (which is already enabled unconditionally) is minuscule and handled elsewhere, but this cap seems like the right thing to require for this extension. v2: drop the device reset cap requirement (Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-05 15:51:59 +02:00
Nicolai Hähnle	b5cd7dfe3e	gallium/radeon: implement set_device_reset_callback Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:56 +02:00
Nicolai Hähnle	a1fa8b731f	st/mesa: set a device reset callback when available Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:53 +02:00
Nicolai Hähnle	d856130025	st/mesa: extract conversion from pipe_reset_status to GLenum Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:49 +02:00
Nicolai Hähnle	07bea09c64	ddebug: add pass-through of set_device_reset_callback Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:47 +02:00
Nicolai Hähnle	1a3c75e30e	gallium: add pipe_context::set_device_reset_callback Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:51:34 +02:00
Nicolai Hähnle	8a943564fd	virgl: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:22 +02:00
Nicolai Hähnle	2a83036fe2	vc4: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:20 +02:00
Nicolai Hähnle	0334ba150f	freedreno: use the new parent/child pools for transfers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:17 +02:00
Nicolai Hähnle	616e36674a	r300: use the new parent/child pools for transfers (v2) v2: slab_alloc_st -> slab_alloc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:13 +02:00
Nicolai Hähnle	e56e1f8119	gallium/radeon: use the new parent/child pools for transfers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:07 +02:00
Nicolai Hähnle	d8cff811df	util/slab: re-design to allow migration between pools (v3) This is basically a re-write of the slab allocator into a design where multiple child pools are linked to a parent pool. The intention is that every (GL, pipe) context has its own child pool, while the corresponding parent pool is held by the winsys or screen, or possibly the GL share group. The fast path is still used when objects are freed by the same child pool that allocated them. However, it is now also possible to free an object in a different pool, as long as they belong to the same parent. Objects also survive the destruction of the (child) pool from which they were allocated. The slow path will return freed objects to the child pool from which they were originally allocated. If that child pool was destroyed, the corresponding page is considered an orphan and will be freed once all objects in it have been freed. This allocation pattern is required for pipe_transfers that correspond to (GL) buffer object mappings when the mapping is created in one context which is later destroyed while other contexts of the same share group live on -- see the bug report referenced below. Note that individual drivers do need to migrate to the new interface in order to benefit and fix the bug. v2: use singly-linked lists everywhere v3: use p_atomic_set for page->u.num_remaining Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894	2016-10-05 15:40:40 +02:00
Nicolai Hähnle	8915f0c0de	util: use GCC atomic intrinsics with explicit memory model This is motivated by the fact that p_atomic_read and p_atomic_set may somewhat surprisingly not do the right thing in the old version: while stores and loads are de facto atomic at least on x86, the compiler may apply re-ordering and speculation quite liberally. Basically, the old version uses the "relaxed" memory ordering. The new ordering always uses acquire/release ordering. This is the strongest possible memory ordering that doesn't require additional fence instructions on x86. (And the only stronger ordering is "sequentially consistent", which is usually more than you need anyway.) I would feel more comfortable if p_atomic_set/read in the old implementation were at least using volatile loads and stores, but I don't see a way to get there without typeof (which we cannot use here since the code is compiled with -std=c99). Eventually, we should really just move to something that is based on the atomics in C11 / C++11. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-05 15:39:39 +02:00
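The acquire/release pairing described in the commit above can be sketched with the GCC `__atomic` builtins. This is a minimal illustration, not the actual Mesa header: the `sketch_*` names, the `payload`/`ready` variables, and the publish/consume pattern are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrappers in the spirit of p_atomic_read/p_atomic_set;
 * the real Mesa macros may be spelled differently. */
static inline int32_t sketch_atomic_read(int32_t *p)
{
    /* Acquire: memory accesses after this load are not moved before it. */
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

static inline void sketch_atomic_set(int32_t *p, int32_t v)
{
    /* Release: memory accesses before this store are not moved after it. */
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
}

static int32_t payload; /* plain data, written before the flag is raised */
static int32_t ready;   /* flag accessed only through the wrappers */

/* Writer side: fill in the payload, then publish it with a release store. */
void sketch_publish(int32_t value)
{
    payload = value;
    sketch_atomic_set(&ready, 1);
}

/* Reader side: returns 1 and copies the payload once the flag is visible. */
int sketch_try_consume(int32_t *out)
{
    assert(out != NULL);
    if (!sketch_atomic_read(&ready))
        return 0;
    *out = payload;
    return 1;
}
```

With relaxed ordering the compiler would be free to move the `payload` store past the flag store; the release/acquire pair is what rules that out.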
Lionel Landwerlin	d51c1f9d51	i965: use L3 data cache for SSBOs Anv programs the hardware to use the L3 data cache if either SSBOs or images are used in the shaders; we can program i965 the same way. gl_shader_program has a somewhat confusingly named field, 'NumAtomicBuffers'. It doesn't tell how many buffers are accessed by the shader in an atomic way but instead the number of atomic counters manipulated by the shader. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-10-05 12:24:04 +01:00
Kenneth Graunke	a40640f530	mesa: Raise INVALID_ENUM in FramebufferTextureD for unknown textargets. ES3-CTS.functional.negative_api.buffer.framebuffer_texture2d expects glFramebufferTexture[123]D to raise GL_INVALID_ENUM when supplied a completely bogus textarget parameter (i.e. 0xffffffff). This is at odds with the spec. GLES 3.1 says: "An INVALID_OPERATION error is generated if texture is not zero and textarget is not one of TEXTURE_2D, TEXTURE_2D_MULTISAMPLE, or one of the cube map face targets from table 8.21." (and GLES 3.0 and GL 4.5 both have similar text). However, GL has a general guideline that says: "If a command that requires an enumerated value is passed a symbolic constant that is not one of those specified as allowable for that command, an INVALID_ENUM error is generated." Apparently other vendors reconcile these two rules as follows: GL should raise INVALID_OPERATION for actual texture target enumeration values which are not allowed for this particular glFramebufferTextureD call. Any value that is not a texture target should result in GL_INVALID_ENUM. For example, glFramebufferTexture2D with GL_TEXTURE_1D would result in INVALID_OPERATION because it is a real texture target, but not allowed for the 2D version of the function. But calling it with GL_FRONT would result in INVALID_ENUM, as that isn't even a texture target. Fixes: - {ES3-CTS,dEQP-GLES3}.functional.negative_api.buffer.framebuffer_texture2d - {ES31-CTS,ES32-CTS,dEQP-GLES31}.functional.debug.negative_coverage.get_error.buffer.framebuffer_texture2d References: https://gitlab.khronos.org/opengl/cts/merge_requests/387 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 21:10:24 -07:00
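The reconciliation rule in the commit above (a real but disallowed texture target gives INVALID_OPERATION, a non-target value gives INVALID_ENUM) might look roughly like the sketch below for the 2D entry point. The function name and the `SKETCH_` constants are assumptions for illustration; the numeric values mirror the public OpenGL headers, and the list of "real but disallowed" targets is deliberately abbreviated.

```c
#include <assert.h>

/* Values copied from the public OpenGL headers, prefixed so the
 * sketch compiles without any GL headers installed. */
#define SKETCH_GL_NO_ERROR                     0
#define SKETCH_GL_INVALID_ENUM                 0x0500
#define SKETCH_GL_INVALID_OPERATION            0x0502
#define SKETCH_GL_FRONT                        0x0404
#define SKETCH_GL_TEXTURE_1D                   0x0DE0
#define SKETCH_GL_TEXTURE_2D                   0x0DE1
#define SKETCH_GL_TEXTURE_3D                   0x806F
#define SKETCH_GL_TEXTURE_CUBE_MAP_POSITIVE_X  0x8515
#define SKETCH_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z  0x851A
#define SKETCH_GL_TEXTURE_2D_MULTISAMPLE       0x9100

/* Hypothetical check for glFramebufferTexture2D's textarget argument. */
unsigned sketch_check_textarget_2d(unsigned textarget)
{
    /* The six cube map faces are consecutive enum values. */
    if (textarget >= SKETCH_GL_TEXTURE_CUBE_MAP_POSITIVE_X &&
        textarget <= SKETCH_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z)
        return SKETCH_GL_NO_ERROR;

    switch (textarget) {
    case SKETCH_GL_TEXTURE_2D:
    case SKETCH_GL_TEXTURE_2D_MULTISAMPLE:
        return SKETCH_GL_NO_ERROR;
    /* Real texture targets that this entry point does not accept
     * (abbreviated list). */
    case SKETCH_GL_TEXTURE_1D:
    case SKETCH_GL_TEXTURE_3D:
        return SKETCH_GL_INVALID_OPERATION;
    /* Anything else is not a texture target at all. */
    default:
        return SKETCH_GL_INVALID_ENUM;
    }
}
```

A single top-level switch over all known targets keeps the "is this a texture target at all?" question separate from "is it allowed here?".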
Kenneth Graunke	aecdb21be8	mesa: Reorganize check_textarget(). Having one top-level switch statement covering all known texture targets will make the next change easier to implement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 21:10:05 -07:00
Kenneth Graunke	53b8f6374f	aubinator: use the correct format specifier for printing ptrdiff_t. Fixes more warnings in 32-bit builds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-10-04 17:28:01 -07:00
Kenneth Graunke	af41e1a500	aubinator: Use less -RS instead of -r for the implicit pager. From the less man page: "Warning: when the -r option is used, less cannot keep track of the actual appearance of the screen (since this depends on how the screen responds to each type of control character). Thus, various display problems may result, such as long lines being split in the wrong place." Lines which are too long to fit in the terminal would be word wrapped, but unfortunately less would get confused about which line it was on, and text would be drawn on top of other text. The most noticeable case was shader assembly, which is frequently too wide for an 80 character terminal, and thus would be drawn on top of the following state packets, making them completely unreadable. Using -R instead of -r fixes this problem by only allowing color escape sequences. (Notably, Git's implicit pager invocation uses -R.) Unfortunately, it means our "clear to the end of the line" hack for extending the blue bar headers won't work anymore. Word wrapping usually isn't terribly readable, anyway, so we also add the -S option (chop long lines) to restrict it to the terminal width. (You can hit the left and right arrow keys to scroll sideways.) Then, for a new blue bar hack, we can use a printf specifier to pad the command packet names to be 80 characters long (arbitrarily), which extends them "far enough" to look good, and doesn't require us to use ioctls to determine the terminal width. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>	2016-10-04 17:25:46 -07:00
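The printf padding trick mentioned in the commit above can be sketched as follows. The helper name and the exact escape sequences (ANSI blue background, then reset) are assumptions for illustration; only the idea of a fixed 80-column pad comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a packet name as a "blue bar" header.
 * "%-80s" left-justifies the name and pads it with spaces to 80 columns,
 * so the coloured background extends without querying the terminal width. */
int sketch_format_header(char *buf, size_t size, const char *name)
{
    assert(buf != NULL && name != NULL);
    /* \033[44m = blue background, \033[0m = reset attributes (assumed). */
    return snprintf(buf, size, "\033[44m%-80s\033[0m", name);
}
```

Because only colour escape sequences are emitted, the output stays compatible with `less -R`; names longer than 80 characters are printed in full and simply get no padding.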
Kenneth Graunke	8a484a63f8	i965: Drop _NEW_TRANSFORM from 3DSTATE_VS atom on Gen7. The atom that uploads push constants listens to _NEW_TRANSFORM for legacy clip plane handling. On Sandybridge, the gen6_vs_state atom emits 3DSTATE_CONSTANT_VS as well as 3DSTATE_VS, so it needs to listen to the same set of conditions. However, it looks like Gen7 doesn't need this. The push constant atom emits 3DSTATE_CONSTANT_VS directly, and the gen7_vs_state atom that emits 3DSTATE_VS doesn't have a dependency on ctx->Transform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:21:40 -07:00
Kenneth Graunke	d3cc3d28bd	i965: Fix brw_clear_cache to clean up TCS/TES shaders. We need to free prog_data for TCS/TES too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arcero@collabora.com>	2016-10-04 17:09:08 -07:00
Kenneth Graunke	bab1c05634	i965: Add missing BRW_CS_PROG_DATA to CS work group surface atom. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	ce6c80ebbb	i965: Add missing BRW_NEW_CS_PROG_DATA to compute constant atom. CACHE_NEW_CS_PROG hasn't existed in quite a long time...the old comment was there, but not the actual bit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	f2b9b0c730	i965: Add missing BRW_NEW_FS_PROG_DATA to render target reads. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	0047d600af	i965: Move BRW_NEW_FRAGMENT_PROGRAM from 3DSTATE_PS to PS_EXTRA. 3DSTATE_PS doesn't need this. 3DSTATE_PS_EXTRA however does, for brw_color_buffer_write_enabled(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	28e1538be7	i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Kenneth Graunke	78df96256b	i965: Fix missing _NEW_TRANSFORM in Gen8+ 3DSTATE_DS atom. Needed for user clip plane enables. Broken since this code was introduced. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 17:09:07 -07:00
Ian Romanick	40dd45d0c6	i965: Enable ARB_shader_atomic_counter_ops Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 16:53:32 -07:00
Ian Romanick	3d2011cb33	i965: Refactor emission of atomic counter operations This will make it easier to add more operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 16:53:32 -07:00
Ian Romanick	7cd0b3084c	nir/intrinsics: Add more atomic_counter ops v2: Delete some stray debug code noticed by Iago. v3: Massive rebase on new ir_function_signature::intrinsic_id mechanism. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	2c9a17ac79	nir/intrinsics: Include atomic_counter_ in the names used in macro invocations Otherwise grepping for where atomic_counter_inc and friends are defined is a very frustrating experience. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	c42fe30c86	glsl: Kill __intrinsic_atomic_sub Just generate an __intrinsic_atomic_add with a negated parameter. Some background on the non-obvious reasons for the big change to builtin_builder::call()... this is cribbed from some discussion with Ilia on mesa-dev. Why change builtin_builder::call() to allow taking dereferences and create them here rather than just feeding in the ir_variables directly? The problem is the neg_data ir_variable node would have to be in two lists at the same time: the instruction stream and parameters. The ir_variable node is automatically added to the instruction stream by the call to make_temp. Restructuring the code so that the ir_variables could be in parameters then move them to the instruction stream would have been pretty terrible. ir_call in the instruction stream has an exec_list that contains ir_dereference_variable nodes. The builtin_builder::call method previously took an exec_list of ir_variables and created a list of ir_dereference_variable. All of the original users of that method wanted to make a function call using exactly the set of parameters passed to the built-in function (i.e., call __intrinsic_atomic_add using the parameters to atomicAdd). For these users, the list of ir_variables already existed: the list of parameters in the built-in function signature. This new caller doesn't do that. It wants to call a function with a parameter from the function and a value calculated in the function. So, I changed builtin_builder::call to take a list that could either be a list of ir_variable or a list of ir_dereference_variable. In the former case it behaves just as it previously did. In the latter case, it uses (and removes from the input list) the ir_dereference_variable nodes instead of creating new ones. 
text data bss dec hex filename 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so before 6036923 283160 28608 6348691 60df93 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:32 -07:00
Ian Romanick	bb290b5679	glsl: Remove ir_function_signature::_is_intrinsic field text data bss dec hex filename 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so before 6036395 283160 28608 6348163 60dd83 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	acfcc7bbfa	glsl: Add ir_function_signature::is_intrinsic() method This necessitated renaming the is_intrinsic field to _is_intrinsic. The next commit will remove the field. text data bss dec hex filename 6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so before 6036491 283160 28608 6348259 60dde3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	b7df52b106	glsl: Use the ir_intrinsic_* enums instead of the __intrinsic_* name strings text data bss dec hex filename 6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so before 6036507 283160 28608 6348275 60ddf3 lib64/i965_dri.so after v2: s/ir_intrinsic_atomic_sub/ir_intrinsic_atomic_counter_sub/. Noticed by Ilia. v3: Silence unhandled enum in switch warnings in st_glsl_to_tgsi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	5854de99b2	glsl: Track a unique intrinsic ID with each intrinsic function text data bss dec hex filename 6037483 283160 28608 6349251 60e1c3 lib64/i965_dri.so before 6038043 283160 28608 6349811 60e3f3 lib64/i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Ian Romanick	c01f2bfc6c	glsl: Don't emit ir_binop_carry during ir_binop_imul_high lowering st_glsl_to_tgsi only calls lower_instructions once (instead of in a loop), so the ir_binop_carry generated would not get lowered. Fixes assertion failure state_tracker/st_glsl_to_tgsi.cpp:2265: void glsl_to_tgsi_visitor::visit_expression(ir_expression, st_src_reg): Assertion `!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"' failed. on softpipe in 16 piglit tests: mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-imulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/fs-umulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended-only-msb.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-imulExtended.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb-nonuniform.shader_test mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended-only-msb.shader_test 
mesa_shader_integer_functions/execution/built-in-functions/vs-umulExtended.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 16:53:31 -07:00
Timothy Arceri	0e8f1eaf41	i965: fix unused variable warning in brw_emit_gpgpu_walker() Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:14:05 +11:00
Timothy Arceri	6fdfcd4d1c	i965: add MAYBE_UNUSED to assert param Fixes unused variable warning in release build. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
Timothy Arceri	4340294af8	i965: wrap unused function in #ifndef NDEBUG This function is only ever used by an assert(). This fixes an unused function warning in release builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
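The pattern in the commit above can be sketched like this. The function names are hypothetical and unrelated to the actual i965 code; the point is only that a helper referenced solely from `assert()` disappears along with the assertion when `NDEBUG` is defined.

```c
#include <assert.h>

#ifndef NDEBUG
/* Only referenced from assert() below, so it is compiled out together
 * with the assertion in release (NDEBUG) builds and never triggers an
 * unused-function warning. */
static int sketch_is_power_of_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
#endif

/* Rounds value up to the next multiple of a power-of-two alignment. */
unsigned sketch_align(unsigned value, unsigned alignment)
{
    assert(sketch_is_power_of_two(alignment));
    return (value + alignment - 1) & ~(alignment - 1);
}
```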
Timothy Arceri	c9f1767903	i965: fix unused variable warning in gen7_block_read_scratch() Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:58 +11:00
Timothy Arceri	df4ff31d3c	i965: add MAYBE_UNUSED to assert param This fixes an unused variable warning on release builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-05 10:13:52 +11:00
Jose Fonseca	437d7e1baf	gallivm: Use AVX2 gather intrinsics. v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Roland Scheidegger	bc80741d7a	gallivm: Use 8 wide AoS sampling on AVX2. v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-10-04 23:36:20 +01:00
José Fonseca	e088390c7d	gallivm: Basic AVX2 support. v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-10-04 23:36:20 +01:00
Chad Versace	add01add1b	egl: Drop duplicate check on EGLSync type _eglInitSync checked that the display supported the sync type (such as EGL_SYNC_FENCE), and did it wrong. When the check failed it emitted EGL_BAD_ATTRIBUTE, but sometimes EGL_BAD_PARAMETER is needed. _eglCreateSync already does the error checking, and it does it right. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	02e4f1cb43	egl: Cleanup control flow in _eglParseSyncAttribList When the function encountered an error, it effectively returned immediately. However, it did so indirectly by breaking out of a loop. Replace the loop breakout with an explicit 'return'. Do the same for _eglParseSyncAttribList64 too. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	3e0d575a6d	egl: Add _eglConvertIntsToAttribs() This function converts an attribute list from EGLint[] to EGLAttrib[]. Will be used in following patches to cleanup EGLSync attribute parsing. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:29 -07:00
Chad Versace	f2c2f43d4e	egl: Fix an error path in eglCreateSync* When the user called eglCreateSync64KHR on a display without EGL_KHR_cl_event2 (the only extension that exposes it), we returned EGL_NO_SYNC but did not update the error code. We also did the same for eglCreateSync on a display without EGL 1.5. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:28 -07:00
Chad Versace	69adb9a778	egl: Fix truncation error in _eglParseSyncAttribList64 The function stores EGLAttrib values in EGLint variables. On 64-bit systems, this truncated the values. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:28 -07:00
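The truncation hazard in the commit above comes from storing a pointer-sized attribute value in a 32-bit variable. A minimal sketch, assuming stand-in typedefs (`sketch_attrib` for EGLAttrib, `sketch_int` for EGLint) rather than the real EGL headers:

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t sketch_attrib; /* stands in for EGLAttrib (pointer-sized) */
typedef int32_t  sketch_int;    /* stands in for EGLint (32-bit) */

/* Returns 1 if storing v in a sketch_int would lose information.
 * On 32-bit systems both types have the same range, so this is always 0. */
int sketch_would_truncate(sketch_attrib v)
{
    return v < INT32_MIN || v > INT32_MAX;
}
```

Keeping parsed values in the wider type end to end, as the fix does, avoids the problem without needing a range check like this one.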
Chad Versace	17084b6f93	egl: Fix missing unlock in eglGetSyncAttribKHR On the error path, eglGetSyncAttribKHR neglected to unlock the EGLDisplay before returning. Fixes deadlock in dEQP-EGL.functional.fence_sync.invalid.get_invalid_value. Cc: mesa-stable@lists.freedesktop.org Cc: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 14:11:22 -07:00
Anuj Phogat	d2112fc8d9	anv/gen7_pipeline: Fix typo in semicolon Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	1ffcf95fc4	anv/gen7_pipeline: Set sample mask field in 3DSTATE_PS Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	deeb1e95d0	anv/gen7_pipeline: Move ksp{1,2} state setting next to ksp0 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:35 -07:00
Anuj Phogat	517b1bf499	anv/gen7: Make use of local variable prog_data Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2abb7486f5	anv/gen8_pipeline: Add an assert to ensure use_alt_mode is not set in prog_data Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-04 13:20:34 -07:00
Anuj Phogat	fa04b57c15	anv/gen8_pipeline: Fix typo in semicolon Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	7daafad9ac	intel/genxml: Keep the value name 'Alternate' uniform across gen75.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	c0f02bbc57	intel/genxml: Fix typo in gen75.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	cd69d3f929	i965/gen8+: Enable GL_OES_viewport_array This patch causes 2 regressions in khronos' gles cts tests on various intel platforms. Failing tests: ES3-CTS.functional.state_query.integers.viewport_getinteger ES3-CTS.functional.state_query.integers.viewport_getfloat Here is an explanation of what's causing the failures: CTS tests are not clamping the x, y location of the viewport's bottom-left corner as recommended by ARB_viewport_array and OES_viewport_array: "The location of the viewport's bottom-left corner, given by (x,y), are clamped to be within the implementation-dependent viewport bounds range. The viewport bounds range [min, max] tuple may be determined by calling GetFloatv with the symbolic constant VIEWPORT_BOUNDS_RANGE_OES" Khronos CTS merge request to fix the test case: https://gitlab.khronos.org/opengl/cts/merge_requests/399 V2: Initialize the relevant variables for GL_OES_viewport_array on gen8+ Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	239ff64173	mesa: Add a check for OES_viewport_array Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	0a7691ee62	mesa: Enable enums for OES_viewport_array Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2c7e1165fa	anv/gen7_pipeline: Use MSDISPMODE_PERSAMPLE for non-multisampled fbo Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Anuj Phogat	f75a93f610	anv/blorp: Handle zero width/height blits in blorp_copy() V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-04 13:20:34 -07:00
Anuj Phogat	2c78b2ec90	intel/isl: Add an assert to check zero width/height surface Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 13:20:34 -07:00
Leo Liu	0e85ff3355	st/omx/dec/h265: add scaling list data Specified by subclause 7.3.4 v2: get the loop optimized Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-10-04 11:09:59 -04:00
Leo Liu	ffb863fd2c	st/omx/dec/h265: fix the skip for before and after list For reference picture sets, there are cases that rps will not always be used. Once detect the unused flag from encoded bitstream, we should not add this rps to any list, otherwise pass the incorrect reference and skip the correct rps. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	c50b68e6a8	st/omx/dec/h265: set the default reference picture set for reference It will fix the corruption for frame, that only has one short term ref picture set, we set NULL rps for this case previously, causing taking incorrect reference. Instead we should take that only short term set as reference Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	091aae0265	st/omx/dec/h265: decoder size should follow from sps The video size from format container is not always compatible with the size from codec bitstream, the HW decoder should take the size information from bitstream, otherwise the corruption appears with clip that has different size info between bitstream and format container So we are passing width(height)_in_samples from sequence parameter set to video decoder. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-10-04 11:09:59 -04:00
Leo Liu	2371119db9	st/omx/dec/h265: increase dpb max size to 32 For clip with frame delta poc over 16 Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-10-04 11:09:59 -04:00
Eric Engestrom	66f85c3824	nir/spirv: Remove a duplicate spirv2nir from .gitignore This reverts commit `fc03ecfeaf`. Chad had already pushed the same change between me posting the patch and Jason pushing it: `44bcf1ffcc` (".gitignore: Ignore src/compiler/spirv2nir") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 07:43:15 -07:00
Nicolai Hähnle	8b1f9fd3b3	radeonsi: optionally run the LLVM IR verifier pass This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:33 +02:00
Nicolai Hähnle	1e9476e8c5	gallium/radeon: fix argument type of llvm.{cttz,ctlz}.i32 intrinsics Caught by R600_DEBUG=checkir (next commit). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:28 +02:00
Nicolai Hähnle	1b6fb88ab2	gallium/radeon: unify the creation of basic blocks This changes the order of basic blocks to be equal to the order of code in the original TGSI, which is nice for making sense of shader dumps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:25 +02:00
Nicolai Hähnle	d377f4c1ca	gallium/radeon: merge branch and loop flow control stacks Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:21 +02:00
Nicolai Hähnle	b0d50e157d	gallium/radeon: simplify if/else/endif blocks In particular, we no longer emit an else block when there is no ELSE instruction. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:18 +02:00
Nicolai Hähnle	89e9de2ea6	gallium/radeon: label basic blocks by the corresponding TGSI pc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:15 +02:00
Nicolai Hähnle	6f87d7a146	gallium/radeon: cleanup and fix branch emits Some of the existing code is needlessly complicated. The basic principle should be: control-flow opcodes emit branches to properly terminate the current block, _unless_ the current block already has a terminator (which happens if and only if there was a BRK or CONT). This also fixes a bug where multiple terminators were created in a block. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:39:10 +02:00
Nicolai Hähnle	dfc1afda83	winsys/radeon: add buffer_get_reloc_offset Really fix the bug that was supposed to be fixed by commits `3e7cced4b` and `a48bf02d`: even when virtual addresses are used, the legacy relocation-based method with offsets relative to the kernel's buffer object are used for video submissions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 16:37:44 +02:00
Marek Olšák	71a5cf6f3b	radeonsi: don't declare LDS in PS when ds_bpermute is used I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:16 +02:00
Marek Olšák	b2a694f079	radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interp We can finally do this, because the opcodes are scalar now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:14 +02:00
Marek Olšák	b57aef8033	radeonsi: simplify si_llvm_emit_ddxy si_llvm_emit_ddxy is called once per element, so we don't have to generate code for 4 elements at once. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:12 +02:00
Marek Olšák	046c199c3a	radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:11 +02:00
Marek Olšák	bcc55e1f32	radeonsi: use a helper function for BuildGEP(0, x) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:10 +02:00
Marek Olšák	e20f7142a3	radeonsi: remove obsolete shader definitions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:09 +02:00
Marek Olšák	8c6ea5a6ff	radeonsi: remove unnecessary #includes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:07 +02:00
Marek Olšák	3388f27d84	radeonsi: clean up lucky #include dependencies Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:06 +02:00
Marek Olšák	53d2c8f00f	radeonsi: don't re-create shader PM4 states after scratch buffer update Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:05 +02:00
Marek Olšák	6c01684393	gallium/radeon: move r600_common_context::texture_buffers to r600g Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:03 +02:00
Marek Olšák	7ce19d9014	radeonsi: don't set sampler buffer offsets in create_sampler_view do it at bind time, so that pipe_sampler_view is immutable with regard to buffer reallocations and we don't have to remember all existing buffer views. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:01 +02:00
Marek Olšák	7e6428e0a8	radeonsi: optimize si_invalidate_buffer based on bind_history Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...) Bioshock Infinite: +1% performance Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:00 +02:00
Marek Olšák	e43bd861e8	radeonsi: track buffer bind history similar to gl_buffer_object::UsageHistory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:58 +02:00
Marek Olšák	b523a9ddc5	radeonsi: drop support for NULL sampler views not used anymore. It was used when the polygon stipple texture was constant. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:57 +02:00
Marek Olšák	82e51e8188	radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emission We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:56 +02:00
Marek Olšák	3ee9be42ac	radeonsi: move VGT_LS_HS_CONFIG to derived tess_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:53 +02:00
Marek Olšák	f92113c5a1	radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFER Caches are always flushed at IB boundary. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:51 +02:00
Marek Olšák	ca1d1e0e19	radeonsi: parse SURFACE_SYNC correctly on CIK-VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:49 +02:00
Marek Olšák	37065b0583	gallium/radeon: inline r600_context_add_resource_size Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:47 +02:00
James Legg	e33f31d61f	radeonsi: Fix primitive restart when index changes If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-10-04 15:57:37 +02:00
Timothy Arceri	338d3c0b0f	spirv: replace assert() with unreachable() This fixes an uninitialized warning for is_vertex_input. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-04 22:33:51 +11:00
Timothy Arceri	298c2e03d7	intel: use the correct format specifier for printing uint64_t Fixes a bunch of warnings in 32-bit builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-10-04 22:32:57 +11:00
Matt Whitlock	42ed8a6c9c	gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:09:03 +02:00
Matt Whitlock	ac6064f918	st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:09:01 +02:00
Matt Whitlock	0c060f691c	st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:58 +02:00
Matt Whitlock	5d0069eca2	gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:55 +02:00
Matt Whitlock	c8fd7d060d	egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC) Without this fix, duplicated file descriptors leak into child processes. See commit `aaac913e90` for one instance where the same fix was employed. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Matt Whitlock <freedesktop@mattwhitlock.name> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-04 11:08:50 +02:00
Tapani Pälli	387e0af0b4	intel: fix compilation warning on gen_get_device_info (warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-10-04 07:38:45 +03:00
Kenneth Graunke	9d6ca7c3d0	i965: Only emit 1 viewport when possible. In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensive workloads such as Glamor. This meant that switching to core profile actually penalized Glamor to an extent, which is unfortunate. This patch tracks the number of relevant viewports, switching between 1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting viewport state when switching, but hopefully this is offset by doing 1/16th of the work in the common case. The new flag is also lighter weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case. According to Eric Anholt, x11perf -copypixwin10 performance improves by 11.5094% +/- 3.10841% (n=10) on his Skylake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-03 18:41:10 -07:00
Dave Airlie	7eb7684818	spirv: translate cull distance semantic. This just translates to the correct cull distance slot. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-04 10:16:23 +10:00
Dave Airlie	bd0157d542	compiler: add printable values for cull distance varyings. We need these for spir-v/nir shaders. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-10-04 10:15:23 +10:00
Jason Ekstrand	6ffbfc760d	nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocks Previously, we were saving off the last nir_block in a vtn_block before moving on so that we could find the nir_block again when it came time to handle phi sources. Unfortunately, NIR's control flow modification code is inconsistent when it comes to how it splits blocks so the block pointer we saved off may point to a block somewhere else in the shader by the time we get around to handling phi sources. In order to get around this, we insert a nop instruction and use that as the logical end of our block. Since the control flow manipulation code respects instructions, the nop will keep its place like any other instruction and we can easily find the end of our block when we need it. This fixes a bug triggered by a couple of vkQuake shaders. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 16:17:12 -07:00
Jason Ekstrand	7697b4b98b	nir: Add a nop intrinsic This intrinsic has no destination, no sources, no variables, and can be eliminated. In other words, it does nothing and will always get deleted by dead code elimination. However, it does provide a quick-and-easy way to temporarily tag a particular location in a NIR shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-03 16:17:12 -07:00
Jason Ekstrand	0176c6a692	intel/isl: Allow non-2D HiZ surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	4e397c6c75	intel/isl: Add a detailed comment about multisampling with HiZ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	c3bd711411	intel/isl: Remove tiling checks from choose_msaa_layout We already do those checks in filter_tiling. There's no good reason to repeat them in choose_msaa_layout. If anything they should have been asserts and not "return false" checks. Also, this check was causing us to outright reject multisampled HiZ surfaces which wasn't intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	69d3bb9915	intel/isl: Handle HiZ and CCS tiling more directly The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces respectively. There's no reason why we should go through filter_tiling and it's much easier to always get HiZ and CCS right if we just handle them directly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	b1311a48e0	intel/isl: Allow multisampling with ISL_FORMAT_HiZ HiZ buffers can be multisampled and, on Broadwell and earlier, simply using interleaved multisampling with a compression block size of 8x4 samples yields the correct HiZ surface size calculations. Unfortunately, choose_msaa_layout was rejecting multisampled HiZ buffers because of format checks. Now that we have a simple helper for determining if a format supports multisampling, that's an easy enough issue to fix. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	baade41a5c	intel/isl: Allow creation of 1-D compressed textures Compressed 1-D textures are not a well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	f82166578f	intel/isl: Fix up asserts in calc_phys_level0_extent_sa The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling which is a bit more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Jason Ekstrand	5637f3f120	intel/isl: Add a format_supports_multisampling helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-03 14:53:01 -07:00
Nayan Deshmukh	b7a0f2e1f7	vl/dri3: fix warning about incompatible pointer type Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-10-03 12:51:30 -04:00
Bruce Cherniak	903d00cd32	swr: Removed stalling SwrWaitForIdle from queries. Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-10-03 09:57:45 -05:00
Tim Rowley	cdac042733	swr: [rasterizer core] refactor thread creation Create worker pool now computes number of worker threads based on things like topologies, etc. and creates the pool but doesn't actually launch the threads. Instead there is a separate start thread pool function. This allows thread resources to be constructed first before threads start. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:38 -05:00
Tim Rowley	114f7a92c6	swr: [rasterizer jitter] canonicalize blend compile state Canonicalize to prevent unnecessary JIT compiles. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:31 -05:00
Tim Rowley	4198520a82	swr: [rasterizer core] archrast fixes - Immediately sleep threads until thread data is initialized - Fix some compile bugs with AR enabled Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:25 -05:00
Tim Rowley	aaeb07989e	swr: [rasterizer jitter] fixes for icc in vs2015 compat mode - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:19 -05:00
Tim Rowley	b8a6f06c85	swr: [rasterizer core] generalize compute dispatch mechanism Generalize compute dispatch mechanism to support other types of dispatches. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:57:13 -05:00
Tim Rowley	33a1a09eb0	swr: [rasterizer common] os.h portability header changes - Fix conflict between windows MemoryFence and llvm::sys::MemoryFence - Declare gettid() Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-10-03 09:56:47 -05:00
Ville Syrjälä	2fef0d108a	anv/formats: Fix build on gcc-4 and earlier gcc-4 and earlier don't allow compound literals where a constant is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE() when populating the anv_formats[] array. There are a few ways around it: First one would be -std=c89/gnu89, but the rest of the code depends on c99 so it's not really an option. The second option would be to upgrade to gcc-5+ where the compiler behaviour was relaxed a bit [1]. And the third option is just to avoid using compound literals. I chose the last option since it keeps gcc-4 and earlier working. [1] https://gcc.gnu.org/gcc-5/porting_to.html Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Topi Pohjolainen <topi.pohjolainen@intel.com> Fixes: `7ddb21708c` ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-03 15:45:28 +03:00
Tapani Pälli	4d6d55deef	egl: stop claiming support for pbuffer + msaa This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test and same crash in many dEQP EGL tests. I also found that some Qt example did a workaround because of this crash: https://bugreports.qt.io/browse/QTBUG-47509 v2: Ian pointed out that v1 removed support for all multisample configs, including window ones. This one removes pbuffer bit when adding configs, now only pbuffer+msaa gets rejected and window+msaa continues to work. Fixed also comment (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-03 07:56:44 +03:00
Timothy Arceri	eaf147cb46	i965: rename max_ds_* variable to max_tes_* Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 15:29:58 +11:00
Timothy Arceri	b67633ce5e	i965: rename max_hs_* variables to max_tcs_* Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-03 15:29:51 +11:00
Kenneth Graunke	da274ba5f8	i965: Drop pointless stage == MESA_SHADER_FRAGMENT checks. There's an assert right above this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-02 14:49:20 -07:00
Timothy Arceri	024c207319	glsl: add missing headers to blob.h Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-10-02 13:48:06 +11:00
Jason Ekstrand	ef3c5ac7fb	nir/spirv/cfg: Detect switch_break after loop_break/continue While the current CFG code is valid in the case where a switch break also happens to be a loop continue, it's a bit suboptimal. Since hardware is capable of handling the continue as a direct jump, it's better to use a continue instruction when we can than to bother with all of the nasty switch break lowering. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-01 15:40:34 -07:00
Jason Ekstrand	4d02faede5	nir/spirv/cfg: Handle switches whose break block is a loop continue It is possible that the break block of a switch is actually the continue of the loop containing the switch. In this case, we need to identify the break block as a continue and break out of current level of CFG handling. If we don't, the continue portion of the loop will get handled twice, once by following after the break and a second time by the loop handling code handling it explicitly. This fixes 6 of the new Vulkan CTS tests: - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order* - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-10-01 15:40:14 -07:00
Eric Engestrom	fc03ecfeaf	nir/spirv: add spirv2nir binary to .gitignore Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:48 -07:00
Eric Engestrom	c867938044	nir/spirv: improve mmap() error handling Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:46 -07:00
Eric Engestrom	65c8cbe89d	nir/spirv: improve lseek() error handling Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:44 -07:00
Eric Engestrom	23519a9de2	nir/spirv: add some error checking to open() CovID: 1373369 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-01 15:27:31 -07:00
Timothy Arceri	913e0296f2	mesa: use uint32_t rather than unsigned for xfb struct members These structs will be written to disk as part of the shader cache so use uint32_t just to be safe. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-10-01 11:26:25 +10:00
Timothy Arceri	7064f8674a	i915/i965: remove commented out warning The warning was also in the wrong location, it should have been in the else. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-10-01 09:24:33 +10:00
Brian Paul	951bf44a56	mesa: move _mesa_valid_to_render() to api_validate.c Almost all of the other drawing validation code is in api_validate.c so put this function there as well. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-30 16:28:00 -06:00
Steven Toth	e99b9395be	gallium/hud: Add support for CPU frequency monitoring Detect all of the CPUs in the system. Expose metrics for min, max and current frequency in Hz. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 15:18:46 -06:00
Marek Olšák	7b87190d2b	Revert "gallium/hud: automatically print % if max_value == 100" This reverts commit `dbfeb0ec12`. With max_value being rounded to 100, it's often wrong. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-30 22:07:12 +02:00
Brian Paul	1d07552ba5	docs: update the list of Mesa major versions and API support Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-30 09:17:33 -06:00
Nicolai Hähnle	7bac5bf032	gallium/radeon: fix crash/regression in performance counters Regression introduced by "gallium/radeon: zero all query buffers". Cc: Michel Dänzer <michel@daenzer.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:41:45 +02:00
Nicolai Hähnle	cfd870de70	gallium/radeon: update documentation of buffer_get_virtual_address Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:41:41 +02:00
Nicolai Hähnle	fd9f54223d	gallium/radeon: emit relocations for query fences This is only needed for r600 which doesn't have ARB_query_buffer_object and therefore wouldn't really need the fences, but let's be optimistic about filling in this feature gap eventually. Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-30 12:38:57 +02:00
Nicolai Hähnle	3e7cced4b9	radeon/uvd: adjust the buffer offset when relocation is used We don't plan to use sub-allocated buffers with UVD, but just in case one slips through, this increases the chances of things working out anyway. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:52 +02:00
Nicolai Hähnle	a48bf02d05	radeon/vce: adjust the buffer offset when relocation is used We don't plan to use sub-allocated buffers with VCE, but just in case one slips through, this increases the chances of things working out anyway. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:48 +02:00
Nicolai Hähnle	13cb41f666	radeon/video: don't use sub-allocated buffers Cc: Christian König <christian.koenig@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-30 12:38:29 +02:00
Steven Toth	1d466b9b04	gallium/hud: Add power sensor support Implement support for power based sensors, reporting units in milli-watts and watts. Also, minor cleanup - change the related if block to a switch. Tested with two different power sensors, including the nouveau 'power1' sensors on a GTX950 card. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-29 17:51:15 -06:00
Samuel Pitoiset	3abe68b828	nv50/ir: teach insnCanLoad() about SHLADD Commutativity is not allowed with SHLADD, but src2 can accept loads. To allow the load propagation pass to do its job, add a special case like for SUCLAMP because src1 is always an immediate. This IMAD to SHLADD optimization helps a bunch of shaders from Tomb Raider, Victor Vran, UE4 demos (+15% perf with Elemental) and Shadow Warrior. GF100/GK104: total instructions in shared programs :2838045 -> 2834712 (-0.12%) total gprs used in shared programs :396684 -> 396386 (-0.08%) total local used in shared programs :34416 -> 34416 (0.00%) local gpr inst bytes helped 0 326 1105 1105 hurt 0 55 3 3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:50 +02:00
Samuel Pitoiset	115c79be10	nv50/ir: optimize SHLADD(a, b, c) to MOV((a << b) + c) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:47 +02:00
Samuel Pitoiset	2e008be9a9	nv50/ir: optimize SHLADD(a, b, 0x0) to SHL(a, b) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:44 +02:00
Samuel Pitoiset	e4eb0fca02	nv50/ir: optimize IMAD to SHLADD in presence of power of 2 We can replace IMAD by SHLADD only if src1 is a power of 2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:41 +02:00
Samuel Pitoiset	31545b64b8	nvc0/ir: add emission for SHLADD Unfortunately, we can't use the emit helpers for GF100/GK110 because src1 and src2 are swapped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:36 +02:00
Samuel Pitoiset	85132c7453	nv50/ir: add preliminary support for SHLADD This instruction has been available since SM20 (Fermi) and computes (a << b) + c in one shot. In some situations, IMAD should be replaced by SHLADD when b is a power of 2, and ADD+SHL should be replaced by SHLADD as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 21:20:30 +02:00
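The SHLADD commits above rely on a simple identity: multiplying by a power of two equals shifting left by its exponent, so `a * b + c` can be rewritten as `(a << log2(b)) + c`. The following is a minimal host-side sketch of that identity, not the nv50/ir code itself; the helper names `is_pow2`, `log2_pow2`, `imad` and `shladd` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True when x has exactly one bit set. */
static int is_pow2(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Exponent of a power of two, e.g. 8 -> 3 (hypothetical helper). */
static unsigned log2_pow2(uint32_t x)
{
    unsigned n = 0;

    assert(is_pow2(x));
    while (x >>= 1)
        n++;
    return n;
}

/* Reference semantics of the integer multiply-add being replaced. */
static uint32_t imad(uint32_t a, uint32_t b, uint32_t c)
{
    return a * b + c;
}

/* Semantics the commit message gives for SHLADD: (a << shift) + c. */
static uint32_t shladd(uint32_t a, unsigned shift, uint32_t c)
{
    assert(shift < 32);
    return (a << shift) + c;
}
```

With unsigned 32-bit arithmetic both forms wrap identically, which is why the rewrite is safe whenever the multiplier is a power of two. With `c == 0` the result is a plain shift, matching the SHLADD(a, b, 0x0) to SHL(a, b) commit.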
Samuel Pitoiset	652874754a	nvc0: update GM107 sched control codes format envyas now uses a much better representation for those control codes and it displays the different flags instead of an unreadable hex number. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-29 20:13:05 +02:00
Nicolai Hähnle	e4b585f009	gallium/radeon: use smaller buffers for query results Most of the time, even the 512 bytes that we now get is more than sufficient (pipeline stats queries are the largest at 184 bytes per shot). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:24:56 +02:00
Nicolai Hähnle	de84e99e45	gallium/radeon/winsyses: add radeon_winsys::min_alloc_size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:24:52 +02:00
Nicolai Hähnle	7a0e543836	radeonsi: enable ARB_query_buffer_object (v2) v2: enable only when compute is available Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:15:00 +02:00
Nicolai Hähnle	15e2661137	gallium/radeon: implement get_query_result_resource (v2) v2: fix a comment (Gustaw Smolarczyk) Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:54 +02:00
Nicolai Hähnle	2c9d546402	gallium/radeon: zero all query buffers To ensure that fences are properly initialized. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:51 +02:00
Nicolai Hähnle	daeab0171d	gallium/radeon: cleanup getting PIPE_QUERY_TIMESTAMP result Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:45 +02:00
Nicolai Hähnle	631c47384c	gallium/radeon: add query fences and r600_get_hw_query_params We will support the waiting option in ARB_query_buffer_object using WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently write their results with the highest bit set, and we can just use that; for others, we have to write a fence explicitly. ZPASS_DONE for occlusion queries writes its results with the high bit set, but it writes up to 8 pairs of results (one for each DB). We have to wait for all of these results, so let's just add an explicit fence. The new function provides summary information to be used by subsequent patches. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:41 +02:00
Nicolai Hähnle	51b57a9b5a	radeonsi: add save_qbo_state Save compute shader state that will be used for the ARB_query_buffer_object implementation. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:37 +02:00
Nicolai Hähnle	70f9ca2468	radeonsi: add si_get_shader_buffers/get_pipe_constant_buffers (v2) These functions extract the pipe state structure from the current descriptors, for state saving. v2: correctly dereference *buf (Bas) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:33 +02:00
Nicolai Hähnle	8d45243e40	gallium/radeon: add r600_gfx_{write,wait}_fence For bottom-of-pipe fences inside the gfx command stream. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:29 +02:00
Nicolai Hähnle	8e4de00930	gallium/radeon: add barrier_flags to r600_common_screen There are driver-specific context flags for barriers that are not covered by the Gallium barrier interfaces. The R600 settings of these flags may not be optimal, but we're not going to use them yet anyway. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-29 11:14:11 +02:00
Timothy Arceri	577e06095b	glsl: remove remaining tabs from ast_type.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	222f66a812	glsl: remove remaining tabs from ast_to_hir.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	fc1d200bc7	glsl: remove remaining tabs from ast_array_index.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	b193c4d75b	glsl: remove tabs from ast_expr.cpp Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Timothy Arceri	386045a3df	glsl: remove tabs from linker.{cpp,h} Acked-by: Dave Airlie <airlied@redhat.com>	2016-09-29 11:06:12 +10:00
Steven Toth	8c60bcb4c3	gallium/hud: Add support for block I/O, network I/O and lmsensor stats V8: Feedback based on peer review convert if block into a switch Constify some func args V7: Increase precision when measuring lmsensors volts Flatten patch series. V6: Feedback based on peer review Simplify sensor initialization (arg passing). Constify some func args V5: Feedback based on peer review Convert sprintf to snprintf Convert char * to const char * int arg converted to bool Func changes to take a filename vs a larger struct. Omit the space between '*' and the param name. V4: Merged with master as of 2016/9/27 6pm V3: Flatten the entire patchset ready for the ML V2: Additional separate patches based on feedback a) configure.ac: Add a comment related to libsensors b) HUD: Disable Block/NIC I/O stats by default. Implement configuration option --enable-gallium-extra-hud=yes and enable both statistics when this option is enabled. c) Configure.ac: Minor cleanup to user visible configuration settings d) Configure.ac: HUD stats - build system improvements Move the -lsensors out of a deeper Makefile, bring it into the configure.ac. Also, rename a compiler directive to more closely follow the standard. V1: Initial release to the ML Three new features: 1. Disk/block I/O device read/write stats MB/ps. 2. Network Interface RX/TX transfer statistics as a percentage of the overall NIC speed. 3. lmsensor power, voltage and temperature sensors. The lmsensor changes introduce a dependency on libsensors so support for the change is opt out by default. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-28 16:18:05 -06:00
Ben Widawsky	29783c0887	i965: Remove useless (harmful) assertion The code already skips doing the depth stall on gen >= 8, and as we enable new platforms this assertion will fail needlessly. Instead of changing the caller, make this simple change. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-28 09:42:53 -07:00
Eric Anholt	2a721b1b79	vc4: Emit perf debug when we fall back to quad clears.	2016-09-28 08:31:14 -07:00
Eric Anholt	1aa8a0392f	nir: Optimize out discard_ifs with a constant 0 argument. I found this in a shader that was doing an alpha test when alpha is fixed at 1.0. v2: Rebase on master (now the const value is "u32" not "u"). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-09-28 08:31:14 -07:00
Michel Dänzer	8d8c440ebf	gallium/radeon: Initialize pipe_resource::next to NULL Fixes lots of piglit tests crashing due to using uninitialized memory. Fixes: `ecd6fce261` ("mesa/st: support lowering multi-planar YUV") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-28 10:39:22 +09:00
Timothy Arceri	3eb0baeecf	glsl: don't crash when dumping shaders if some come from cache Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-28 10:43:15 +10:00
Timothy Arceri	87ab26b2ab	glsl: Add initial functions to implement an on-disk cache This code provides for an on-disk cache of objects. Objects are stored and retrieved via names that are arbitrary 20-byte sequences, (intended to be SHA-1 hashes of something identifying for the content). The directory used for the cache can be specified by means of environment variables in the following priority order: $MESA_GLSL_CACHE_DIR $XDG_CACHE_HOME/mesa <user-home-directory>/.cache/mesa By default the cache will be limited to a maximum size of 1GB. The environment variable: $MESA_GLSL_CACHE_MAX_SIZE can be set (at the time of GL context creation) to choose some other size. This variable is a number that can optionally be followed by 'K', 'M', or 'G' to select a size in kilobytes, megabytes, or gigabytes. By default, an unadorned value will be interpreted as gigabytes. The cache will be entirely disabled at runtime if the variable MESA_GLSL_CACHE_DISABLE is set at the time of GL context creation. Many thanks to Kristian Høgsberg <krh@bitplanet.net> for the initial implementation of code that led to this patch. In particular, the idea of using an mmapped file, (indexed by a portion of the SHA-1), for the efficient implementation of cache_has_key was entirely his. Kristian also provided some very helpful advice in discussions regarding various race conditions to be avoided in this code. Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-28 09:16:31 +10:00
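The directory priority order and the size-suffix rule described in the cache commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the function names `pick_cache_dir` and `parse_max_size` are hypothetical, the units are assumed to be binary (1K = 1024 bytes), and falling back to the 1 GB default on unparsable input is an assumption as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Choose the cache directory in the priority order from the commit message.
 * Each argument is the value of the corresponding variable, or NULL when it
 * is unset. Returns 0 on success, -1 if nothing is usable or out is too small. */
static int pick_cache_dir(const char *mesa_glsl_cache_dir,
                          const char *xdg_cache_home,
                          const char *home,
                          char *out, size_t out_len)
{
    int n;

    assert(out != NULL && out_len > 0);
    if (mesa_glsl_cache_dir)
        n = snprintf(out, out_len, "%s", mesa_glsl_cache_dir);
    else if (xdg_cache_home)
        n = snprintf(out, out_len, "%s/mesa", xdg_cache_home);
    else if (home)
        n = snprintf(out, out_len, "%s/.cache/mesa", home);
    else
        return -1;
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}

/* Parse "<number>[K|M|G]" into bytes. A bare number means gigabytes and an
 * unset variable (NULL) means the 1 GB default. */
static uint64_t parse_max_size(const char *s)
{
    const uint64_t default_size = (uint64_t)1 << 30;
    char *end;
    unsigned long long n;

    if (!s)
        return default_size;
    n = strtoull(s, &end, 10);
    if (end == s)
        return default_size;        /* assumption: ignore unparsable input */
    switch (*end) {
    case 'K':
        return (uint64_t)n << 10;
    case 'M':
        return (uint64_t)n << 20;
    case 'G':
    case '\0':
        return (uint64_t)n << 30;
    default:
        return default_size;        /* assumption: unknown suffix */
    }
}
```

A caller would pass `getenv("MESA_GLSL_CACHE_DIR")`, `getenv("XDG_CACHE_HOME")` and the user's home directory to `pick_cache_dir`, and `getenv("MESA_GLSL_CACHE_MAX_SIZE")` to `parse_max_size`.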
Chad Versace	44bcf1ffcc	.gitignore: Ignore src/compiler/spirv2nir	2016-09-27 13:22:44 -07:00
Ian Romanick	ea6ed2379d	glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept At this point in the code, s must be visit_continue. If the child returned visit_stop, visit_stop is the only correct thing to return. Found by inspection. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	7f64041cee	glsl: Add bit_xor builder Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	5f7f7d582b	glsl/standalone: Enable GLSL 4.00 through 4.50 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	798d1b8816	glsl/standalone: Use API_OPENGL_CORE if the GLSL version is >= 1.40 Otherwise extensions to 1.40 that are only for core profile won't work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Ian Romanick	afd99734db	glsl: Update function parameter documentation for do_common_optimization max_unroll_iterations was moved into options a long, long time ago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 12:06:46 -07:00
Tim Rowley	bacdd9ef4c	configure.ac: add llvm inteljitevents component if enabled Needed to successfully link llvmpipe or swr when using shared llvm libs built with inteljitevents enabled. v2: Make adding inteljitevents component global rather than just llvmpipe/swr, since libgallium will have a symbol dependency. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 12:56:47 -05:00
Tim Rowley	50842e8a93	swr: replace gallium->swr format enum conversion Replace old string comparison with a mapping table. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-09-27 12:55:26 -05:00
Nicolai Hähnle	4421c0fb0d	gallium/radeon/winsyses: reduce the number of pb_cache buckets Small buffers are now handled via the slabs code, so separate buckets in pb_cache have become redundant. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:41 +02:00
Nicolai Hähnle	fb827c055c	winsys/radeon: enable buffer allocation from slabs Only enable for chips with GPUVM, because older driver paths do not take the required offset into account. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:37 +02:00
Nicolai Hähnle	a1e391e39d	winsys/radeon: add fine-grained fences for slab buffers Note the logic for adding fences is somewhat different than for amdgpu, because radeon has no scheduler and we therefore have no guarantee about the order in which submissions from multiple threads are processed. (Ironically, this is only an issue when "multi-threaded submission" is disabled, because "multi-threaded submission" actually means that all submissions happen from a single thread that happens to be separate from the application's threads. If we only supported "multi-threaded submission", the fence handling could be simplified by adding the fences in that thread where everything is serialized.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:34 +02:00
Nicolai Hähnle	0edebde9a4	winsys/radeon: add slab buffer list Introducing radeon_bo::hash will reduce collisions between "real" buffers and buffers from slabs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:32 +02:00
Nicolai Hähnle	cbb9c2f170	winsys/radeon: separate adding a buffer from updating its reloc data Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:29 +02:00
Nicolai Hähnle	a9e8672585	winsys/radeon: add slab entry structures to radeon_bo Already adjust the map/unmap logic accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:25 +02:00
Nicolai Hähnle	ffa1c669dd	winsys/amdgpu: enable buffer allocation from slabs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:23 +02:00
Nicolai Hähnle	a3832590c6	winsys/amdgpu: add fence and buffer list logic for slab allocated buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:20 +02:00
Nicolai Hähnle	a987e4377a	winsys/amdgpu: add slab entry structures to amdgpu_winsys_bo Already adjust amdgpu_bo_map/unmap accordingly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:15 +02:00
Nicolai Hähnle	5af9eef719	winsys/amdgpu: do not synchronize unsynchronized buffers When a buffer is added to a CS without the SYNCHRONIZED usage flag, we now no longer add a dependency on the buffer's fence(s). However, we still need to add a fence to the buffer during flush, so that cache reclaim works correctly (and in the hypothetical case that the buffer is later added to a CS _with_ the SYNCHRONIZED flag). It is now possible that the submissions referring to a buffer are no longer linearly ordered, and so we may have to keep multiple fences around. We keep the fences in a FIFO. It should usually stay quite short (# of contexts * 2, for gfx + dma rings). While we're at it, extract amdgpu_add_fence_dependency for a single buffer, which will make adding the distinction between real buffer and slab cases easier. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:11 +02:00
Nicolai Hähnle	6d89a40676	gallium/radeon: add RADEON_FLAG_HANDLE When passed to winsys->buffer_create, this flag will indicate that we require a buffer that maps 1:1 with a kernel buffer handle. This is currently set for all textures, since textures can potentially be exported to other processes. This is not a huge loss, since the main purpose of this patch series is to deal with applications that allocate many small buffers. A hypothetical application with tons of tiny textures might still benefit from not setting this flag, but that's not a use case I'm worried about just now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:05 +02:00
Nicolai Hähnle	e703f71ebd	gallium/radeon: add RADEON_USAGE_SYNCHRONIZED This is really the behavior we want most of the time, but having a SYNCHRONIZED flag instead of an UNSYNCHRONIZED one has the advantage that OR'ing different flags together always results in stronger guarantees. The parent BOs of sub-allocated buffers will be added unsynchronized. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:02 +02:00
Nicolai Hähnle	84f156c0cb	gallium/pipebuffer: add pb_slab utility This is a simple framework for slab allocation from buffers that fits into the buffer management scheme of the radeon and amdgpu winsyses where bufmgrs aren't used. The utility knows about different sized allocations and explicitly manages reclaim of allocations that have pending fences. It manages all the free lists but does not actually touch buffer objects directly, relying on callbacks for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:42 +02:00
Nicolai Hähnle	b3ebc229dc	gallium/u_math: add util_logbase2_ceil For finding the exponent of the next power of two. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:44:38 +02:00
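"The exponent of the next power of two" means the smallest `e` such that `2^e >= n`. A minimal sketch of that definition follows; it uses a simple loop rather than whatever bit trick the real helper uses, the name `logbase2_ceil` is hypothetical, and rejecting `n == 0` is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest e with (1 << e) >= n, computed in 64 bits so that inputs
 * above 2^31 do not overflow the comparison. */
static unsigned logbase2_ceil(uint32_t n)
{
    unsigned e = 0;

    assert(n != 0);   /* assumption: zero is treated as invalid input here */
    while (((uint64_t)1 << e) < n)
        e++;
    return e;
}
```

A slab allocator can use such a value to pick the bucket for an allocation, since every size in `(2^(e-1), 2^e]` maps to the same exponent.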
Nicholas Bishop	c060f291c2	i915g: add dma-buf support to i915_drm_buffer_get_handle The implementation of i915_drm_buffer_get_handle now handles DRM_API_HANDLE_TYPE_FD in the same way that intel_winsys_import_handle does, by calling drm_intel_bo_gem_create_from_prime. Tested by successfully running Chrome's ozone_demo [1] with the ozone-gbm backend on an Intel Pineview M machine. Without this change it fails while trying to create a DMA-BUF. [1] https://chromium.googlesource.com/chromium/src.git/+/master/ui/ozone/demo/ozone_demo.cc Signed-off-by: Nicholas Bishop <nbishop@neverware.com> [Emil Velikov: Fix coding style] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Nicholas Bishop	aa560e8e63	st/dri: check pipe_screen->resource_get_handle() return value Change dri2_query_image to check the return value of resource_get_handle and return GL_FALSE if an error occurs. For reference this is an example callstack that should propagate the error back to the user: i915_drm_buffer_get_handle i915_texture_get_handle u_resource_get_handle_vtbl dri2_query_image gbm_dri_bo_get_fd gbm_bo_get_fd Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Nicholas Bishop	2d05ba2ca0	gbm: return appropriate error when queryImage() fails Change gbm_dri_bo_get_fd to check the return value of queryImage and return -1 (an invalid file descriptor) if an error occurs. Update the comment for gbm_bo_get_fd to return -1, since (apart from the above) we already return -1 on error. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nicholas Bishop <nbishop@neverware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) [Emil Velikov: Split from larger patch, polish coding style, cc stable] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-27 13:37:21 +01:00
Andy Furniss	a599302227	st/va Avoid VBR bitrate calculation overflow v2 VBR bitrate calc needs 64 bits at high rates. v2: use float. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-27 14:21:45 +02:00
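The overflow mentioned in the VBR commit above is the usual 32-bit intermediate problem: multiplying a bitrate by a percentage before dividing by 100 can exceed 2^32 at high rates. The sketch below illustrates this under assumptions; the exact expression in st/va is not quoted in the log, and the names `target_bitrate_u32` and `target_bitrate_float` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Naive form: the product is computed in 32 bits and wraps for
 * bits_per_second above roughly 42.9 Mbit/s at 100 percent. */
static uint32_t target_bitrate_u32(uint32_t bits_per_second, uint32_t percent)
{
    return bits_per_second * percent / 100;
}

/* Floating-point form in the spirit of "v2: use float": scale by the
 * fraction first so no oversized intermediate is produced. */
static uint32_t target_bitrate_float(uint32_t bits_per_second, uint32_t percent)
{
    assert(percent <= 100);
    return (uint32_t)(bits_per_second * (percent / 100.0));
}
```

For 100 Mbit/s at 50 percent the naive form wraps, while the floating-point form yields the expected 50 Mbit/s. Widening the product to 64 bits would be an alternative fix.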
Mark Thompson	a543f231d7	st/va: Fix vaSyncSurface with no outstanding operation Fixes crash if the application doesn't do what the state tracker expects. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-27 14:21:44 +02:00
Timothy Arceri	df920367bf	glsl: remove remaining tabs in glsl_parser_extras.h Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-27 20:32:47 +10:00
Ilia Mirkin	477cc0e085	st/mesa: enable ARB_ES3_2_compatibility when enough available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 00:20:44 -04:00
Ilia Mirkin	67fbaa5873	st/mesa: enable GL_ANDROID_extension_pack_es31a when available For now that's never since advanced blend hasn't been piped through. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 00:20:41 -04:00
Timothy Arceri	63e8221574	glsl: move some uniform linking code to new link_assign_uniform_storage() This makes link_assign_uniform_locations() easier to follow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:29:05 +10:00
Timothy Arceri	ab67b6afdf	glsl: move some uniform linking code to new link_setup_uniform_remap_tables() This makes link_assign_uniform_locations() easier to follow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:29:05 +10:00
Timothy Arceri	856e0bd707	i965: create populate key functions for tcs and tes These will be used by the on disk shader cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Timothy Arceri	ec75570415	i965: make gs key generation helper available to shader cache Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Timothy Arceri	481d8ec291	glsl: use reproducible name for lowered const arrays Otherwise we can end up with mismatching names between the cached binary and the cached metadata. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-27 11:11:15 +10:00
Carl Worth	017081a3e5	i965: make vs and fs key generation helpers available to shader cache Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>	2016-09-27 11:11:15 +10:00
Carl Worth	f61669f997	glsl: Prepare standalone compiler to be able to use parameter lists As part of the shader-cache work an upcoming change will add new references to _mesa_add_parameter and _mesa_new_parameter_list from the glsl code. To prepare for that, and to allow the standalone glsl_compiler to still link, here we add mesa/program/prog_parameter.c to the libglsl_util sources. Then, in order to get that to work, we also add two stubs to standalone_scaffolding: _mesa_program_state_flags _mesa_program_state_string These functions aren't actually used by the two functions in prog_parameter.c that we are actually calling. They are used in other functions in the same file. So we don't care what the implementation of these stubs is, (they won't be called by glsl_compiler). We just need the stubs present so that it can link. Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-27 11:11:15 +10:00
Samuel Pitoiset	f24b517858	nv50/ir: fix comments about instructions info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 21:59:37 +02:00
Rob Clark	ecd6fce261	mesa/st: support lowering multi-planar YUV Support multi-planar YUV for external EGLImage's (currently just in the dma-buf import path) by lowering to multiple texture fetch's for each plane and CSC in shader. There was some discussion of alternative approaches for tracking the additional UV or U/V planes: https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html They all seemed worse than pipe_resource::next Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Rob Clark	e0ec1c3134	mesa/st: add nir pass to lower tex_src_plane Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Rob Clark	c2a60cacd4	mesa/st: add lowering pass for YUV samplers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-26 15:29:17 -04:00
Sirisha Gandikota	8e3e9d74b5	aubinator: Fix the decoding of values that span two Dwords Fixed the way the values that span two Dwords are decoded. Based on the start and end indices of the field, the Dwords are fetched and decoded accordingly. v2: rename dw to qw in gen_field_iterator_next and remove extra white space (Anuj) v3: change all instances of dw to qw (Anuj) Earlier, 64-bit fields (such as most pointers on Gen8+) weren't decoded correctly. gen_field_iterator_next seemed to walk one DWord at a time, sets v.dw, and then passes it to field(). So, even though field() takes a uint64_t, we're passing it a uint32_t (which gets promoted, so the top 32 bits will always be zero). This seems pretty bogus... (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-26 11:18:52 -07:00
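The aubinator fix above amounts to assembling a 64-bit word from two consecutive Dwords before masking out the field. A minimal sketch of that idea, assuming inclusive bit indices counted from bit 0 of the first Dword; `extract_field` is a hypothetical name, not the aubinator function.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [start, end] (inclusive) from a dword array. The field may
 * cross one dword boundary, so it must fit in two consecutive dwords. */
static uint64_t extract_field(const uint32_t *dwords, unsigned start, unsigned end)
{
    unsigned idx = start / 32;          /* dword holding the first bit */
    unsigned lo = start % 32;           /* bit offset inside that dword */
    unsigned width = end - start + 1;
    uint64_t qw, mask;

    assert(end >= start);
    assert(end / 32 <= idx + 1);        /* spans at most two dwords */

    qw = dwords[idx];
    if (end / 32 > idx)                 /* fetch the upper half only if needed */
        qw |= (uint64_t)dwords[idx + 1] << 32;

    mask = (width == 64) ? ~(uint64_t)0 : (((uint64_t)1 << width) - 1);
    return (qw >> lo) & mask;
}
```

Reading only `dwords[idx]` and promoting it to 64 bits, as described in the commit message, would leave the top 32 bits of every pointer-sized field at zero.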
Samuel Pitoiset	ac859d68f4	nvc0: allow to force compiling programs in debug build This adds a new envvar called NV50_PROG_CHIPSET which allows compiling shaders with a different target, especially useful for shader-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 19:39:04 +02:00
Samuel Pitoiset	e05042b367	nv50/ir: drop unused NVISA_XXX_CHIPSET constants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-26 19:39:04 +02:00
Samuel Pitoiset	be0535b8c7	gallium/util: make use of strtol() in debug_get_num_option() This allows using hexadecimal numbers, which are automatically detected by strtol() when the base is 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-09-26 19:39:04 +02:00
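With base 0, the standard `strtol()` picks the radix from the prefix: `0x` selects hexadecimal, a leading `0` selects octal, and anything else is decimal. The sketch below shows a numeric option parser built on that behaviour; `parse_num_option` and `get_num_option` are hypothetical names and not the gallium helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse an option string with automatic base detection. Returns dfault
 * when the string is NULL or does not start with a number. */
static long parse_num_option(const char *str, long dfault)
{
    char *end;
    long value;

    if (!str)
        return dfault;
    value = strtol(str, &end, 0);   /* base 0: accept 0x.., 0.., or decimal */
    if (end == str)
        return dfault;
    return value;
}

/* Read the option from the environment variable called name. */
static long get_num_option(const char *name, long dfault)
{
    assert(name != NULL);
    return parse_num_option(getenv(name), dfault);
}
```

One side effect of base 0 worth keeping in mind is that a value such as `010` is read as octal 8, not decimal 10.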
Glenn Kennard	5da24242b3	r600g: Add support for PK2H/UP2H Based off of Ilia's original patch, but with output values replicated so that it matches the TGSI semantics. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-09-26 17:08:49 +02:00
Timothy Arceri	eb2dc04127	i965: stop passing stage as a function parameter We already pass the shader so we can just get the stage from this. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-09-26 09:59:24 +10:00
Nayan Deshmukh	b3827819aa	aubinator: fix resource leak CovID: 1373370 Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-25 12:32:48 -07:00
Emilio Cobos Álvarez	cb7c2c9d65	osmesa: Unbind the current context when given a null context and buffer. This is needed to be consistent with other drivers. Signed-off-by: Emilio Cobos Álvarez <me@emiliocobos.me> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-23 19:55:50 -06:00
Brian Paul	07d1f8faf9	st/mesa: small optimization in swizzle_swizzle() Usually, there's no user-specified texture swizzle so we can optimize the swizzle_swizzle() function and skip the loop/switch. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	1cdc232e1a	st/mesa: fix swizzle issue in st_create_sampler_view_from_stobj() Some demos, like Heaven, were creating and destroying a large number of sampler views because of a swizzle issue. Basically, we compute the sampler view's swizzle by examining the texture format, user swizzle, depth mode, etc. Later, during validation we recompute that swizzle (in case something like depth mode changes) and see if it matches the view's swizzle. In the case of PIPE_FORMAT_RGTC2_UNORM, get_texture_format_swizzle returned SWIZZLE_XYZW but the u_sampler_view_default_template() function was setting the sampler view's swizzle to SWIZZLE_XY01. This mismatch caused the validation step to always "fail" so we'd destroy the old sampler view and create a new one. By removing the conditional, the sampler view's swizzle and the computed texture swizzle match and validation "passes". When creating a new sampler view, we always want to use the texture swizzle which we just computed. Fixes VMware issue 1733389. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	c0d7b6073d	svga: set PIPE_BIND_DEPTH_STENCIL flag for new resources when possible When we create a depth/stencil texture, also check if we can render to it and set the PIPE_BIND_DEPTH_STENCIL flag. We were previously doing this for color textures (PIPE_BIND_RENDER_TARGET). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	f942a70340	svga: don't special case caps for SVGA3D_R32_FLOAT This may have been needed years ago during development, but not now. Prevents some regressions after introducing the next patch. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	14639cdf8f	svga: use new adjust_z_layer() helper in svga_pipe_blit.c To handle z/layer fix-ups for blitting and copying. Note that we weren't doing this properly in svga_blit() before. Also, remove redundant stex, dtex assignments. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	c42000545d	svga: simplify/improve the format compatibility check for region copies The util_is_format_compatible() function didn't quite do what we wanted for vgpu10. This check is more flexible and allows copies between formats such as R32G32B32A32_FLOAT and R32G32B32A32_INT. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	2ad4ba0727	svga: add const qualifier on svga_translate_format() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Brian Paul	4d04696524	svga: eliminate unneeded gotos in svga_validate_surface_view() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:54:42 -06:00
Neha Bhende	47f16f5e7f	svga: disable srgb format related code from svga_blit() With latest mesa and latest piglit tests, srgb<->linear conversion is not required as per GL4.4 rules. See commit `b662c70aea`. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-23 19:53:51 -06:00
Timothy Arceri	29c174a3e5	Revert "glsl: move xfb BufferStride into gl_transform_feedback_info" This reverts commit `f5a6aab403`. This broke some tests. It seems gl_transform_feedback_info gets memset to 0 so we were losing the values in BufferStride before we used them.	2016-09-24 10:17:26 +10:00
Kenneth Graunke	943b69cddd	glsl: Delete linker stuff relating to built-in functions. Now that we generate built-in functions inline, there's no need to link against the built-in shader, and no built-in prototypes to consider. This lets us delete a bunch of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by; Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	f7a5c714b3	glsl: Delete ftransform support from builtin_functions.cpp. This is now handled directly by ast_function.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by; Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	b04ef3c08a	glsl: Immediately inline built-ins rather than generating calls. In the past, we imported the prototypes of built-in functions, generated calls to those, and waited until link time to resolve the calls and import the actual code for the built-in functions. This severely limited our compile-time optimization opportunities: even trivial functions like dot() were represented as function calls. We also had no way of reasoning about those calls; they could have been 1,000 line functions with side-effects for all we knew. Practically all built-in functions are trivial translations to ir_expression opcodes, so it makes sense to just generate those inline. Since we eventually inline all functions anyway, we may as well just do it for all built-in functions. There's only one snag: built-in functions that refer to built-in global variables need those remapped to the variables in the shader being compiled, rather than the ones in the built-in shader. Currently, ftransform() is the only function matching those criteria, so it seemed easier to just make it a special case. On Skylake: total instructions in shared programs: 12023491 -> 12024010 (0.00%) instructions in affected programs: 77595 -> 78114 (0.67%) helped: 97 HURT: 309 total cycles in shared programs: 137239044 -> 137295498 (0.04%) cycles in affected programs: 16714026 -> 16770480 (0.34%) helped: 4663 HURT: 4923 while these statistics are in the wrong direction, the number of hurt programs is small (309 / 41282 = 0.75%), and I don't think anything can be done about it. A change like this significantly alters the order in which optimizations are performed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by; Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Kenneth Graunke	1617f59bc6	glsl: Check TCS barrier restrictions at ast_to_hir time, not link time. We want to check prior to optimization - otherwise we might fail to detect cases where barrier() is in control flow which is always taken (and therefore gets optimized away). We don't currently loop unroll if there are function calls inside; otherwise we might have a problem detecting barrier() in loops that get unrolled as well. Tapani's switch handling code adds a loop around switch statements, so even with the mess of if ladders, we'll properly reject it. Enforcing these rules at compile time makes more sense than link time. Doing it at ast-to-hir time (rather than as an IR pass) allows us to emit an error message with proper line numbers. (Otherwise, I would have preferred the IR pass...) Fixes spec/arb_tessellation_shader/compiler/barrier-switch-always.tesc. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-23 16:40:40 -07:00
Timothy Arceri	f5a6aab403	glsl: move xfb BufferStride into gl_transform_feedback_info It makes more sense to have this here where we store the other values from xfb qualifiers. The struct it was previously part of is now only used to store values that come from the api. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-24 09:18:29 +10:00
Dylan Baker	85e9bbc14d	Revert "mapi: export all GLES 3.2 functions in libGLESv2.so" This reverts commit `e66a2b879b`, which breaks the scons build in an interesting way, particularly when BlendBarrier and PrimitiveBoundingBox are added to static_data.py's functions list. This seems to be related to the fact that the unsuffixed names are only in GLES3.2, but Desktop GL only has suffixed versions. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2016-09-23 12:13:13 -07:00
Adam Jackson	8ce2afe776	i965: Enable EGL_KHR_gl_texture_3D_image Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-23 06:53:21 -04:00
Adam Jackson	5981366b9f	i915: Enable EGL_KHR_gl_texture_3D_image Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-23 06:53:17 -04:00
Nicolas Koch	f17948a30a	anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer From the Vulkan spec: Size is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-23 00:20:16 -07:00
Lionel Landwerlin	6b21728c4a	anv: get rid of duplicated values from gen_device_info Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:12:06 +03:00
Lionel Landwerlin	94d0e7dc08	i965: get rid of duplicated values from gen_device_info Now that we have gen_device_info mutable, we can update its values and drop all copies we had in brw_context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:12:06 +03:00
Lionel Landwerlin	bc24590f0c	intel/i965: make gen_device_info mutable Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers). This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-23 10:11:59 +03:00
Timothy Arceri	e60928f4c4	gallium: remove unused PIPE_CC_GCC_VERSION Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-23 16:18:21 +10:00
Timothy Arceri	4eb0e90c6b	util: remove Sun C Compiler support Support for this compiler was dropped in `51564f04b7` Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-23 16:17:16 +10:00
Ilia Mirkin	c0a7e931e3	st/mesa: turn on OES_viewport_array when dependencies are met Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	0f01aa8033	mesa: add implementations for new float depth functions This just up-converts them to doubles. Not great, but this is what all the other variants also do. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	381b15dc20	mesa: move ARB_viewport_array params to a GLES 3.1-accessible section This is needed for GL_OES_viewport_array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	5644a90801	mesa: add GL_OES_viewport_array to the extension string The expectation is that drivers will set this based on OES_geometry_shader and ARB_viewport_array support. This is a separate enable on the same reasoning as for OES_texture_cube_map_array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	70aef97f9e	glsl: add OES_viewport_array enables and use them to expose gl_ViewportIndex Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Ilia Mirkin	411a72d4a2	mesa: add new entrypoints for GL_OES_viewport_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-22 20:42:30 -04:00
Dylan Baker	e66a2b879b	mapi: export all GLES 3.2 functions in libGLESv2.so See commit `5921f372c8` for the rationale of this commit. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 16:01:40 -07:00
Dylan Baker	ce83e36ec0	mapi: sort static_data.py functions Sorted by vim's builtin "sort i" (keeping the sorting case insensitive) v2: - uses case insensitive sorting (Ken) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 15:29:27 -07:00
Dylan Baker	2fd51cf8ca	mapi: retab static_data.py to be consistent This file currently uses a mixture of 3 and 4 space indent. I have changed it all to 4 space indent, matching the settings in $ROOT/.editorconfig. This was generated with sed: sed -i -e 's@^ "@ "@g' Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-22 15:28:44 -07:00
Lionel Landwerlin	9adfa695ac	spirv: fix AtomicLoad/Store on images OpAtomicLoad/Store should have pointer to images just like the rest of the atomic operators. These couple of lines were poorly copied from the ssbo/shared_vars cases (the only ones currently tested by the CTS). Fixes `2afb950161` ("spirv/nir: Add support for OpAtomicLoad/Store") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-22 14:08:21 +03:00
Eric Anholt	36f0f03182	nir: Allow opt_peephole_sel to be more aggressive in flattening IFs. VC4 was running into a major performance regression from enabling control flow in the glmark2 conditionals test, because of short if statements containing an ffract. This pass seems like it was trying to ensure that we only flattened IFs that should be entirely a win by guaranteeing that there would be fewer bcsels than there were MOVs otherwise. However, if the number of ALU ops is small, we can avoid the overhead of branching (which itself costs cycles) and still get a win, even if it means moving real instructions out of the THEN/ELSE blocks. For now, just turn on aggressive flattening on vc4. i965 will need some tuning to avoid regressions. It does look like this may be useful to replace freedreno code. Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47 fps to 95 fps on vc4. vc4 shader-db: total instructions in shared programs: 101282 -> 99543 (-1.72%) instructions in affected programs: 17365 -> 15626 (-10.01%) total uniforms in shared programs: 31295 -> 31172 (-0.39%) uniforms in affected programs: 3580 -> 3457 (-3.44%) total estimated cycles in shared programs: 225182 -> 223746 (-0.64%) estimated cycles in affected programs: 26085 -> 24649 (-5.51%) v2: Update shader-db output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2016-09-22 11:10:21 +03:00
Kenneth Graunke	6c648cdac8	docs: Mark ES 3.2 "all done" for i965/gen9+.	2016-09-21 11:52:59 -07:00
Kenneth Graunke	a4fbc73ee8	docs: Add ES 3.2 to release notes.	2016-09-21 11:52:59 -07:00
Brian Paul	b35684543e	gallium/util: add comment on util_is_format_compatible() From reading the code, it's not obvious what is src/dest compatible. The list of a->b copy-compatible formats comes from Jose's original check-in message, with some format name updates. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-21 12:26:17 -06:00
Brian Paul	99d9f764b2	svga: minor simplification in svga_validate_surface_view() Get rid of unneeded local var. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-21 12:23:45 -06:00
Brian Paul	1cc7a76d73	svga: remove disable_shader debug variable Never used, AFAIK. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-21 12:23:45 -06:00
Kenneth Graunke	a53da57d5a	i965: Enable ES 3.2 on Skylake. It's already advertised because the version.c extension checks are fulfilled, but we didn't actually claim support, so trying to create a ES 3.2 context would fail. It's all done, and the CTS results look good, so let's turn it on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-09-21 10:51:58 -07:00
Jason Ekstrand	d2f42a945e	nir/spirv/glsl450: Add support for the InterpolateAt opcodes Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-21 05:39:06 -07:00
Jason Ekstrand	a529644889	nir/spirv: Claim support for SampleRateShading We already support all of the decorations that require this capability. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-21 05:39:06 -07:00
Jason Ekstrand	7c48622581	nir/spirv: Bring back the spirv2nir helper binary This was something that I wrote in the early days of the spirv_to_nir code but deleted once we had a real driver. However, in the absence of a shader_runner equivalent, it's extremely useful for debugging the spirv_to_nir code so let's bring it back. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 05:38:26 -07:00
Chuanbo Weng	e4648ba8dd	i965: implement querying __DRI_IMAGE_ATTRIB_OFFSET. Implement querying this attribute in intelImageExtension and bump version of intelImageExtension. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 12:19:19 +01:00
Chuanbo Weng	9e8de866f7	egl: return corresponding offset of EGLImage instead of 0. The offset should not always be 0. For example, if EGLImage is created from a 2D texture with EGL_GL_TEXTURE_LEVEL=1, then the offset should be the actual start of miplevel 1 in bo. v2: Add version check of __DRIimageExtension implementation (Suggested by Axel Davy). v3: Don't add version check of __DRIimageExtension implementation. Set the offset only when queryImage() succeeds. (Suggested by Emil Velikov) Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [Emil Velikov: coding style fixes] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-21 12:19:19 +01:00
Chuanbo Weng	1ceb775d57	dri: add offset attribute and bump version of EGLImage extensions. Offset is useful for buffer sharing with other components, so add it to queryImage attributes. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 12:19:19 +01:00
Francisco Jerez	e5311ba1ac	i965/ir: Test thread dispatch packing assumptions. Not [originally] intended for upstream. Should cause a GPU hang if some thread is executed with a non-contiguous dispatch mask breaking assumptions of brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or Piglit regressions, while replacing brw_stage_has_packed_dispatch() with a dummy implementation that unconditionally returns true on top of this patch causes multiple GPU hangs. v2: Refactor into a separate function instead of emitting the test code directly from emit_nir_code(), drop VEC4 test and clean up slightly for upstream. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 13:45:46 +03:00
Francisco Jerez	c05a4f11a0	i965/ir: Pass identity mask to brw_find_live_channel() in the packed dispatch case. This avoids emitting a few extra instructions required to take the dispatch mask into account when it's known to be tightly packed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-21 13:45:46 +03:00
Francisco Jerez	f57f526fc5	i965/ir: Skip eliminate_find_live_channel() for stages with sparse thread dispatch. The eliminate_find_live_channel optimization eliminates FIND_LIVE_CHANNEL instructions in cases where control flow is known to be uniform, and replaces them with 'MOV 0', which in turn unblocks subsequent elimination of the BROADCAST instruction frequently used on the result of FIND_LIVE_CHANNEL. This is however not correct in per-sample fragment shader dispatch because the PSD can dispatch a fully unlit sample under certain conditions. Disable the optimization in that case. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> v2: Add devinfo argument to brw_stage_has_packed_dispatch() to implement hardware generation check.	2016-09-21 13:45:46 +03:00
Jason Ekstrand	8a468d186e	i965/fs: Take Dispatch/Vector mask into account in FIND_LIVE_CHANNEL On at least Sky Lake, ce0 does not contain the full story as far as enabled channels goes. It is possible to have completely disabled channels where the corresponding bits in ce0 are 1. In order to get the correct execution mask, you have to mask off those channels which were disabled from the beginning by taking the AND of ce0 with either sr0.2 or sr0.3 depending on the shader stage. Failure to do so can result in FIND_LIVE_CHANNEL returning a completely dead channel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Francisco Jerez <currojerez@riseup.net> [ Francisco Jerez: Fix a couple of typos, add mask register type assertion, clarify reason why ce0 can have bits set for disabled channels, clarify that this may only be a problem when thread dispatch doesn't pack channels tightly in the SIMD thread. Apply same treatment to Align16 path. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-21 13:45:45 +03:00
Jason Ekstrand	a2392cee48	i965/reg: Make brw_sr0_reg take a subnr and return a vec1 reg The state register sr0 is really a collection of dwords not a SIMD8 anything. It's much more convenient for brw_sr0_reg to return the particular dword you're looking for rather than a giant blob you have to massage into what you want. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> [ Francisco Jerez: Trivial simplification of brw_ud1_reg(). ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-21 13:45:45 +03:00
Lionel Landwerlin	b8162d6b6e	anv: pipeline: use correct number of thread for compute Reproduces this commit : commit `0fb85ac08d` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Use the correct number of threads for compute shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Lionel Landwerlin	f2d43b44d7	anv: allocator: correct scratch space for haswell This reproduces this commit : commit `2213ffdb4b` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Allocate scratch space for the maximum number of compute threads. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Lionel Landwerlin	09394ee6cf	anv: device: calculate compute thread numbers using subslices numbers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-21 12:01:06 +03:00
Nicolai Hähnle	1f291369e4	gallivm: support negation on 64-bit integers This should be analogous to 32-bit integers. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:50 +02:00
Dave Airlie	4207612f9c	radeonsi: prepare 64-bit integer support. (v2) v2: - no PIPE_CAP_INT64 yet - emit DIV/MOD without the divide-by-zero workaround Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:38 +02:00
Dave Airlie	5561a37710	gallivm/llvmpipe: prepare support for ARB_gpu_shader_int64. This enables 64-bit integer support in gallivm and llvmpipe. v2: add conversion opcodes. v3: - PIPE_CAP_INT64 is not there yet - restrict DIV/MOD defaults to the CPU, as for 32 bits - TGSI_OPCODE_I2U64 becomes TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:30 +02:00
Dave Airlie	6b26039da3	tgsi/softpipe: prepare ARB_gpu_shader_int64 support. (v3) This adds all the opcodes to tgsi_exec for softpipe to use. v2: add conversion opcodes. v3: - no PIPE_CAP_INT64 yet - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:24:11 +02:00
Dave Airlie	3985e6c044	gallium/tgsi: add support for 64-bit integer immediates. This adds support to TGSI for 64-bit integer immediates. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-21 10:23:55 +02:00
Dave Airlie	6e1a34d545	gallium: add opcode and types for 64-bit integers. (v3) This just adds the basic support for 64-bit opcodes, and the new types. v2: add conversion opcodes. add documentation. v3: - make docs more consistent - change TGSI_OPCODE_I2U64 to TGSI_OPCODE_U2I64 Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-21 10:23:05 +02:00
Kenneth Graunke	9694b23f66	i965: Rename intelScreen to screen. "intelScreen" is wordy and also doesn't fit our style guidelines. "screen" is shorter, which is nice, because we use it fairly often. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-20 20:08:20 -07:00
Kenneth Graunke	8fec9fbb9f	i965: Rename __DRIScreen pointers to "dri_screen". I want to use "screen" as the variable name for a struct intel_screen pointer. This means that we can't use it for __DRIscreen pointers. Sometimes we called it "screen", sometimes "sPriv", sometimes "driScrnPriv", and sometimes "psp" (Pointer to Screen Private?). The last one is particularly confusing because we use "psp" to refer to the Gen4 PIPELINED_STATE_POINTERS packet as well. Let's be consistent. "dri_screen" is clear, and it's not used often enough that I'm worried about the verbosity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-20 20:08:12 -07:00
Dylan Baker	d4bf9baa43	mesa: Implement ARB_shader_viewport_layer_array for i965 This extension is a combination of AMD_vertex_shader_viewport_index and AMD_vertex_shader_layer, making it rather trivial to implement. For gallium I think this needs a new cap because of the addition of support in tessellation evaluation shaders, and since I don't have any hardware to test it on, I've left that for someone else to wire up. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-20 16:23:04 -07:00
Leo Liu	956f3e3bcd	radeon/vce: add firmware support for version 52.8.3 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-20 15:58:56 -04:00
Indrajit Das	f9311265bf	st/omx/dec/h265: Correct the timestamping (derived from commit `3b6bda665a`) v2: fix the tabs(Leo) Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-09-20 15:58:56 -04:00
Lionel Landwerlin	792d77165b	aubinator: add a custom handler for immediate register load Transforming this : 0x00c77084: 0x11000001: MI_LOAD_REGISTER_IMM 0x00c77088: 0x0000b020 : Dword 1 Register Offset: 0x0000b020 0x00c7708c: 0x00880038 : Dword 2 Data DWord: 8912952 Into this: 0x007880f0: 0x11000001: MI_LOAD_REGISTER_IMM 0x007880f4: 0x0000b020 : Dword 1 Register Offset: 0x0000b020 0x007880f8: 0x00080040 : Dword 2 Data DWord: 524352 register L3CNTLREG2 (0xb020) : 0x80040 SLM Enable: 0 URB Allocation: 32 URB Low Bandwidth: 0 RO Allocation: 32 RO Low Bandwidth: 0 DC Allocation: 0 DC Low Bandwidth: 0 v2: Drop unused arguments (Sirisha) Print out register name Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-09-20 10:47:21 +01:00
Nayan Deshmukh	0301858a31	st/va: flush the context before calling flush_frontbuffer(v2) so that the texture is rendered to back buffer before calling flush_frontbuffer and can be copied to a different buffer in the function v2: change comment style Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:18:29 +02:00
Nayan Deshmukh	e4cc2276c1	st/vdpau: flush the context before calling flush_frontbuffer so that the texture is rendered to back buffer before calling flush_frontbuffer and can be copied to a different buffer in the function Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:18:07 +02:00
Nayan Deshmukh	853e80f5a0	vl/dri3: handle the case of different GPU(v4.2) In case of prime when rendering is done on GPU other than the server GPU, use a separate linear buffer for each back buffer which will be displayed using present extension. v2: Use a separate linear buffer for each back buffer (Michel) v3: Change variable names and fix coding style (Leo and Emil) v4: Use PIPE_BIND_SAMPLER_VIEW for back buffer in case when a separate linear buffer is used (Michel) v4.1: remove empty line v4.2: destroy the context and handle the case when create_context fails (Emil) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:17:02 +02:00
Ilia Mirkin	40d787ab05	st/vdpau: fix argument type to vlVdpOutputSurfaceDMABuf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-20 11:13:05 +02:00
Tim Rowley	92ec820244	swr: [rasterizer core] Better thread destruction Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Tim Rowley	fdf2890423	swr: [rasterizer jitter] Fix missing end-of-file newline Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Tim Rowley	2f86a9577a	swr: [rasterizer core] Add macros for mapping ArchRast to buckets Switch all RDTSC_START/STOP macros to use AR_BEGIN/END macros. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-19 20:10:19 -05:00
Kenneth Graunke	04026b43c8	glsl: Skip "unsized arrays aren't allowed" check for TCS/TES/GS vars. Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 12:01:11 -07:00
Samuel Pitoiset	6ed05fa4cb	nvc0: get rid of nvc0_stage_sampler_states_bind_range() Same thing as nvc0_stage_set_sampler_views_range(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 20:03:24 +02:00
Samuel Pitoiset	407948df1b	nvc0: get rid of nvc0_stage_set_sampler_views_range() This function was quite similar to nvc0_stage_set_sampler_views() and I don't see any reasons to not remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-19 20:03:20 +02:00
Samuel Pitoiset	557a29b51f	nv50/ir: optimize SUB(a, b) to MOV(a - b) This helps shaders in UE4 demos, especially with Elemental (+1% perf). This optimization reduces spilling usage in one shader which explains the little gain. GF100/GK104: total instructions in shared programs :2838551 -> 2838045 (-0.02%) total gprs used in shared programs :396706 -> 396684 (-0.01%) total local used in shared programs :34432 -> 34416 (-0.05%) local gpr inst bytes helped 1 19 112 112 hurt 0 0 0 0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-18 16:42:39 +02:00
Samuel Pitoiset	d8b4f5fcca	gk110/ir: fix wrong emission of OP_NOT This should emit src0 instead of src1. Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-18 16:42:33 +02:00
Martina Kollarova	15804c4b90	r600g/sb: fix struct/class declaration conflicts A couple of forward-declarations were causing warnings in clang: 'value' defined as a class here but previously declared as a struct [-Wmismatched-tags] Signed-off-by: Martina Kollarova <martina.kollarova@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-09-18 09:23:42 +02:00
Eric Anholt	073129c7af	i965: Drop assertion about buffer offset at draw time. Given robust access, we should just be returning zeroes if the user gives us a base pointer that's too big, which is what happens on a release build. This was caught by a webgl conformance test for out-of-bounds draws on servo. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-17 17:48:16 +01:00
Lars Hamre	ddd6116e32	tgsi: Enable returns from within loops Fixes the following piglit test (for softpipe): /spec/glsl-1.10/execution/fs-loop-return Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	8a6391477e	svga: relax restriction of compressed formats for texture upload This patch relaxes the restriction of compressed formats for texture upload buffer. For now, 3D texture with compressed format is still not supported in the texture upload buffer path. As Brian noted, ETQW does many texture updates with glCompressedTexSubImage. This patch greatly improves the performance of the ETQW trace. Tested with ETQW, MTT piglit, glretrace, conform, viewperf v2: Per Brian's suggestion, removed the subregion boundary check. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Brian Paul	15dee0fc1d	svga: skip query flush if we already have the query result This reduces the number of times we flush in some situations (the arbocclude demo is one trivial example). Tested with Piglit, ETQW, Sauerbraten, arbocclude. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:24:13 -06:00
Brian Paul	c71e82b8e9	svga: remove unneeded svga_context_flush() in svga_end_query() Since commit `99d8fe20ab` we don't have to flush the command buffer when we end a query. Tested with Piglit, Sauerbraten, arbocclude, ETQW (noticeably faster now). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	f1b3374d28	svga: use upload buffer for upload texture. With this patch, when running with vgpu10, instead of mapping directly to the guest backed memory for texture update, we'll use the texture upload buffer and use the transfer from buffer command to update the host side texture memory. This optimization yields about 20% performance improvement with Lightsmark2008 and about 40% with Tropics. Tested with Lightsmark2008, Tropics, Heaven, MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:13 -06:00
Charmaine Lee	a9c4a861d5	svga: refactor svga_texture_transfer_map/unmap functions Split the functions into separate functions for dma and direct map to make the code more readable. Tested with MTT piglit, glretrace, viewperf, conform, various OpenGL apps Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	c8ef82d65a	svga: add SVGA3d_vgpu10_TransferFromBuffer() Also add the corresponding dump function to dump the TransferFromBuffer command. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	2a4b019239	svga: single sample surface can be created as non-multisamples surface With this patch, single sample surface will be created as non-multisamples surface. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Charmaine Lee	5947d90830	svga: fix memory leak with sampler state This patch fixes a memory leak with sampler state when piglit is run with HW version 11. Sampler state clean up was incorrectly skipped in svga_cleanup_sampler_state() for vgpu9. Tested with piglit. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:24:12 -06:00
Brian Paul	12689efbbe	svga: fix prim type check/assignment in translate_indices() Left over test code spotted by Sinclair. Tested with piglit, Google Earth, Lightsmark, Heaven4, glretraces, etc. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	50359ddb5d	svga: use SVGA3D_QUERYTYPE_MAX for svga query type check Use SVGA3D_QUERYTYPE_MAX instead of SVGA_QUERY_MAX for svga query type check. Tested with various OpenGL apps with GALLIUM_HUD set. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	ee39814d90	svga: split the num-resources-mapped hud to textures & buffers Replace the num-resources-mapped hud with num-textures-mapped and num-buffers-mapped, so we can differentiate the map counts for these two different resources. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Charmaine Lee	f168c886c9	svga: change svga hud defines to enums This will make it easier to add new hud types. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	4f74b379aa	svga: implement an index buffer translation cache Some OpenGL apps, like Cinebench R15, have many glDrawElements(GL_QUADS) calls. Since we don't directly support quads we have to convert these calls into GL_TRIANGLES which involves generating a new index buffer. This patch saves the new/translated index buffer in the hope that it can be reused for a later draw call. Cinebench R15 increases by about 20% with this change. The NobelClinician Viewer app also hits this code. Tested with full piglit run. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	581292a78c	svga: try to emit fewer buffer rebind commands If a consecutive sequence of drawing commands references the same vertex/index buffers, there should be no need to rebind the surfaces for the second and subsequent drawing commands. Apps that use multiple display lists benefit from this since the vertex data for several display lists is often stored in one buffer. In the case of the legacy E&S Glaze demo, this reduces the size of our command buffers from 91KB to 44KB. One WSI Fusion trace shows a 33% reduction in command buffer sizes. Tested with full piglit run. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	ee5f5e2269	svga: reduce unmapping/remapping of the default constant buffer Previously, every time we put shader constants into the default constant buffer we called u_upload_alloc(), which mapped the buffer, and u_upload_unmap(). We had to unmap the buffer before calling svga_buffer_handle() to get the winsys handle for the buffer. But we really only need to do that the first time we reference the const buffer. Now we try to keep the upload manager's buffer mapped until we fill it or flush the command buffer. v2: add additional comment on the buffer unmapping code in svga_context_flush(), per Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:09:00 -06:00
Brian Paul	ce3b34b727	svga: optimize memcpy() in svga_buffer_update_hw() When we migrate a buffer from sw/malloc storage to a hardware buffer, don't memcpy the whole buffer, just copy the part we've written to. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-09-17 10:08:59 -06:00
Neha Bhende	b7bee25052	svga: Use comparison between svga texture types to use PredCopyRegion command PredCopyRegion supports copies between textures of the same type. Instead of comparing the src and dst pipe texture types, compare the svga texture types, which can avoid some software fallbacks. For example, it avoids a software blit with the Redway3D Aston demo. Tested piglit tests on VGPU9 and VGPU10 on GL/DX11Renderer, Redway3D Aston demo v2: some nitpicks suggested by Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:08:59 -06:00
Neha Bhende	b9f333cc81	svga: Add function svga_resource_type() This function returns svga texture type for corresponding pipe texture. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-09-17 10:08:59 -06:00
Samuel Pitoiset	50baaf6bc6	nvc0/ir: fix subops for IMAD Offset was wrong, it's at bit 8, not 4. Also, uses subr instead of sub when src2 has neg. Similar to GK110 now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-17 17:42:45 +02:00
Samuel Pitoiset	9b8b69b3c4	nvc0/ir: fix comments about instructions info The comment for the commutative flags was wrong because OP_MUL is before OP_MAD. While we are at it add missing opcodes, and fix the comment about the short forms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-17 17:42:40 +02:00
Kenneth Graunke	eaacb27812	mesa: Move buffers-unmapped earlier in check_valid_to_render(). This needs to be above the switch on API, as that can return true (valid to render) before this error check even had a chance to run. Fixes ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos, which worked before commit `72f1566f90`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-09-16 19:42:56 -07:00
Kenneth Graunke	6b0ba02cae	mesa: Expose GL_CONTEXT_FLAGS in ES 3.2. Fixes four ES32-CTS.context_flags.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-16 18:55:38 -07:00
Tom Stellard	91ec6e5664	radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 This patch switches non-TGSI compute shaders over to using the HSA ABI described here: https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack. The main changes in this patch are: - We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations. - Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments. Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs. v2: - Add comments explaining why we are setting certain bits of the scratch resource descriptor. v3: - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 23:07:10 +00:00
Tom Stellard	a2b8346fa6	radeonsi/compute: Add some more debug printfs	2016-09-16 22:51:06 +00:00
Marek Olšák	ae0a4a1299	glsl: remove interpolateAt* instructions for demoted inputs This fixes 8 fs-interpolateat* piglit crashes on radeonsi, because it can't handle non-input operands in interpolateAt*. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 22:35:08 +02:00
Marek Olšák	d58a3906cb	mesa: fix glGetFramebufferAttachmentParameteriv w/ on-demand FRONT_BACK alloc This fixes 66 CTS tests on st/mesa. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-16 22:35:08 +02:00
Serge Martin	1c8d4c694b	clover: fix getting scalar args api size This fixes getting the size of a struct arg. vec3 types still work ok. Only built-in args need to have power of two alignment; getTypeAllocSize reports the correct size in all cases. Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-09-16 22:09:47 +02:00
Ilia Mirkin	f65187bb93	docs: add GL_ARB_gl_spirv to features list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-16 12:04:12 -04:00
Rob Clark	ba8a50955d	ttn: fix warning after `7bf76563e` Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-16 11:55:26 -04:00
Brian Paul	702ff0b9a0	gallium/docs: document alpha_to_coverage and alpha_to_one blend state The gallium interface defines these like DX10. Note that OpenGL ignores these options if MSAA is disabled or the dest buffer doesn't support MSAA. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-16 08:44:26 -06:00
Brian Paul	187c278121	st/mesa: update comment in st_atom_msaa.c The old comment was a copy and paste mistake. Indent another comment. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-16 08:44:26 -06:00
Brian Paul	a01872f808	st/mesa: only enable MSAA coverage options when we have a MSAA buffer Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default) we should not set the alpha_to_coverage or alpha_to_one flags if the current drawing buffer does not do MSAA. This fixes the new piglit gl-1.3-alpha_to_coverage_nop test. ETQW is a game that enables GL_SAMPLE_ALPHA_TO_COVERAGE without MSAA. Shrubs along the side of roads were invisible because fragments with alpha < 0.5 were being discarded (zero coverage). v2: remove ctx->DrawBuffer != NULL check. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-16 08:44:12 -06:00
Dave Airlie	e1ea36ae71	spirv: use subpass image type (v1.1) This adds support for the input attachments subpass type to the SPIRV->NIR pass. v1.1: drop handling from vtn_handle_texture Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-16 15:16:31 +10:00
Dave Airlie	7bf76563e2	glsl: add subpass image type (v2) SPIR-V/Vulkan have a special image type for input attachments called the subpass type. It has different characteristics than other images types. The main one being it can only be an input image to fragment shaders and loads from it are relative to the frag coord. This adds support for it to the GLSL types. Unfortunately we've run out of space in the sampler dim in types, so we need to use another bit. v2: Fixup subpass input name (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-16 15:16:31 +10:00
Kenneth Graunke	081f21f29b	isl: Finish tiling filtering for Gen6. Gen6 only has one additional restriction over Gen7+, so we just add it to the existing gen7 function (which actually covers later gens too). This should stop FINISHME spew when running GL on Sandybridge. v2: Fix bytes per block vs. bits per block confusion (Jason) and rename function to gen6_filter_tiling (Jason and Chad). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-15 21:21:50 -07:00
Ilia Mirkin	9fec15a7e0	i965: enable ARB_ES3_2_compatibility on gen8+ Note that ASTC support is not actually mandated for this extension to be exposed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 19:29:41 -04:00
Jason Ekstrand	111f6b250d	i965/nir: Roll set_default_interpolation into lower_fs_inputs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	246db0063e	i965/fs: Use NIR for handling forced per-sample interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	ed65e6ef49	nir: Add a flag to lower_io to force "sample" interpolation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:43 -07:00
Jason Ekstrand	114874b22b	i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode From the ARB_gpu_shader5 spec: The built-in functions interpolateAtCentroid() and interpolateAtSample() will sample variables as though they were declared with the "centroid" or "sample" qualifiers, respectively. When running with persample dispatch forced by the API, we interpolate anything that isn't flat as if it's qualified by "sample". In order to keep interpolateAtCentroid() consistent with the "centroid" qualifier, we need to make interpolateAtCentroid() do sample interpolation instead. Nothing in the GLSL spec guarantees that the result of interpolateAtCentroid is uniform across samples in any way, so this is a perfectly fine thing to do. Fixes 8 of the new dEQP-VK.pipeline.multisample_interpolation.* Vulkan CTS tests that specifically validate consistency between the "sample" qualifier and interpolateAtSample() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 13:31:27 -07:00
Brian Paul	0d2eb8c14d	mesa: check for no matrix change in _mesa_LoadMatrixf() Some apps issue redundant glLoadMatrixf() calls with the same matrix. Try to avoid setting dirty state in that situation. This reduces the number of constant buffer updates by about half in ET Quake Wars. Tested with Piglit, ETQW, Sauerbraten, Google Earth, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-15 12:00:12 -06:00
Jon Turney	533b3530c1	direct-to-native-GL for GLX clients on Cygwin ("Windows-DRI") Structurally, this is very similar to the existing Apple-DRI code, except I have chosen to implement this using the __GLXDRIdisplay, etc. vtables (as suggested originally in [1]), rather than a maze of ifdefs. This also means that LIBGL_ALWAYS_SOFTWARE and LIBGL_ALWAYS_INDIRECT work as expected. [1] https://lists.freedesktop.org/archives/mesa-dev/2010-May/000756.html This adds: * the Windows-DRI extension protocol headers and the windowsdriproto.pc file, for use in building the Windows-DRI extension for the X server * a Windows-DRI extension helper client library * a Windows-specific DRI implementation for GLX clients The server is queried for Windows-DRI extension support on the screen before using it (to detect the case where WGL is disabled or can't be activated). The server is queried for fbconfigID to pixelformatindex mapping, which is used to augment glx_config. The server is queried for a native handle for the drawable (which is of a different type for windows, pixmaps and pbuffers), which is used to augment __GLXDRIdrawable. Various GLX extensions are enabled depending on if the equivalent WGL extension is available.	2016-09-15 13:14:43 +01:00
Emil Velikov	2ac09ac5a5	docs: add news item and link release notes for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-15 11:31:06 +01:00
Emil Velikov	219a2f5f9f	docs: add sha256 checksums for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `09460b8cf7`)	2016-09-15 11:30:00 +01:00
Emil Velikov	06f83a5548	docs: add release notes for 12.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d79b2e7bf3`)	2016-09-15 11:29:59 +01:00
Kenneth Graunke	3bcdc2e3db	mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness. This is supposed to be exposed with the GL_KHR_robustness extension, which we support on ES 2.0 and later. On desktop GL, it's also exposed by GL_ARB_robustness, which is supported by all drivers ("dummy_true"), so we also allow desktop GL. Fixes: - ES32-CTS.robust.robustness.noResetNotification - ES32-CTS.robust.robustness.loseContextOnReset Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-15 00:58:47 -07:00
Jason Ekstrand	89a96c8f43	anv/cmd_buffer: Set the L3 atomic disable mask bit in CHICKEN3 on HSW Without this bit set, the value in "L3 Atomic Disable" won't get applied by the hardware so we won't properly get L3 atomic caching. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of the dEQP-VK.image.atomic_operations.* tests on HSW Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-14 17:53:16 -07:00
Jason Ekstrand	a814e18c96	intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT at GPU initialization time and never sets it again. The GL driver sets it every time the framebuffer changes. Originally, blorp set it to the size of the drawing area, but that meant we had to set it back in the Vulkan driver. Instead, we can easily just do that in the GL driver's blorp_exec implementation and not set it in blorp core. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Jason Ekstrand	b56f509ee0	intel/blorp: Emit 3DSTATE_MULTISAMPLE directly Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However, now that Vulkan and GL use the same sample positions, we can set up 3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Jason Ekstrand	c779ad3e66	intel: Move Vulkan sample positions to common code Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-14 17:51:16 -07:00
Marek Olšák	f019255acf	Revert "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions" This reverts commit `524fd55d2d`. Reason: https://bugs.freedesktop.org/show_bug.cgi?id=97808	2016-09-15 00:47:24 +02:00
Francisco Jerez	6d861968ca	i965/vec4: Assert that pull constant load offsets are 16B-aligned. Non-16B-aligned pull constant loads are unlikely to be particularly useful given that you can get roughly the same effect by using swizzles on the result. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	5ca35c6367	i965/vec4: Assert that ATTR regions are register-aligned. It might be useful to actually handle this once copy propagation becomes smarter about register-misaligned offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	f33a8f8fcf	i965/vec4: Don't spill non-GRF-aligned register regions. A better fix would be to do something along the lines of the FS back-end spilling code and emit a scratch read before any instruction that overwrites the register to spill partially due to a non-zero sub-register offset. In the meantime mark registers used with a non-zero sub-register offset as no-spill to prevent the spilling code from miscompiling the program. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	8531f943d9	i965/vec4: Fix copy propagation for non-register-aligned regions. This prevents it from trying to propagate a copy through a register-misaligned region. MOV instructions with a misaligned destination shouldn't be treated as a direct GRF copy, because they only define the destination GRFs partially. Also fix the interference check implemented with is_channel_updated() to consider overlapping regions with different register offset to interfere, since the writemask check implemented in the function is only valid under the assumption that the source and destination regions are aligned component by component. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:59 -07:00
Francisco Jerez	0e657b7b55	i965/vec4: Compare full register offsets in cmod propagation. Cmod propagation would misoptimize the program if the destination offset of the generating instruction wasn't exactly the same as the source region offset of the copy instruction. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	8bed1adfc1	i965/vec4: Assign correct destination offset to rewritten instruction in register coalesce. Because the pass already checks that the destination offset of each 'scan_inst' that needs to be rewritten matches 'inst->src[0].offset' exactly, the final offset of the rewritten instruction is just the original destination offset of the copy. This is in preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	3a74e437fd	i965/vec4: Don't coalesce registers with overlapping writes not matching the MOV source. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	1bb5074474	i965/vec4: Compare full register offsets in opt_register_coalesce nop move check. In preparation for adding support for sub-GRF offsets to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	3be0d6d040	i965/vec4: Check that the write offsets match when setting dependency controls. For simplicity just assume that two writes to the same GRF with different sub-GRF offsets will potentially interfere and break the dependency control chain. This is in preparation for adding sub-GRF offset support to the VEC4 IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	b52fefc4d5	i965/vec4: Change opt_vector_float to keep track of the last offset seen in bytes. This simplifies things slightly and makes the pass more correct in presence of sub-GRF offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	230615e228	i965/vec4: Simplify src/dst_reg to brw_reg conversion by using byte_offset(). This should also have the side effect of fixing convert_to_hw_regs() to handle sub-GRF register offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	eb746a80e5	i965/ir: Update several stale comments. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	47784e2346	i965/ir: Don't print ARF subnr values twice. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:58 -07:00
Francisco Jerez	5d65d51e78	i965/vec4: Print src/dst_reg::offset field consistently for all register files. C.f. 'i965/fs: Print fs_reg::offset field consistently for all register files.'. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	ec259f5307	i965/fs: Print fs_reg::offset field consistently for all register files. The offset printing code in fs_visitor::dump_instruction() was doing things differently for sources and destinations and for each register file -- In some cases it would be added to the base register number fs_reg::nr, in other cases it would follow the base register separated with a plus sign, in other cases (uniforms) it would do both (!). The sub-register offset was also being printed or not rather inconsistently. Fix it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	950af5ed40	i965/fs: Misc simplification. Get rid of some leftover redundant arithmetic introduced during the conversion to byte offsets and sizes that can be simplified easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	80e1d670b4	i965/fs: Get rid of fs_inst::set_smear(). component() was generally a better alternative because of several issues set_smear() had: - It wouldn't take the original stride and offset of the register into account, which means that set_smear() on the result of e.g. another set_smear() call or an offset() call would give a bogus region as result. - It was an inherently destructive operation. See the 'nir_intrinsic_shader_clock' hunk below for how this could lead to subtle bugs in cases where set_smear() was called multiple times on the same register like 'r.set_smear(0), r.set_smear(1)' with the expectation that each call would return a separate value instead of a reference to the same subsequently mutated object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	8e58e4412f	i965/fs: Use region_contained_in() in compute-to-mrf coalescing pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	f2d2156ba2	i965/fs: Move region_contained_in to the IR header and fix for non-VGRF files. Also changed the argument names since 'src' and 'dst' don't make that much sense outside of the context of copy propagation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	645261c4b2	i965/fs: Change region_contained_in() to use byte units. This makes the function less annoying to use and more accurate -- We shouldn't propagate a copy into a register region that wasn't fully contained in the destination of the copy (IOW, a source region that wasn't fully defined by the copy) just because the number of registers written and read by each instruction happened to get rounded up to the same GRF multiple. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	1c67e27247	i965/fs: Simplify copy propagation LOAD_PAYLOAD ACP setup. By keeping track of 'offset' in byte units. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:57 -07:00
Francisco Jerez	2d7d4a7910	i965/fs: Simplify a bunch of fs_inst::size_written calculations by using component_size(). Using component_size() is easier and generally more correct because it takes into account the register type and stride for you. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	0bc46cc961	i965/fs: Simplify result_live calculation in dead_code_eliminate(). No need to unroll the first iteration of the loop manually. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	62aaef6c83	i965/fs: Simplify and fix buggy stride/offset calculations using subscript(). These were bashing the 'offset' and 'stride' values of several registers without taking the previous value into account, which probably didn't matter in practice for optimize_frontfacing_ternary() because the 'tmp' register already had a known region, but it would have given the wrong region as result in the other cases in lower_integer_multiplication(). subscript(..., i) is a more straightforward way to take the i-th field of a given type from each channel of a register which should give the right answer as result regardless of the original 'offset' and 'stride' parameters of the register region. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	3b7b908787	i965/fs: Simplify get_fpu_lowered_simd_width() by using inequalities instead of rounding. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	ee930c0435	i965/fs: Simplify byte_offset(). In the most common case this can now be implemented as a simple addition because the offset is already encoded as a single scalar value in bytes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	bae3a41171	i965/fs: Fix signedness of the return value of fs_inst::size_read(). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	a384503c15	i965/fs: Switch mask_relative_to() used in compute-to-mrf to byte units. This makes the helper function less annoying to use and somewhat more accurate. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	401fc228fd	i965/fs: Fix bogus sub-MRF offset calculation in compute-to-mrf. The 'scan_inst->dst.offset % REG_SIZE' term in the final 'scan_inst->dst.offset' calculation is obviously bogus. The offset from the start of the copy destination register 'inst->dst' where the destination of the generating instruction 'scan_inst' would be written to (before compute-to-mrf runs) is just the offset of 'scan_inst->dst' relative to the source of the copy instruction (AKA rel_offset in the code below). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	cd0134072a	i965/fs: Take into account copy register offset during compute-to-mrf. This was dropping 'inst->dst.offset' on the floor. Nothing in the code above seems to guarantee that it's zero and in that case the offset of the register being coalesced into wouldn't be taken into account while rewriting the generating instruction. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:56 -07:00
Francisco Jerez	fcd9d1badc	i965/vec4: Drop backend_reg::in_range() in favor of regions_overlap(). This makes sure that overlap checks are done correctly throughout the back-end when the '*this' register starts before the register/size pair provided as argument, and is actually less annoying to use than in_range() at this point since regions_overlap() takes its size arguments in bytes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	56bcb2230f	i965/vec4: Port regions_overlap() to the vec4 IR. This is copy-pasted almost line by line from the FS back-end. The only reason it cannot be implemented in terms of backend_reg is that the backend_reg::nr field doesn't have the same meaning for uniforms on both back-ends. It could be easily deduplicated by using a template function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	c057278c06	i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap(). Its only use left in the FS back-end should be using regions_overlap() instead to avoid getting a false negative result in cases where source and destination overlap but the former starts before the latter in the VGRF file. v2: Put back lost components factor (Iago). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	b42c13a5b8	i965/fs: Drop fs_inst::overwrites_reg() in favor of regions_overlap(). fs_inst::overwrites_reg is rather easy to misuse because it cannot tell how large the register region starting at 'reg' is, so in cases where the destination region starts after 'reg' it may give a misleading result. regions_overlap() is somewhat more verbose to use but handles arbitrary overlap correctly so it should generally be used instead. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	32d67923b2	i965/fs: Fix LOAD_PAYLOAD handling in register coalesce is_nop_mov(). is_nop_mov() was broken for LOAD_PAYLOAD instructions in two ways: On the one hand the original destination register offset wasn't being taken into account which would give incorrect results if it was already non-zero, and on the other hand all source registers were being treated as if they had a size of 32B, which is almost never the case in SIMD16 programs for non-header sources. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	5cc6425d70	i965/fs: Fix can_propagate_from() source/destination overlap check. The previous overlap condition only made sure that the VGRF numbers or GRF-aligned offsets were different without taking the amount of data written and read by the instruction into consideration. Use the regions_overlap() helper instead. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	9ae77d7020	i965/fs: Compare full register offsets in cmod propagation pass. This could potentially have misoptimized a program in cases where inst->src[0] had a non-zero sub-GRF offset. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	3a4ea7cf80	i965/fs: Don't consider LOAD_PAYLOAD with stride > 1 source to behave like a raw copy. Noticed the problem by inspection while typing in the previous commit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	1164aa1a1b	i965/fs: Don't consider LOAD_PAYLOAD with sub-GRF offset to behave like a raw copy. This was likely the original intention, and at least register coalesce relies on it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:55 -07:00
Francisco Jerez	a5bbe4c127	i965/vec4: Take into account misalignment in regs_written() and regs_read(). Unlike the FS counterpart of this commit this was likely not (yet) a bug, but let's fix it already in preparation for implementing support for sub-GRF offsets in the VEC4 back-end. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	717d8efd58	i965/fs: Take into account misalignment in regs_written() and regs_read(). There was a workaround for this in fs_inst::size_read() for the SHADER_OPCODE_MOV_INDIRECT instruction and FIXED_GRF register file only. We should take this possibility into account for the sources and destinations of all instructions on all optimization passes that need to quantize dataflow in 32B increments by adding the amount of misalignment to the size read or written from the regs_read() and regs_written() helpers respectively. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	e540045df5	i965/fs: Take into account trailing padding in regs_written() and regs_read(). This fixes regs_written() and regs_read() to return a more accurate value when the padding left between components due to a stride value greater than one causes the region bounds given by size_written or size_read to overflow into the next register. This could become a problem in optimization passes that keep track of dataflow using fixed-size arrays with register granularity, because the overflow register (not actually accessed by the region) may not have been allocated at all which could lead to undefined memory access. An alternative to this would be to subtract the trailing padding already during the calculation of fs_inst::size_read and ::size_written, but that would break code that currently assumes that ::size_read and _written are whole multiples of the component size, and would be hard to maintain looking forward because size_written is assigned from a bunch of different places. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	937373eb25	i965/fs: Handle fixed HW GRF subnr in reg_offset(). This will be useful later on when we start using reg_offset() on fixed hardware registers. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	1a4b7fdd88	i965/fs: Handle arbitrary offsets in brw_reg_from_fs_reg for MRF/VGRF registers. This restriction seemed rather artificial... Removing it actually simplifies things slightly. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	d6b60934aa	i965/fs: Return more accurate read size for LINTERP from fs_inst::size_read. The LINTERP virtual instruction only reads three scalar components from the first 16B of the second source, we can now teach size_read() about it since its return value is represented with byte granularity. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	31a40202b8	i965/fs: Return more accurate read size from fs_inst::size_read for IMM and UNIFORM files. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	728dd30c0a	i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:54 -07:00
Francisco Jerez	e1a918ba7b	i965/fs: Replace fs_inst::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	27cb6b081e	i965/ir: Drop backend_instruction::regs_written field. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69fdf13c21	i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69570bbad8	i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	d28cfa35fe	i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written. This is in preparation for dropping vec4_instruction::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	c458eeb946	i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written. This is in preparation for dropping fs_inst::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	be095e11e4	i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. The fs_reg::subreg_offset and ::offset fields are now redundant, the sub-GRF offset can just be added to the single ::offset field expressed in byte units. The current subreg_offset value can be recovered by applying the following rule: Replace each rvalue reference of subreg_offset like 'x = r.subreg_offset' with 'x = r.offset % reg_unit', and each lvalue reference like 'r.subreg_offset = x' with 'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	9a523dd051	i965/ir: Remove backend_reg::reg_offset. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	fba020e5af	i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Francisco Jerez	86944e063a	i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Eero Tamminen	8ad5fb3a8f	glsl: grammar fix Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-14 13:35:47 -07:00
Kenneth Graunke	aa70ac172e	docs: Mention AEP in release notes	2016-09-14 12:43:16 -07:00
Kenneth Graunke	8c9dddadad	i965: Enable ANDROID_extension_pack_es31a on Gen9+. AEP requires ASTC, which is currently only enabled on Skylake and later. (It may be possible to extend this to Cherryview/Braswell in the future, but earlier hardware doesn't have ASTC support.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-14 12:16:25 -07:00
Kenneth Graunke	2d8a3fa7ea	nir: Report progress from nir_lower_phis_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:51 -07:00
Kenneth Graunke	32630e211e	nir: Report progress from nir_lower_alu_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:49 -07:00
Kenneth Graunke	e6eed3533e	nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar(). This is mandatory. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:39 -07:00
Rob Clark	bff90aedf1	nir/lower_tex: fix typo with sample_dim Numeric 2 is actually GLSL_SAMPLER_DIM_3D, which I don't think is what was intended. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	1a8424ceba	nir: move tex_instr_remove_src I want to re-use this in a different pass, so move to nir.h Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	2c3f966276	nir/lower_tex: remove tex_instr_find_src() Turns out it already exists.. so don't duplicate it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Kyle Brenneman	7206b3a556	egl: Add storage for EGL_KHR_debug's state to EGL objects Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	1d535c1e83	egl: Factor out _eglGetSyncAttribCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	5b0b844ac9	egl: Factor out _eglWaitSyncCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	9a992038e7	egl: Lock the display in _eglCreateSync's callers Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	58338c6b65	egl: Factor out _eglCreateImageCommon (v2) v2: - Pass disp to RETURN_EGL_ERROR so we unlock the display Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	82a2e2cb50	egl: Factor out _eglWaitClientCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	8cc3d9855f	egl: Use _eglCreatePixmapSurfaceCommon consistently This moves the native pixmap fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	7d7ae5e1c3	egl: Use _eglCreateWindowSurfaceCommon consistently This moves the native window fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	017946b724	egl: Factor out _eglGetPlatformDisplayCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	fe6ffa79be	egl: Fix typo Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Adam Jackson	e2c067d256	egl: Tear down images and syncs at eglTerminate Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	6e50f12b04	egl: Update eglext.h (v2) Updated eglext.h to revision 33111 from the Khronos repository. v2: - Don't (re)move extension includes from eglext.h (Emil Velikov) - Bump to revision 33111 (Adam Jackson) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Brendan King	95f3e5861c	configure.ac: fix the name of the Wayland Scanner pc file The Wayland Scanner pkg-config file is called wayland-scanner.pc. Fixes: `153539bd9d` ("configure: rework wayland_scanner handling (fix make distcheck)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Brendan King <Brendan.King@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:38:30 +01:00
Eric Engestrom	4bb9efb592	gbm: remove left-over array `e7c8c85785` ("gbm: Removed unused function.") forgot to remove the global array used only by that function. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:37:34 +01:00
Martina Kollarova	2527e18eeb	gallium: fix return value check A possible error (-1) was being lost because it was first converted to an unsigned int and only then checked. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Martina Kollarova <martina.kollarova@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-14 14:36:43 +01:00
Marek Olšák	ab29788250	radeonsi: reload PS inputs with direct indexing at each use (v2) The LLVM compiler can CSE interp intrinsics thanks to LLVMReadNoneAttribute. 26011 shaders in 14651 tests Totals: SGPRS: 1146340 -> 1132676 (-1.19 %) VGPRS: 727371 -> 711730 (-2.15 %) Spilled SGPRs: 2218 -> 2078 (-6.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35841268 -> 36009732 (0.47 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222559 -> 224779 (1.00 %) Wait states: 0 -> 0 (0.00 %) v2: don't call load_input for fragment shaders in emit_declaration Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:33:00 +02:00
Marek Olšák	007b512f9d	radeonsi: get rid of constant buffer preloading 26011 shaders in 14651 tests Totals: SGPRS: 1152636 -> 1146340 (-0.55 %) VGPRS: 728198 -> 727371 (-0.11 %) Spilled SGPRs: 3776 -> 2218 (-41.26 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35835152 -> 35841268 (0.02 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222372 -> 222559 (0.08 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	16be87c904	radeonsi: get rid of img/buf/sampler descriptor preloading (v2) 26011 shaders in 14651 tests Totals: SGPRS: 1251920 -> 1152636 (-7.93 %) VGPRS: 728421 -> 728198 (-0.03 %) Spilled SGPRs: 16644 -> 3776 (-77.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 36001064 -> 35835152 (-0.46 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222221 -> 222372 (0.07 %) Wait states: 0 -> 0 (0.00 %) v2: merge codepaths where possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Marek Olšák	22797d7d83	radeonsi: rename get_sampler_desc -> load_sampler_desc Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	5f0a8fbcc8	radeonsi: cosmetic changes in si_shader.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	afaf27bff3	radeonsi: load streamout buffer descriptors before use (v2) v2: inline the code and remove the conditional that's a no-op now Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Eric Anholt	f597ac3966	vc4: Implement job shuffling Track rendering to each FBO independently and flush rendering only when necessary. This lets us avoid the overhead of storing and loading the frame when an application momentarily switches to rendering to some other texture in order to continue rendering the main scene. Improves glmark -b desktop:effect=shadow:windows=4 by 27% Improves glmark -b desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4 by 17% While I haven't tested other apps, this should help X rendering a lot, and I've heard GLBenchmark needed it too.	2016-09-14 06:25:41 +01:00
Eric Anholt	f473348468	vc4: Handle resolve skipping at job submit time. This is done in vc4_flush currently, but I'm going to make the job always track the surfaces it might be rendering to instead of putting in the destinations at flush time.	2016-09-14 06:08:03 +01:00
Eric Anholt	9688166bd9	vc4: Move the render job state into a separate structure. This is a preparation step for having multiple jobs being queued up at the same time.	2016-09-14 06:08:03 +01:00
Eric Anholt	c31a7f529f	vc4: Always unref the current job surfaces at job reset time. Drops some tricky logic in vc4_flush() trying to update the pointers, and fixes a broken lack of unref for MSAA surfaces at context destroy time.	2016-09-14 06:08:03 +01:00
Eric Anholt	774a556b6d	vc4: Move job-submit skip cases to vc4_job_submit(). For calling job_submit() directly, I need the skipping here.	2016-09-14 06:08:03 +01:00
Eric Anholt	0ef1b32ebb	vc4: Move bin CL trailer to job_submit() time. To implement job shuffling, I want to be able to call submit() on specific jobs, turning vc4_flush() into the context's flush-all-jobs hook.	2016-09-14 06:08:03 +01:00
Eric Anholt	a2014c2eb9	vc4: Simplify the DISCARD_RANGE handling It's really just an upgrade to attempting WHOLE_RESOURCE. Pulling the logic out caught two bugs in it: We would try to do so on cubemaps (even though we're only mapping 1 of the 6 slices), and we would break persistent coherent mappings by trying to reallocate when we shouldn't.	2016-09-14 06:08:03 +01:00
Eric Anholt	21a27ad956	vc4: Fix incorrect clearing of Z/stencil when cleared separately. The clear of Z or stencil will end up clearing the other as well, instead of masking. There's no way around this that I know of, so if we are clearing just one then we need to draw a quad. Fixes a regression in the job-shuffling code, where the clear values move to the job and don't just have the last clear's value lying around when you do glClear(DEPTH) and then glClear(STENCIL) separately (ext_framebuffer_multisample-clear 4 depth). This causes regressions in ext_framebuffer_multisample/multisample-blit depth and ext_framebuffer_multisample/no-color depth, but these were formerly false positives due to the reference image also being black. Now the reference and test images are both being drawn, and it looks like there's an incorrect resolve of depth during blitting to an MSAA FBO.	2016-09-14 06:08:03 +01:00
Ilia Mirkin	89a49af31e	glsl: add core plumbing for GL_ANDROID_extension_pack_es31a Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:55 -04:00
Ilia Mirkin	83116d084f	mesa: introduce glPrimitiveBoundingBoxARB entrypoint This requires a bit of rejiggering, since normally ES entrypoints alias core ones, not vice-versa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:50 -04:00
Ilia Mirkin	a69dc2c412	mesa: add a GLES3.2 enums section, and expose new MS line width params This also exposes them for ARB_ES3_2_compatibility. While both specs refer to the new MS line width parameters being separate from the existing AA line widths, reality begs to differ. It's the same on all hardware currently supported by mesa. Should hardware come along that wants these to be different, they're easy enough to separate out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:47 -04:00
Sirisha Gandikota	aa7b410592	aubinator: Remove bogus "end" parameter in gen_disasm_disassemble() Earlier, the loop pretends to loop over instructions from "start" to "end", but the callers always pass 8192 for end, which is some huge bogus value. The real loop termination condition is send-with-EOT or 0. (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:42 -07:00
Sirisha Gandikota	1ab92d80a8	aubinator: Make gen_disasm_disassemble handle split sends Skylake adds new SENDS and SENDSC opcodes, which should be handled in the send-with-EOT check. Make an is_send() helper that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken) v2: Make is_send() much crisper, mix declarations and code to make the code compact (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:39 -07:00
Sirisha Gandikota	5d2440532f	aubinator: Simplify print_dword_val() method Remove the float/dword union and use the iter->p[f->start / 32] directly as printf formatter %08x expects uint32_t (Ken) v2: Make the cleanup much crisper (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 16:32:24 -07:00
Jason Ekstrand	1eebb60917	anv/image: Set correct base_array_layer and array_len for storage images Since Vulkan doesn't allow single-slice 3D storage images, we need to just set the base_array_layer and array_len to the full size of the 3-D LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 14:45:49 -07:00
Jason Ekstrand	106709db7b	Revert "intel/isl: Ignore base_array_layer and array_len for 3D storage..." This reverts commit `3943888c94`. It turns out that commit was pretty-much bogus since it breaks binding a 3-D texture as a 2-D storage image. The correct fix for the Vulkan CTS tests needs to be in the Vulkan driver itself rather than ISL. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 14:45:15 -07:00
Jason Ekstrand	330104464f	anv: Use blorp for doing MSAA resolves Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	6bcb1f753e	anv: Use blorp for ClearColorImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	57e87862eb	anv: Delete meta_blit2d Everything that we were once using the blit2d framework for is now done with blorp. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	36286ccb96	anv/blorp: Add a gcd_pow2_u64 helper and use it for buffer alignments This is a lot cleaner and easier to read than the old piles of if statements. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	af5d30de55	anv: Use blorp for CopyBuffer and UpdateBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:13 -07:00
Jason Ekstrand	0f1ca5407a	anv: Use blorp for CopyImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	58593f24cb	anv: Use blorp for CopyBufferToImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	f07f44a5bc	anv: Use blorp for CopyImageToBuffer Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	9f44745eca	anv: Use blorp to implement VkBlitImage Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	52fa3e8347	anv: Make image_get_surface_for_aspect_mask const Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	8f780af968	anv: Add initial blorp support Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	1fe8bf82b2	intel/anv: Use #defines for all __gen_ helpers This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	4a6c9e20b8	anv: Generalize emit_urb_setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	8cb144bd93	anv/pipeline: Roll compute_urb_partition into emit_urb_setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	823ab83432	intel/blorp: Use #defines for all __gen_ helpers This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	c0b9776cd6	intel/isl: Divide QPitch by 2 for 3-D stencil textures on SKL+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	00e79cec99	isl/state: Don't set QPitch for GEN4_3D surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 12:40:12 -07:00
Jason Ekstrand	cb780c9ccf	intel/blorp: Rework alloc_binding_table The original blorp_alloc_binding_table helper was supposed to return the binding table offset and map along with the surface state maps. This isn't quite what we want, however. What we really want is the binding table offsets, surface state offsets, and surface state maps. In the GL driver, the binding table map is an array of surface state offsets. However, in Vulkan, this isn't quite true as the entries in the binding table are surface state offsets combined with another binding table block offset. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-13 12:40:11 -07:00
Marek Olšák	524fd55d2d	tgsi/scan: don't set interp flags for inputs only used by INTERP instructions radeonsi depends on the interp flags a little bit too much. This fixes 9 randomly failing tests: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_centroid.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	15a127bc2c	radeonsi: fix FP64 UBO loads with indirect uniform block indexing No known tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	35d284d08e	winsys/amdgpu: don't assume GTT if the VRAM flag isn't set Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	6df872df59	radeonsi: clean up CP DMA emit code Unify the clear and copy paths, clean up the definitions. It looks more like a rework. It's a preparation for GDS support, which might or might not come. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	84860dd0bb	radeonsi: print the IB and buffer list in VM fault reports This is a fallout from reworking the debug flags. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	fd69fa65a8	radeonsi: add sampler view BOs to the BO list last If si_sampler_view_add_buffer ends up flushing, then the code in begin_new_cs would previously have added the buffer(s) for whatever was previously bound to that slot. Now it would add only the new buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	275c073c6a	radeonsi: export SampleMask from pixel shaders at full rate Heaven and Valley write gl_SampleMask and not Z. Use 16_ABGR instead of 32_ABGR if Z isn't written. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	b89854b0c7	gallium/radeon: set new r600_resource fields correctly in other places too This was missed in: commit `0d2e43fcb1` Author: Marek Olšák <marek.olsak@amd.com> Date: Thu Aug 18 16:30:00 2016 +0200 gallium/radeon: derive buffer placement and flags only at initialization Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	c723acc03d	ddebug: dump shader buffers and images this was unimplemented Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Marek Olšák	fdd457c89f	ddebug: fix a crash in resource_get_handle broken recently Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-13 20:38:25 +02:00
Jan Vesely	b671909d27	radeon: Don't check DCC on pipe buffers Fixes segfaults in EG compute since: commit `21de3be8e6` radeonsi: fix texture format reinterpretation with DCC Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-13 14:23:26 -04:00
Andy Furniss	304f70536a	vl/util: Fix YV12/I420 convert to NV12 U/V reversal Fix VAAPI YV12/I420 convert to NV12 U/V reversal. Input order is YVU when this is called. Signed-off-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-09-13 13:58:40 -04:00
Jason Ekstrand	6ac469a6c3	anv/allocator: Use VG_NOACCESS_WRITE in anv_bo_pool_free Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came later on in the function to prevent "link->bo = bo" from causing an invalid write. However, in the case where the size requested by the user is very small (less than sizeof(struct anv_bo)), this isn't sufficient. Instead, we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE. We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in because it may be stored in the bo itself. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-13 10:44:03 -07:00
Jason Ekstrand	3943888c94	intel/isl: Ignore base_array_layer and array_len for 3D storage surfaces The time we want to restrict the Z range of a 3-D surface is when rendering to it. For storage surfaces, we always want the full range. However, we still need to set MinimumArrayElement and RenderTargetViewExtent to sensible values so we'll just set them to the reasonable defaults we used before we started respecting the base_array_layer and array_len. This fixes a bunch of Vulkan CTS regressions caused by `48f195d7c6`. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97790 Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-13 10:43:21 -07:00
Jose Fonseca	62affedbed	appveyor: Update winflexbison download URL. This particular version got moved into a `old_versions` subdirectory.	2016-09-13 17:54:51 +01:00
Jason Ekstrand	a1e49be713	i965: Use blorp_copy for all copy_image operations on gen6+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	540395bf9b	i965/blorp: Add a copy_miptrees helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	d038adca0e	intel/isl: Add support for RGB formats in X and Y-tiled memory Normally, using a non-linear tiling format helps improve cache locality by ensuring that neighboring pixels are usually close-by in memory. For RGB formats, this still sort-of holds, but it can also lead to rather terrible memory access patterns where a single RGB pixel value crosses a tile boundary and gets split into two pieces in different 4K pages. It also makes for some rather awkward calculations because your tile size is no longer an even multiple of surface element size. For these reasons, we chose to simply never create tiled RGB images in the Vulkan driver. The GL driver, however, is not so kind so we need to support it somehow. I briefly toyed with a couple of different schemes but this is the best one I could come up with. The fundamental problem is that a tile no longer contains an integer number of surface elements. I briefly considered a couple other options but found them wanting: 1) Using floats for the logical tile size. This leads to potential rounding error problems. 2) When presented with a RGB format, just make the tile 3-times as wide. This isn't so nice because now our tiles are no longer power-of-two size. Also, it can force the row_pitch to be larger than needed which, while not strictly a problem for ISL, causes incompatibility problems with the way the GL driver chooses surface pitches. The chosen method requires that you pay attention and not just assume that your tile_info is in the units you think it is. However, it's nice because it provides a nice "these are the units" declaration in isl_tile_info itself. Previously, the tile_info wasn't usable as a stand-alone structure because you had to also know the format. 
It also forces figuring out how to deal with inconsistencies between tiling and format back to the caller which is good because the two different consumers of isl_tile_info really want to deal with it differently: Computation of the surface size wants the fewest number of horizontal tiles possible while get_intratile_offset is far more concerned with things aligning nicely. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	883086500b	intel/isl: Allow valign2 for texture-only Y-tiled surfaces on gen7 The restriction that Y-tiled surfaces must have valign == 4 only applies to render targets but we were applying it universally. This causes problems if ISL_FORMAT_R32G32B32_FLOAT is used because it requires valign == 2; this should be okay because you can't render to that format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 19:44:05 -07:00
Jason Ekstrand	54db5afd2c	intel/blorp: Work in terms of logical array layers When Ivy Bridge introduced array multisampling, someone made the decision to do lots of stuff throughout the driver in terms of physical array layers rather than logical array layers. In ISL, we use logical array layers most of the time and it really makes no sense to use physical array layers in the blorp API. Every time someone passes physical array layers into blorp for an array multisampled surface, they're always divisible by the number of samples and we divide right away. Eventually, I'd like to rework most of the GL driver internals to use logical array layers but that's going to be a big project and will probably happen as part of the ISL conversion. For now, we'll do the conversion in brw_blorp and let blorp just use the logical layers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
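The physical-to-logical conversion described in the entry above can be sketched as follows. This is a minimal illustration, not the brw_blorp code: the helper name `physical_to_logical_layer` is hypothetical, and it assumes physical layers are laid out as `num_samples` consecutive slices per logical layer.

```c
#include <assert.h>

/* Hypothetical helper (not a Mesa function): convert a physical array
 * layer index to a logical one, assuming each logical layer occupies
 * num_samples consecutive physical layers. */
static unsigned
physical_to_logical_layer(unsigned physical_layer, unsigned num_samples)
{
   assert(num_samples > 0);
   /* The commit message notes that physical layers passed in for array
    * multisampled surfaces are always divisible by the sample count. */
   assert(physical_layer % num_samples == 0);
   return physical_layer / num_samples;
}
```

For a single-sampled surface (`num_samples == 1`) the two numbering schemes coincide, so the helper returns its input unchanged.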
Jason Ekstrand	fa4627149d	intel/blorp: Increase the precision of coordinate transform calculations The result of this calculation goes into an fma() in the shader and we would like it to be as precise as possible. The division in particular was a source of imprecision whenever dst1 - dst0 was not a power of two. This prevents regressions in some of the new Vulkan CTS tests for blitting using a filtering of NEAREST. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 19:42:57 -07:00
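One way to read the entry above: do the division and the offset arithmetic in double precision and round to float only once, just before the values are handed to the shader's fma(). The sketch below assumes the transform is a plain scale plus offset that samples at destination pixel centers; the struct and function names are hypothetical, and mirroring is left out.

```c
/* Hypothetical types and names; a sketch, not the blorp implementation. */
struct coord_transform {
   float multiplier;
   float offset;
};

static struct coord_transform
setup_coord_transform(float src0, float src1, float dst0, float dst1)
{
   /* Divide in double so a non-power-of-two (dst1 - dst0) loses as
    * little precision as possible. */
   double scale = ((double)src1 - (double)src0) /
                  ((double)dst1 - (double)dst0);

   struct coord_transform xform;
   xform.multiplier = (float)scale;
   /* The 0.5 term samples at the center of the destination pixel. */
   xform.offset = (float)((double)src0 + (0.5 - (double)dst0) * scale);
   return xform;
}
```

The shader would then compute `src = fma(dst, multiplier, offset)` for an integer destination coordinate `dst`.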
Jason Ekstrand	c70be1ead5	intel/blorp: Add a swizzle parameter to blorp_clear While we're here, we also re-arrange the parameters to better match the parameter order of blorp_blit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ea1399aba0	intel/blorp: Make color_write_disable const and optional Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	9286f62f11	intel/blorp: Add support for clearing R9G9B9E5 surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ab03e59867	intel/blorp: Add support for RGB destinations in copies Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	5ae8043fed	intel/blorp: Add an entrypoint for doing bit-for-bit copies Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	941b4d063a	intel/blorp: Pull the guts of blorp_blit into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	4e03edf189	intel/blorp: Stop using the X/YOffset field of RENDER_SURFACE_STATE While it can be useful, the field has substantial limitations. In particular, the bottom 2 or 3 bits are missing so your offset always has to be a multiple of 4 or 8. While surface alignments usually work out to make this ok, when you start trying to fake compressed surfaces as uncompressed (which we will want to do) this falls apart. The easiest solution is to simply align all offsets to a tile boundary and munge the regions we're copying to account for the intratile offset. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
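The approach in the entry above, aligning the surface offset down to a tile boundary and shifting the copy region by the remainder, can be sketched as follows. All names here are hypothetical, and the sketch works on one axis in surface elements; it is not the blorp code.

```c
#include <stdint.h>

/* Hypothetical types: an offset split into a tile-aligned base and the
 * leftover intratile part, and a copy rectangle. */
struct tile_split {
   uint32_t base;       /* offset rounded down to a tile boundary */
   uint32_t intratile;  /* remainder inside the tile */
};

struct rect {
   uint32_t x0, y0, x1, y1;
};

static struct tile_split
split_offset(uint32_t offset_el, uint32_t tile_extent_el)
{
   struct tile_split s;
   s.intratile = offset_el % tile_extent_el;
   s.base = offset_el - s.intratile;
   return s;
}

/* Shift the region so it still addresses the same texels once the
 * surface base has been moved back to the tile boundary. */
static struct rect
shift_rect(struct rect r, uint32_t dx, uint32_t dy)
{
   r.x0 += dx; r.x1 += dx;
   r.y0 += dy; r.y1 += dy;
   return r;
}
```

With this split, the surface state only ever sees tile-aligned offsets, and the intratile part lives entirely in the copy rectangle.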
Jason Ekstrand	c170606fc6	intel/blorp: Use fake_interleaved_msaa in retile_w_to_y Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	a613449f71	intel/blorp: Use isl_get_interleaved_msaa_px_size_sa Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	8ac99eabb6	intel/isl: Add a helper for getting the size of an interleaved pixel Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	3cc15ba5bb	intel/blorp: Handle 3D surfaces in convert_to_single_slice Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	43d25edf78	intel/isl: Fix an assert in get_intratile_offset_sa Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	6da968b651	intel/blorp: Fix the early return condition in convert_to_single_slice The convert_to_single_slice operation is mostly idempotent. The only non-repeatable thing it does is that, when it sets the intratile offset fields, it just overwrites them instead of doing a += operation. This is supposed to be ok because we have an early return at the top that should make it bail if the surface is already a single slice. Unfortunately, the if condition has been broken ever since it was first added in `96fa98c18`. This commit fixes the condition and adds an assert to ensure we don't stomp any non-zero intratile offsets. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	ec7e0d62c5	intel/blorp: Use the surface format for computing offsets If we use the view format, it may be an uncompressed view of a compressed image which throws things off. Since we're computing offsets of images, we want the actual surface offset anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	7f2fecd114	intel/blorp: Don't assume R8_UINT in convert_to_single_slice We're going to use it for more than just stencil textures Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	2fc9c7e3d9	intel/blorp: Take a destination swizzle in blorp_blit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	2dba5489ae	intel/blorp: Take an isl_swizzle instead of a SWIZZLE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Jason Ekstrand	7ddb21708c	intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles This should be more compact than the enum isl_channel_select[4] that we were using before. It's also very convenient because we already had such a structure in the Vulkan driver we just needed to pull it over. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-09-12 19:42:57 -07:00
Kenneth Graunke	376d1dc2f1	docs: Add OES_tessellation_shader to the release notes.	2016-09-12 17:24:35 -07:00
Kenneth Graunke	049cee2c16	docs: Mark OES_tessellation_shader as done.	2016-09-12 17:23:20 -07:00
Ilia Mirkin	742832434a	st/mesa: fix is_scissor_enabled when X/Y are negative Similar to commit `49c24d8a24` ("i965: fix noop_scissor range issue on width/height") - take the X/Y into account to determine whether the scissor covers the whole area or not. Fixes the recently-added gl-1.0-scissor-depth-clear-negative-xy piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-09-12 20:07:21 -04:00
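The check described in the entry above can be sketched as follows: a scissor only has an effect if it fails to cover the whole framebuffer, and with a negative origin the width and height alone are not enough to decide that. The struct and function names below are hypothetical; this is an illustration of the idea, not the st/mesa source.

```c
#include <stdbool.h>

/* Hypothetical scissor rectangle; x and y may be negative. */
struct scissor_rect {
   int x, y, width, height;
};

/* Returns true if the scissor clips away any part of a framebuffer of
 * the given size, taking the origin into account. */
static bool
scissor_clips_framebuffer(const struct scissor_rect *s,
                          int fb_width, int fb_height)
{
   return s->x > 0 ||
          s->y > 0 ||
          (long long)s->x + s->width < fb_width ||
          (long long)s->y + s->height < fb_height;
}
```

A 20x20 scissor placed at (-10, -10) on a 20x20 framebuffer only reaches pixel 10 on each axis, so it clips even though its size matches the framebuffer.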
Mauro Rossi	6b9d7e69ee	android: add support for libmesa_amdgpu_addrlib Android porting of the following commits: `f1f1ba3` "radeonsi: move sid.h/r600d_common.h to a common place." `69fca64` "amd/addrlib: move addrlib from amdgpu winsys to common code" This patch fixes android building errors Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-13 10:06:04 +10:00
Dave Airlie	0fe9152868	u_endian: add android to glibc clause Tested-by: Mauro Rossi <issor.oruam@gmail.com>	2016-09-13 10:04:13 +10:00
Jason Ekstrand	24be630660	Revert "i965: Drop the maximum 3D texture size to 512 on Sandy Bridge" This reverts commit `6ba88bce64`. The commit was erroneous because GL has a separate limit, GL_MAX_FRAMEBUFFER_LAYERS which guards the number of layers you are allowed to render into. The GL 4.5 spec says: "The framebuffer attachment point attachment is said to be framebuffer attachment complete if [...] all of the following conditions are true: [...] If image is a three-dimensional, one- or two-dimensional array, or cube map array texture and the attachment is layered, the depth or layer count of the texture is less than or equal to the value of the implementation-dependent limit MAX_FRAMEBUFFER_LAYERS." and goes on to say that "framebuffer complete" requires all attachments to be "framebuffer attachment complete". On Sandy Bridge, we set GL_MAX_FRAMEBUFFER_LAYERS to 512 so creating a 3D texture bigger than 512 is fine; you just can't render into all of the slices at once. Fixes ES3-CTS.gtf.GL3Tests.npot_textures.npot_tex_image on Sandy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-09-12 16:52:10 -07:00
Jason Ekstrand	2519237c24	intel/blorp: Handle the 512 layers restriction on Sandy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:48:56 -07:00
Jason Ekstrand	48f195d7c6	intel/isl: Treat 3-D textures as 2-D arrays for rendering In particular, this means that isl_view::base_array_layer and isl_view::array_len get applied to 3-D textures but only when rendering. We were already applying isl_view::base_array_layer for rendering into 3-D textures so this isn't a huge deviation. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:48:56 -07:00
Sirisha Gandikota	63fe9ab894	aubinator: Simplify gen_disasm_create()'s devinfo handling Copy the whole devinfo structure instead of just a few fields (Ken). Earlier, only a couple of fields were copied, which added more code. So, simplify the code by copying the whole structure. Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:20:04 -07:00
Sirisha Gandikota	d2869c95fb	aubinator: Fix compiler warning Add 'const' qualifier to gen_field_iterator::p pointer (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-12 16:19:56 -07:00
Julien Isorce	bf901a2f8c	st/va: also honor interlaced preference when providing a video format This fixes a crash when using the preferred video format with vaapisink on Nvidia hardware. Also caught by the following assert: nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed. TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! vaapisink Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com> Tested-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-12 22:17:40 +01:00
Samuel Pitoiset	3f3640c86c	tgsi: document semantics for compute shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-12 22:15:10 +02:00
Kenneth Graunke	54138af1cd	mesa: Enable OES/EXT_tessellation_shader for ES 3.1 + ARB_tess drivers. Drivers which support ARB_tessellation_shader and ES 3.1 now will expose OES_tessellation_shader and EXT_tessellation_shader as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-12 13:07:38 -07:00
Marek Olšák	546bc07349	radeonsi: don't preload constants at the beginning of shaders LLVM can CSE the loads, thus we can always re-load constants before each use. The decrease in SGPR spilling is huge. The best improvements are the dumbest ones. 26011 shaders in 14651 tests Totals: SGPRS: 1453346 -> 1251920 (-13.86 %) VGPRS: 742576 -> 728421 (-1.91 %) Spilled SGPRs: 52298 -> 16644 (-68.17 %) Spilled VGPRs: 397 -> 369 (-7.05 %) Scratch VGPRs: 1372 -> 1344 (-2.04 %) dwords per thread Code Size: 36136488 -> 36001064 (-0.37 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 219315 -> 222221 (1.33 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-12 21:06:57 +02:00
Jason Ekstrand	e2fb044115	intel/blorp: Add a TODO file This provides a nice little place to share notes on what still needs to be done and/or would be nice to have in BLORP. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 10:14:49 -07:00
Alejandro Piñeiro	6165603209	i965: check for GL_TEXTURE_EXTERNAL_OES at miptree_create_for_teximage Forgotten on commit "i965: Fix calculation of the image height at start level". Thanks to Ilia Mirkin for pointing it out. Fixes the following regressions on Haswell and Broadwell: ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimpleUnassociated (crash back to pass) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestSimple (crash back to fail) ES2-CTS.gtf.GL2ExtensionTests.egl_image_external.TestVertexShader (crash back to fail) https://bugs.freedesktop.org/show_bug.cgi?id=97761 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 18:10:50 +02:00
Chuanbo Weng	9a1eb54237	gbm: fix potential NULL deref of mapImage/unmapImage. The mapImage/unmapImage functions of DRIimage extension can be NULL, so we should add additional check for them. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-12 16:52:55 +01:00
Emil Velikov	63faf7de61	Remove GL_GLEXT_PROTOTYPES guards from non-ext headers. An earlier sync with the Khronos headers added _extension_ prototype guards to all the GLES2/3/31/32 core entry points, effectively breaking all the applications that aim to be portable and do not set the define. The issue has been reported to Khronos (internal bugzilla #14206) and is being worked on. Until updated/fixed headers are released, fix the issue locally. The following report is when building weston. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97773 Cc: Armin Krezović <krezovic.armin@gmail.com> Cc: Emmanuel Gil Peyrot <emmanuel.peyrot@collabora.com> Cc: Pekka Paalanen <ppaalanen@gmail.com> Fixes: `6a5504de2f` ("Update Khronos-supplied headers to r33100") Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-12 16:52:43 +01:00
Emil Velikov	ceaa2e1738	aubinator: rework print_help() Rather than using platform specific methods to retrieve the program name pass it explicitly. The function is called directly from main(). Similarly - basename comes in two versions POSIX (can modify string, always pass a copy) and GNU (never modifies the string). Just printout the complete program name, esp. since the program is not meant to be installed. Thus using $basename is unlikely to work, not to mention it is misleading. Reported-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jonathan Gray <jsg@jsg.id.au>	2016-09-12 16:49:59 +01:00
Adam Jackson	0cb1428fbb	docs: Note MESA_configless_context as superseded Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:29:11 -04:00
Adam Jackson	d9f5b1915b	egl: Rename MESA_configless_context bit to KHR_no_config_context Keep the old name in the extension string, but refer to the KHR extension internally. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:29:09 -04:00
Adam Jackson	cc45a5c308	egl: QueryContext on a configless context returns zero MESA_configless_context does not specify the interaction with QueryContext at all, and the code to generate an error in this case predates the Mesa extension. Since EGL_NO_CONFIG_{KHR,MESA} are numerically identical there's no way to distinguish which one the application asked for, so use the KHR behaviour. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-12 11:28:38 -04:00
Boyuan Zhang	e5009b7c26	st/va: enable vbr rate control for vaapi encode This patch enables variable bit-rate for vaapi encoding. According to va.h, the target bit-rate equals the maximum bit-rate multiplied by target_percentage. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-12 10:34:53 -04:00
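The rate-control arithmetic in the entry above can be sketched as follows. The function name is hypothetical, and the sketch assumes `target_percentage` is expressed out of 100 (so 95 means 95% of the maximum), which is how I understand the va.h description; treat that as an assumption rather than a statement about the st/va code.

```c
#include <stdint.h>

/* Hypothetical helper: derive the VBR target bit-rate from the maximum
 * bit-rate and a percentage in the range 0..100 (assumed unit). */
static uint32_t
vbr_target_bitrate(uint32_t max_bits_per_second, uint32_t target_percentage)
{
   /* Widen to 64 bits so the intermediate product cannot overflow. */
   return (uint32_t)((uint64_t)max_bits_per_second * target_percentage / 100);
}
```

For example, a 10 Mbit/s maximum with a target percentage of 95 gives a 9.5 Mbit/s target.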
Leo Liu	6a7f79af9b	vl/rbsp: match initial escaped bits with valid in the buffer Otherwise the check for the three byte will not make sense. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-09-12 10:09:27 -04:00
Timothy Arceri	2da15a3b89	egl: fix gcc warning braces around scalar initializer Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-12 22:43:49 +10:00
Nicolai Hähnle	b8703e363c	winsys/radeon: rename nrelocs, crelocs to max_relocs, num_relocs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:55 +02:00
Nicolai Hähnle	d66bbfbede	winsys/radeon: don't pre-allocate the relocations array It's really not necessary. Switch to an exponential resizing strategy. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:53 +02:00
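An exponential resizing strategy of the kind mentioned in the entry above can be sketched as follows: start with no allocation and double the capacity whenever the array is full. The field names `num_relocs` and `max_relocs` follow the rename in the neighbouring commit; the struct, the element type, the function name, and the initial capacity of 16 are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container; the real winsys structure holds more state. */
struct reloc_list {
   unsigned *relocs;
   unsigned num_relocs;  /* entries in use */
   unsigned max_relocs;  /* entries allocated */
};

/* Append one entry, doubling the allocation when it is full.
 * Returns false and leaves the list unchanged if allocation fails. */
static bool
reloc_list_add(struct reloc_list *list, unsigned value)
{
   if (list->num_relocs >= list->max_relocs) {
      unsigned new_max = list->max_relocs ? list->max_relocs * 2 : 16;
      unsigned *new_relocs =
         realloc(list->relocs, new_max * sizeof(*new_relocs));
      if (!new_relocs)
         return false;
      list->relocs = new_relocs;
      list->max_relocs = new_max;
   }
   list->relocs[list->num_relocs++] = value;
   return true;
}
```

Because the capacity doubles, the amortized cost of an append stays constant, and a zero-initialized list needs no up-front allocation.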
Nicolai Hähnle	f47da2e34f	winsys/radeon: remove unused radeon_cs_context::priority_usage Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:51 +02:00
Nicolai Hähnle	17fff0c2de	winsys/amdgpu: remove amdgpu_cs_lookup_buffer The radeonsi driver doesn't and shouldn't care about the buffer index. Only the virtual addresses matter. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:47 +02:00
Nicolai Hähnle	12657a7abf	winsys/amdgpu: remove unused field domains from amdgpu_cs_buffer Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:07 +02:00
Nicolai Hähnle	3cdeb2a177	winsys/amdgpu: remove initial buffer list allocation It's really not necessary. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:04 +02:00
Nicolai Hähnle	cc53dfda9f	winsys/amdgpu: extract adding a new buffer list entry into its own function While at it, try to be a little more robust in the face of memory allocation failure. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:55:01 +02:00
Nicolai Hähnle	11cbf4d7ae	winsys/amdgpu: use only one fence per BO The fence that is added to the BO during flush is guaranteed to be signaled after all the fences that were in the fences array of the BO before the flush, because those fences are added as dependencies for the submission (and all this happens atomically under the bo_fence_lock). Therefore, keeping only the last fence around is sufficient. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:59 +02:00
Nicolai Hähnle	480ac143df	winsys/amdgpu: add do_winsys_deinit function The idea is to have matching init/deinit functions so that deinit can be re-used for cleanup in the error path of amdgpu_winsys_create. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:56 +02:00
Nicolai Hähnle	9fb8d354ca	winsys/amdgpu: clean up error paths in amdgpu_winsys_create No need to call pb_cache_deinit, because the cache hasn't been initialized at that point. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:53 +02:00
Nicolai Hähnle	a6c38d47d4	gallium/radeon: page alignment for buffers is unnecessary In some places (e.g. shader program pointers) we require 256 bytes alignment. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:45 +02:00
Nicolai Hähnle	339867c077	gallium/radeon/winsyses: remove #includes of pb_bufmgr.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-12 13:54:36 +02:00
Topi Pohjolainen	e54b70b3d4	i965/rbc: Clarify rationale given for shader image resolves Original commit added documentation explaining lossless compression case: commit `56f29911ec` Author: Topi Pohjolainen <topi.pohjolainen@intel.com> Date: Tue Feb 2 10:00:41 2016 +0200 i965: Add a flag telling color resolve pass to ignore CCS_E It, however, easily gives the impression that the sole purpose of the intel_miptree_resolve_color() is to address lossless compression. Original intention is to document the lack of INTEL_MIPTREE_IGNORE_CCS_E flag given for the resolve call. This patch fixes this along with a typo spotted further down. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	1df4b666ed	i965/blorp: Use hw generated primitive copies for layered clears Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	b712aa2614	i965/blorp: Sanity check all layers before actual clear Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:30 +03:00
Topi Pohjolainen	a1c7de09dc	intel/blorp: Add plumbing for setting color clear layer count Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	514afdce95	intel/blorp: Allow multiple layers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	e597821ef2	i965/blorp: Instruct vertex fetcher to provide prim instance id This will indicate target layer (Render Target Array Index) needed for layered clears. v2: Use 3DSTATE_VF_SGVS for gen8+ Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	39712b2a14	i965/rbc: Allocate mcs directly such as we do for compressed msaa. In case of non-compressed single sampled buffers the allocation of mcs is deferred until there is actually a clear operation that needs the mcs. In case of render buffer compression the mcs buffer is always needed and there is no real reason to defer the allocation. Doing it directly allows dropping quite a bit of unnecessary complexity. Patch leaves brw_predraw_set_aux_buffers() a no-op. Subsequent patches will re-use it and it seemed cleaner to leave it instead of removing and re-introducing. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	024a39511f	isl/gen8+: Allow 1D and 3D auxiliary surfaces Otherwise once mcs buffer gets allocated without delay for lossless compression (same as we do for msaa), assert starts to fire in piglit case: tex3d. The test uses depth of one which is in fact supported even now. v2 (Jason): Allow also 1D case as there is nothing in the specs constraining it either. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	6939532593	i965: Add sanity check for non-compressible texture views v2: Fix missing inline declaration Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:48:29 +03:00
Topi Pohjolainen	1b6fcc08df	i965/rbc: Consult rb settings for texture surface setup Once mcs buffer gets allocated without delay for lossless compression (same as we do for msaa), one gets regression in: GL45-CTS.texture_barrier_ARB.same-texel-rw Setting the auxiliary surface for both sampling engine and data port seems to fix this. I haven't found any hardware documentation backing this though. v2 (Jason): Prepare also for the case where surface is sampled with non-compressible format forcing also rendering without compression. v3: Split asserts and decision making. v4: Detailed comment provided by Jason explaining the need for using auxiliary buffer for texturing when the same surface is also used as render target. Added check for existence of renderbuffer before considering if underlying miptree matches. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 11:46:13 +03:00
Topi Pohjolainen	22d9a4824b	i965: Track non-compressible sampling of renderbuffers v3: - Actually set the flags when needed instead of falsely overwriting them (Jason). - Use more generic name for flag (dropped RENDERBUFFER) - Consult also shader images v4: - Consult only lossless compressed shader images v5: - Check the existence of renderbuffer before considering if it matches the given miptree Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Topi Pohjolainen	1f51217d99	i965: Replace boolean rb surface state setup argument with flags And add plumbing to provide it all the way to surface state emitter. This is not used yet but will be in subsequent patches to carry additional constraints. v2 (Jason): Use uint32_t instead of int as the type Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Topi Pohjolainen	1634a4963c	i965/rbc: Allow integer formats as advertised in isl_format.c Blorp consults brw_is_color_fast_clear_compatible() to see if any restrictions apply for fast clear in addition to the capabilities advertised in isl_format.c::format_info[]. On Gen8+ integer formats are blacklisted for plain old fast clear but there is no reason why lossless compression shouldn't be supported. In fact, lossless compression of integer formats is already supported for normal render paths. This patch prepares for dropping the delayed allocation of the mcs buffer for lossless compression. Until now the skip of fast clear also prevented the mcs being allocated and hence the lossless compression being effectively turned off for integer formats. Once the mcs buffer is allocated beforehand, the assertion addressed here would start triggering. v2: Drop the assert instead of relaxing it (Jason) Fix typo while at it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-12 08:58:38 +03:00
Alejandro Piñeiro	e77bf32475	i965: remove unused variable at intel_miptree_create_for_teximage After commit "i965: Fix calculation of the image height at start level", it is not needed. This commit removes the "warning: unused variable ‘i’" warning. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 07:21:32 +02:00
Thomas Helland	08c5b10ae9	mesa/glsl: Move string_to_uint_map into the util folder This clears the last bits of the usecases of the hash table located in mesa/program, allowing us to remove it. V2: Rebase on top of changes to Makefile.sources Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	e55eb2b7ea	glsl: Convert glcpp-parse to the util hash table And change the include in glcpp.h accordingly. V2: Whitespace fix Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	16fb318d0c	glsl: Convert loop analysis to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	ec453979db	mesa: Convert symbol table to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	f224ef4392	glsl: Convert varying test to the util hash table V2: remove now unused ht_count_callback() (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9efa977be5	glsl: Convert output read lowering to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	6adcc8f283	glsl: Convert interface block lowering to the util hash table V2: move comment to correct location (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	5482d31b86	glsl: Convert if lowering to use a set Also do some minor whitespace cleanups Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	85a197c4ed	glsl: Convert linker to the util hash table We are getting the util hash table through the include in program/hash_table.h for the moment until we migrate the string_to_uint_map to a separate file. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	f10cc9407b	glsl: Convert link_varyings to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	e7f91d9de1	glsl: Change link_functions to use a set The "locals" hash table is used as a set, so use a set to avoid confusion and also spare some minor memory. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	2228548f83	glsl: Convert recursion detection to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9b3c0f81a7	glsl: Convert constant_expression to the util hash table V2: Fix incorrect ordering on hash table insert V3: null check value returned by _mesa_hash_table_search() (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9f188be8a6	glsl: Convert ast_to_hir to the util hash table V2: Rebase to the adaption of new hashing functions V3: move previous_label declaration to where it is used (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	9ac6d61751	glsl: Convert ir_clone to the util hash table V2: add braces to multiline if (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	5b5d4ea4a0	glsl: Convert function inlining to the util hash table Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	eef2be6822	mesa: Convert string_to_uint_map to the util hash table And remove the now unused hash_table_replace. V2: Actually do the equivalent thing, and don't leak memory V3: fix minor typo in comment (Timothy Arceri) Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	ddb8639b18	util: Move hash_table_call_foreach to util hash table It is included through the util/hash_table include in the program hash_table, so this should be safe. This will be needed when we start converting each use of the program_hash_table, as some places need this function. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	cf4a4820ac	mesa: Remove prog_hash_table.c Here we make the prog_hash_table functionally equivalent to the one in util by wrapping the remaining functions that differ. We also move the functions to the header so we can remove the c file. This enables us to do a step-by-step replacement of the table. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Thomas Helland	42ba435fd1	mesa: Remove unused hash table includes This should prevent us from rebuilding the world. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-12 10:48:35 +10:00
Ilia Mirkin	148fbf32a8	freedreno/a3xx: disable filtering for texture buffers and int textures Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-11 13:14:06 -04:00
Niels Ole Salscheider	cfa914a1b4	st/clover: Define __OPENCL_VERSION__ on the device side This is required by the OpenCL standard. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-09-10 15:48:54 -07:00
Ilia Mirkin	a8c0c7301c	gm107/ir: allow indirect inputs to be loaded by frag shader Looks like the GM107 IPA op does not allow a separate offset when using an indirect register. Instead we must use AL2P like we do for indirect vertex operations on Kepler+. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-09-10 13:40:04 -04:00
Ilia Mirkin	a22aee5ad1	gm107/ir: AL2P writes to a predicate register We have to force it to write to predicate 7 (aka PT) in order for it not to mess up another predicate. Unclear what would be returned in the predicate, perhaps an error code for out-of-bounds requests. Blob doesn't seem to check it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-10 13:36:20 -04:00
Antia Puentes	83e8617f4b	i965: Fix calculation of the image height at start level - Fixes CTS tests: * GL44-CTS.shader_image_size.advanced-nonMS-cs-float * GL44-CTS.shader_image_size.advanced-nonMS-cs-int * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint * GL44-CTS.shader_image_size.advanced-nonMS-gs-float * GL44-CTS.shader_image_size.advanced-nonMS-gs-int * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint * GL44-CTS.shader_image_size.advanced-nonMS-tes-float * GL44-CTS.shader_image_size.advanced-nonMS-tes-int * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint * GL44-CTS.shader_image_size.advanced-nonMS-vs-float * GL44-CTS.shader_image_size.advanced-nonMS-vs-int * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint v1: (written by Dave Airlie) Always shift height images for levels. Fixed the CTS test. v2: Only shift height if the texture is not an 1D_ARRAY, it fixes assertion in GL44-CTS.texture_view.gettexparameter due to the original patch (Antia). v3: Remove the loop. Do not shift height either for 1D textures. Use an explicit switch and add an assertion (levels == 0) for multisampled textures (Jason). v4: Rectangle textures can not have levels either (Ilia Mirkin). Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-10 12:52:32 +02:00
Marek Olšák	08bcbfdc07	radeonsi: flush TC L2 before using a compute indirect buffer There is no known test for this. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:07 +02:00
Marek Olšák	a5a2cc530c	radeonsi: fix the VGT performance tweak for small instances Based on the VGT spec. The Vulkan driver doesn't do it optimally and they plan to fix it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	a67d81580b	radeonsi: remove the cache_flush atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	f9750932ea	winsys/amdgpu: replace OUT_CS with radeon_emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	81da78bfc3	winsys/radeon: replace OUT_CS with radeon_emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Christoph Haag	55ba5fa9a6	doc: document GALLIUM_DRIVER v2: Add dot at end of sentence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 09:24:28 +02:00
Haixia Shi	b1d636aa00	egl/android: Set EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT Set config attributes EGL_MAX_PBUFFER_WIDTH and EGL_MAX_PBUFFER_HEIGHT to hard-coded non-zero values. These two attributes are required on Android. v2: use _EGL_MAX_PBUFFER_WIDTH/HEIGHT from egldefines.h (based on discussion on the first version) Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	478fbc2348	android: depend on libmesa_genxml from i965 Android.gen.mk Static library dependency is required to pull the generated XML headers into the generated C file. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	4542c7ed5f	i965: release GLSL IR in LinkShader after it's not needed Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-09-09 07:51:04 +03:00
Tapani Pälli	2cd02e30d2	glsl: use hash instead of exec_list in copy propagation This change makes copy propagation pass faster. Complete link time spent in test case attached to bug 94477 goes down to ~400 secs from over 500 secs on my HSW machine. Does not fix the actual issue but brings down the total. No regressions seen in CI. v2: do not leak hash_table structure Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-09 07:50:42 +03:00
Jason Ekstrand	175ac629be	i965/fs: Fail the shader compile instead of asserting when we can't spill Blorp doesn't handle spilling so we set allow_spilling to false in that case. The blorp 16x MSAA resolve shader spills in 16-wide but not 8-wide. This commit makes it so that we fail the 16-wide compile and successfully fall back to 8-wide instead of just assert-failing when trying to compile the 16-wide shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-08 20:53:01 -07:00
Jason Ekstrand	88a2a2e053	nir/gcm: Add global value numbering support Unlike the current CSE pass, global value numbering is capable of detecting common values even if one does not dominate the other. For instance, if you have if (...) { ssa_1 = ssa_0 + 7; /* use ssa_1 */ } else { ssa_2 = ssa_0 + 7; /* use ssa_2 */ } Global value numbering doesn't care about dominance relationships so it figures out that ssa_1 and ssa_2 are the same and converts this to if (...) { ssa_1 = ssa_0 + 7; /* use ssa_1 */ } else { /* use ssa_1 */ } Obviously, we just broke SSA form which is bad. Global code motion, however, will repair this for us by turning this into ssa_1 = ssa_0 + 7; if (...) { /* use ssa_1 */ } else { /* use ssa_1 */ } This is intended to eventually mostly replace CSE. However, conventional CSE may still be useful because it's less of a scorched-earth approach and doesn't require GCM. This makes it a bit more appropriate for use as a clean-up in a late optimization run. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-08 20:53:01 -07:00
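The value-numbering idea described in the nir/gcm commit message above can be illustrated with a toy sketch. This is not NIR's implementation: the instruction tuples, the `iadd` opcode name, and the function `global_value_number` are hypothetical, and the sketch leaves out what the real pass has to handle (commutativity, side effects, and the later code-motion step that restores SSA form).

```python
def global_value_number(instructions):
    """Map each redundant destination to the first equivalent value.

    `instructions` is a flat list of (dest, op, operands) tuples gathered
    from every block; dominance is deliberately ignored, which is what
    distinguishes this from a dominance-based CSE pass.
    """
    seen = {}         # (op, operands) -> first destination computing it
    replacement = {}  # redundant destination -> canonical destination
    for dest, op, operands in instructions:
        # Rewrite operands first so chains of duplicates collapse too.
        key = (op, tuple(replacement.get(o, o) for o in operands))
        if key in seen:
            replacement[dest] = seen[key]
        else:
            seen[key] = dest
    return replacement


# The two branches of the example: both compute ssa_0 + 7.
program = [
    ("ssa_1", "iadd", ("ssa_0", 7)),  # "then" branch
    ("ssa_2", "iadd", ("ssa_0", 7)),  # "else" branch
]
print(global_value_number(program))  # {'ssa_2': 'ssa_1'}
```

After this rewrite the use in the "else" branch refers to a value defined only in the "then" branch, which is the broken SSA form the message says global code motion repairs by hoisting the definition.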
Jason Ekstrand	99ff4b3eb2	nir/gcm: Call nir_metadata_preserve Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-08 20:53:01 -07:00
Max Staudt	02675622b0	r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state() is never called. However, if R300_VAP_CNTL is never set, the hardware (at least the RS690 I tested this on) comes up with rendering artifacts, and parts that are uploaded before this "fix" remain broken in VRAM. This causes artifacts as in fdo#69076 ("triangle flickering"). It seems like this setup needs to happen at least once after power on for 3D rendering to work properly. In the DDX with EXA, this happens in RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an Xv request. So playing back a video or starting a GTK+2 application fixes 3D rendering for the rest of the session. However, this auto-fix doesn't happen when EXA is not used, such as with GLAMOR or Wayland. This patch ensures the register is configured even in absence of the DDX's EXA module. The register setting is taken from: xf86-video-ati -- RADEONInit3DEngineInternal() mesa/src/mesa/drivers/dri/r300 -- r300EmitClearState() Tested on RS690. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Max Staudt <mstaudt@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-09-09 13:30:47 +10:00
Marek Olšák	5981ab5445	gallium: remove PIPE_BIND_TRANSFER_READ/WRITE not used in any useful way Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	0fbaf74977	radeonsi: unify si_set_optimal_micro_tile_mode call sites There is nothing special happening in those code blocks. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	758bc52959	radeonsi: fix texture reinterpretation after DCC fast clear The problem is that TC-compatible DCC clear codes translate into different clear values when you change the format. I have a new piglit reproducing the issue. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	46c425e7c8	radeonsi: enable DCC fast clear for 128-bit formats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	831c0c80f1	radeonsi: clamp integer clear color values for DCC fast clear It should be possible to get TC-compatible fast clear more often now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 22:51:33 +02:00
Marek Olšák	93f3d8e10d	Revert "radeonsi: enable SDMA on CIK" This reverts commit `0241d8300f`. It doesn't work with mobile Bonaire. It looks like the programming of tiling parameters is wrong on some chips.	2016-09-08 22:51:33 +02:00
Christoph Haag	7b414bc512	doc: fix typo of GALLIUM_HUD_TOGGLE_SIGNAL In the original commit message in `56a1c10` it was wrongly used too: - env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-08 20:19:35 +02:00
Jason Ekstrand	a00bd7bc27	nir/spirv: Refactor variable decoration handling Previously, we didn't apply variable decorations to the members of a split structure variable. This doesn't quite work, unfortunately, because things such as the "flat" qualifier may get applied to an entire structure instead of propagated to the members. This fixes 9 of the new CTS tests in the dEQP-VK.glsl.linkage.varying.struct.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-09-08 10:45:23 -07:00
Jason Ekstrand	f5505730d3	nir/spirv: Break variable decoration handling into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-09-08 10:45:23 -07:00
Jonathan Gray	d50c56f868	aubinator: only use program_invocation_short_name with glibc/cygwin program_invocation_short_name is a gnu extension. Limit use of it to glibc and cygwin and otherwise use getprogname() which is available on BSD and OS X. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Jonathan Gray	2d3ebb474c	aubinator: include libgen.h for basename(3) Include libgen.h for basename as required by posix. The definition is not found on at least OpenBSD otherwise. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Jonathan Gray	0ba9e281fc	aubinator: stop using non portable error() function error() is a gnu extension and is not present on OpenBSD and likely other systems. Convert use of error to fprintf/strerror/exit. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 18:37:02 +01:00
Adam Jackson	dbda375d6f	egl: Fix up indentation on previous commit This was requested in review but I pushed the wrong version. Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-08 13:21:27 -04:00
Adam Jackson	a279760536	egl: Document why EGL_OPENGL{, _ES}_API are mostly identical Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-09-08 13:19:58 -04:00
Chad Versace	bad80c26e7	anv: Link to libX11-xcb only when needed The Makefile unconditionally linked libX11-xcb into libvulkan_intel.so. But it's needed only if HAVE_PLATFORM_X11. Fixes build of libvulkan_intel.so on Chromium OS, which has no X11 libraries. Fixes: `71258e9462` ("anv/x11: Add support for Xlib platform") Cc: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-08 09:24:30 -07:00
Tim Rowley	7514e326f8	swr: fixes for format mapping and texture sizing Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-09-08 10:43:21 -05:00
Topi Pohjolainen	b863f4a39a	intel/blorp: Allow single slice converter to suppress number of layers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-08 08:53:45 +03:00
Lionel Landwerlin	0ad84b4366	spirv/nir: Implement OpAtomicLoad/Store for shared variables Missing bits from `2afb950161`. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-07 17:37:37 +01:00
Jason Ekstrand	37763bf446	nir/spirv: Remove an erroneous "fall through" comment	2016-09-07 09:04:34 -07:00
Kyle Brenneman	6e066f76ee	EGL: Combine the GL and GLES current contexts (v2) Only keep track of a single current context, instead of separate contexts for GL and GLES. In EGL 1.4 (and 1.5), EGL_OPENGL_API and EGL_OPENGL_ES_API are supposed to be interchangeable for all purposes except for eglCreateContext. The _EGLThreadInfo::CurrentContexts array is now a single pointer to the current context, which may be a GL or GLES context. In addition, it now keeps track of the current API as an enum instead of an index. eglMakeCurrent will now replace the current context, regardless of which client API is used for the current and new contexts. It no longer checks for a conflicting context. In addition, calling eglMakeCurrent with EGL_NO_CONTEXT will now release the current context regardless of the current API. v2: Rebased against master (Adam Jackson) Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-09-07 11:56:48 -04:00
Rob Clark	74b1969d71	gbm: wire up fence extension v2: make fence extension optional to not break non-i965 classic drivers, and move __DRI2_FENCE into core extensions, based on comments from Emil Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:54:00 -04:00
Rob Clark	32c061b110	freedreno: reject imports with bogus pitch Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:41:38 -04:00
Rob Clark	b4e88b500c	gbm: add missing R8 and GR88 formats Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-09-07 11:30:41 -04:00
Lionel Landwerlin	2afb950161	spirv/nir: Add support for OpAtomicLoad/Store Fixes new CTS tests : dEQP-VK.spirv_assembly.instruction.compute.opatomic.load dEQP-VK.spirv_assembly.instruction.compute.opatomic.store v2: don't handle images like ssbo/ubo (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-07 11:00:30 +01:00
Marek Olšák	fe40a65fb6	radeonsi: skip redundant INDEX_TYPE writes Ported from Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	bdf767dac4	radeonsi: add more unlikely() uses into si_draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	a8e7ea6abc	radeonsi: skip draws with instance_count == 0 loosely ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	53d74e055e	gallium/radeon/winsyses: fix counting mapped memory Not all buffers are unmapped explicitly. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Ilia Mirkin	8c8874eafb	nir: fix definition of pack_uvec2_to_uint Found by inspection. Untested beyond compilation. This also matches the logic used in nir_lower_alu_to_scalar. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:45:44 -04:00
Ilia Mirkin	c42acd93d4	mesa/formatquery: limit ES target support, fix core context support First off, as late as ES 3.2, GetInternalformat only supports RENDERBUFFER and 2DMS(_ARRAY) targets. Secondly, the _mesa_has_ext helpers are very accurate... a little too accurate, some might say. If we only show an extension in compat profiles because core profiles have the functionality guaranteed, they will return false. Fix these to either check for a core profile explicitly, or to a different-but-identical extension available in core profile. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matteo Bruni <matteo.mystral@gmail.com> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:45:44 -04:00
Ilia Mirkin	f654b4983a	mapi: add gl32.h to the list of GLES3 headers for installation This was missed when I added the updated (and new) Khronos headers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-09-06 22:45:44 -04:00
Ilia Mirkin	36347c8d6f	main: GL_RGB10_A2UI does not come with GL 3.0/EXT_texture_integer Add a separate extension check for that format. Prevents glTexImage from trying to find a matching format, which fails on drivers without support for this format. Fixes: sized-texture-format-channels (on a3xx) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-09-06 22:41:48 -04:00
Jason Ekstrand	2b18a3f5d3	nir/spirv: Use fill_common_atomic_sources for image atomics We had two almost identical copies of this code and they were both broken but in different ways. The previous two commits fixed both of them. This one just unifies them so that it's easier to handle in the future. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Jason Ekstrand	f2a10937d8	nir/spirv: Use the correct sources for CompareExchange on images The CompareExchange operation has two "Memory Semantics" parameters instead of one so the real arguments start at w[7] instead of w[6]. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Jason Ekstrand	0ead7bef6b	nir/spirv: Swap the argument order for AtomicCompareExchange SPIR-V has the two arguments in the opposite order from GLSL. NIR uses the GLSL order so we had them backwards. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-09-06 17:08:13 -07:00
Tim Rowley	edd688d986	vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords Increases the performance of legacy geometry-heavy apps still using display lists. Performance increase for a targeted testcase is on the order of 8x, and applications like ParaView 4.x (5.x no longer uses display lists) improve by about 10%-20%. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-06 15:15:11 -05:00
Vinson Lee	215075ae30	glsl: Add positional argument specifiers. Fix build with Python < 2.7. File "./glsl/ir_expression_operation.py", line 360, in get_enum_name return "ir_{}op_{}".format(("un", "bin", "tri", "quad")[self.num_operands-1], self.name) ValueError: zero length field name in format Fixes: `e31c72a331` ("glsl: Convert tuple into a class") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2016-09-06 12:03:30 -07:00
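The build failure quoted in the commit message above comes from auto-numbered `{}` fields, which `str.format` only accepts from Python 2.7 onward. A minimal sketch of the fix's idea, using explicit positional indices; the helper below is a hypothetical stand-in written for illustration, not the actual code in `ir_expression_operation.py`:

```python
# Prefixes indexed by operand count, as in the traceback in the commit message.
ARITY_PREFIXES = ("un", "bin", "tri", "quad")


def get_enum_name(num_operands, name):
    # "{0}" and "{1}" are explicit positional specifiers. They work on
    # Python 2.6 as well as newer versions, whereas "ir_{}op_{}" raises
    # "ValueError: zero length field name in format" on Python < 2.7.
    return "ir_{0}op_{1}".format(ARITY_PREFIXES[num_operands - 1], name)


print(get_enum_name(2, "add"))  # ir_binop_add
```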
Roland Scheidegger	31a380c8dd	util: (trivial) add <stdint.h> include to slab.c should fix "src/util/slab.c:57:13: error: ‘uint8_t’ undeclared"	2016-09-06 19:47:14 +02:00
Jason Ekstrand	92162dbe32	glsl: Add .gitignore for make check warnings test	2016-09-06 08:32:19 -07:00
Jason Ekstrand	20b2f1ecb9	anv/pipeline: Lower indirect outputs when EmitNoIndirectOutput is set Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-06 08:27:23 -07:00
Rob Herring	244f0aba16	Android: glsl: add rules to generate ir_expression.h header files Recent changes to generate ir_expression.h header files broke Android builds. This adds the generation rules. This change is complicated due to creating a circular dependency between libmesa_glsl, libmesa_nir, and libmesa_compiler. Normally, we add static libraries so that include paths are added even if there's no linking dependency. That is the case here. Instead, we explicitly add the include path using $(MESA_GEN_GLSL_H) to libmesa_compiler. This in turn requires shuffling the order of make includes. It also uncovered missing dependency tracking of glsl_parser.h. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-06 15:58:55 +01:00
Leo Liu	2593354643	st/omx/dec: enable hevc omx decode support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	1a534d31fe	st/omx/dec/h265: get the reference list for uvd Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	7d63b80728	st/omx/dec/h265: add short term reference picture sets Specified by subclause 7.3.7 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	fa7c4f151d	st/omx/dec/h265: add slice header Specified by subclause 7.3.6.1 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	a639a2868e	st/omx/dec/h265: add picture parameter sets Specified by subclause 7.3.2.3 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	b3c1583e17	st/omx/dec/h265: add sequence parameter sets Specified by subclause 7.3.2.2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	6d186a79f2	st/omx/dec: add initial omx hevc support Mainly based on the h264 implementation. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:08:01 -04:00
Leo Liu	0c374a7770	st/omx/dec: set dst rect to match src size When creating an interlaced video buffer, the height is set to "template.height = align(tmpl->height / array_size, VL_MACROBLOCK_HEIGHT);", and we use "template.height *= array_size;" for the buffer height, so it is actually aligned to 32. A progressive video buffer is still aligned to 16, which causes a different height between the interlaced buffer and the progressive buffer for 4K (height=2160) and 720p (height=720). When transcoding the video, this causes 16 lines of corruption at the bottom of the encoded video. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-09-06 10:01:24 -04:00
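The height mismatch described in the commit message above can be checked with a few lines of arithmetic. This sketch assumes `VL_MACROBLOCK_HEIGHT` is 16 (inferred from the "aligned with 16" wording) and that an interlaced buffer uses `array_size` 2; `align` and `buffer_height` are illustrative helpers, not the driver's functions.

```python
VL_MACROBLOCK_HEIGHT = 16  # assumed value, see the note above


def align(value, alignment):
    # Round value up to the next multiple of alignment.
    return (value + alignment - 1) // alignment * alignment


def buffer_height(height, array_size):
    # Mirrors the two quoted statements: align the per-field height,
    # then multiply back by the number of fields.
    return align(height // array_size, VL_MACROBLOCK_HEIGHT) * array_size


for height in (2160, 720):
    interlaced = buffer_height(height, 2)
    progressive = buffer_height(height, 1)
    print(height, interlaced, progressive, interlaced - progressive)
# 2160 2176 2160 16
# 720 736 720 16
```

Both resolutions are multiples of 16 but not of 32, so the interlaced buffer ends up 16 lines taller, matching the 16 lines of corruption the message reports.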
Marek Olšák	e7a73b75a0	gallium: switch drivers to the slab allocator in src/util	2016-09-06 14:24:04 +02:00
Marek Olšák	761ff40302	util: import the slab allocator from gallium There are also some cosmetic changes.	2016-09-06 14:24:04 +02:00
Michel Dänzer	dc3bb5db8c	loader/dri3: Always use at least two back buffers This can make a significant difference for performance with some extreme test cases such as vblank_mode=0 glxgears. Fixes: `1e3218bc5b` ("loader/dri3: Overhaul dri3_update_num_back") Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97549 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-09-06 13:04:48 +09:00
Kenneth Graunke	d0cd504046	glsl: Fix locations of variables in patch qualified interface blocks. As of commit `d82f8d9772`, we actually parse and attempt to handle the 'patch' qualifier on interface blocks. This patch fixes explicit locations for variables in such blocks. Without it, many program interface query dEQP/CTS tests hit this assertion in ir_set_program_inouts.cpp if (is_patch_generic) { assert(idx >= VARYING_SLOT_PATCH0 && idx < VARYING_SLOT_TESS_MAX); bitfield = BITFIELD64_BIT(idx - VARYING_SLOT_PATCH0); } because the location was incorrectly based on VARYING_SLOT_VAR0. Note that most of the tests affected currently fail before they hit this, due to confusion about what the program interface query name of those resources should be. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-05 17:37:55 -07:00
Kenneth Graunke	096ad19a2b	mesa: Fix types in _mesa_get_color_read_format(). This is a mesa_format, not a GLenum. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-05 17:37:55 -07:00
Dave Airlie	69fca64259	amd/addrlib: move addrlib from amdgpu winsys to common code Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:33 +10:00
Dave Airlie	1add3562e3	gallium/util: move endian detect into a separate file This just ports the simpler endian detection bits, addrlib sharing wants this outside gallium. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:24 +10:00
Dave Airlie	a86be7b6ad	radeon: move radeon_family/chip_class definitions to common This just moves these to a common header file. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:06:04 +10:00
Dave Airlie	f1f1ba3781	radeonsi: move sid.h/r600d_common.h to a common place. Step one to merging radv would be to move some files around. This only adds the include path to r600/radeonsi, because later we want to avoid having to add it to the generic target paths. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-06 10:05:13 +10:00
Marek Olšák	0d7ec8b7d0	gallium/radeon: remove VPORT_ZMIN/ZMAX from init config states It's part of the viewport state now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	687c4be9cf	gallium/radeon: set VPORT_ZMIN/MAX registers correctly Calculate depth ranges from viewport states and pipe_rasterizer_state::clip_halfz. The evergreend.h change is required to silence a warning. This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	8b0507672e	gallium/radeon: unify viewport emission code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	6c8b76263d	radeonsi: also do VS_PARTIAL_FLUSH before updating VGT ring pointers ported from Vulkan Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	22cb5aecbe	radeonsi: fix variable naming in si_emit_cache_flush Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	911202817d	radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used for less noise in the HUD Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	addca75f4e	radeonsi: add HUD queries for counting VS/PS/CS partial flushes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1d0593abd7	gallium/radeon: rename the num-cs-flushes query to num-ctx-flushes num-cs-flushes will mean compute shader flushes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1469c70c2a	radeonsi: fix a badly implemented GS bug workaround Limit it to geometry shaders and Hawaii. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	21de3be8e6	radeonsi: fix texture format reinterpretation with DCC DCC is limited in how texture formats can be reinterpreted using texture views. If we get a view format that is incompatible with the initial texture format with respect to DCC, disable DCC. There is a new piglit which tests all format combinations. What works and what doesn't was deduced by looking at the piglit failures. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	63da0c991d	radeonsi: fix Gather4 with integer formats The closed compiler does the same thing. This fixes: GL45-CTS.texture_gather.-int- (18 tests) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	3e756f09d4	radeonsi: fix a crash in imageSize for cubemap arrays Sometimes it was f32, other times it was i32. Now it's always i32. This fixes: GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	03708deed2	radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader This fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	a4fa215058	radeonsi: fix cubemaps viewed as 2D This fixes: GL43-CTS.texture_view.view_sampling v2: fix a typo, merge both if statements Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	2975230fdc	radeonsi: always use the same function signature for llvm.SI.export Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	1c13c71ef8	radeonsi: return correct eviction stats for NVX_gpu_memory_info Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	f660d1cb21	gallium/radeon: also eliminate DCC fast clear in resource_get_handle just do what the comment says Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	01dd73f2f4	gallium/radeon: use the current ctx for CMASK elimination in resource_get_handle For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	d22feeaa9d	gallium/radeon: use the current ctx for DCC decompression in resource_get_handle For coherency with the current context. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	0d2e43fcb1	gallium/radeon: derive buffer placement and flags only at initialization Invalidated buffers don't have to go through it. Split r600_init_resource into r600_init_resource_fields and r600_alloc_resource. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	a14c50bceb	radeonsi: set more sampler settings Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Emil Velikov	4ea90682ab	docs: add news item and link release notes for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-05 16:13:48 +01:00
Emil Velikov	2099d5df97	docs: add sha256 checksums for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `614fb93a6d`)	2016-09-05 16:12:08 +01:00
Emil Velikov	f541530bbc	docs: add release notes for 12.0.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2fc6a31f10`)	2016-09-05 16:12:07 +01:00
Marek Olšák	b012a13af5	noop: implement resource_get_handle X+DRI3 locks up if the returned handle is invalid.	2016-09-05 16:12:04 +02:00
Marek Olšák	1c71bccdaa	noop: set missing functions	2016-09-05 16:12:04 +02:00
Marek Olšák	ed164f0d6b	noop: simplify some functions	2016-09-05 16:12:04 +02:00
Emil Velikov	62b224d428	glx/glvnd: list the strcmp arguments in correct order Currently, due to the inverse order, strcmp will produce negative result when the needle is towards the start of the haystack. Thus on the next iteration(s) we'll end up further towards the end and eventually fail to locate the entry. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-05 11:59:07 +01:00
Jason Ekstrand	821e366385	nir/tests: Update the CF tests to not assume fake edges In `aad4f1550`, we removed the concept of "fake" edges from NIR. Now, if you have a block at the end of an infinite loop it really has no predecessors. This updates the unit tests to match. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97587 Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-04 20:44:59 -07:00
Ilia Mirkin	61e978524a	gk110/ir: fix quadop dall emission We recently started to always emit the NDV (== dall) bit for quadops. However it was folded into the wrong code word. Fixes: `e0a067ed48` (nv50/ir: always emit the NDV bit for OP_QUADOP) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-09-04 18:28:29 -04:00
Mauro Rossi	98f734e758	android: intel: fix include paths in new "common" library Fixes building error in libmesa_intel_common static library Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 20:03:16 -07:00
Ilia Mirkin	ca313e00b6	a3xx: use window scissor to simulate viewport xy clip Unfortunately a3xx does not have a separate disable for depth clipping, so when depth clamp is enabled, we disable the whole 3d clipper logic. This in turn also gets rid of the xy clip that it would normally do. When we detect this would happen, instead we integrate the viewport into the window scissor. This may have slightly different behavior around wide points, but it's unlikely that anything depends on this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Ilia Mirkin	83d7230fd5	a3xx: make use of software clipping when hw can't handle it The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs, or a clip vertex, or clip distances are in use, then we must use the fallback discard-based clipping from the frag shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Ilia Mirkin	dac72234c7	a3xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-09-03 19:58:42 -04:00
Karol Herbst	ae7eb93e6c	nvc0/ir: allow min/max instructions to be dual-issued in pairs changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640: inst_executed: 1.03G inst_issued1: 614M -> 580M inst_issued2: 213M -> 230M score: 1021 -> 1030 Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 13:53:09 -04:00
Jason Ekstrand	7e891f90c7	anv: Move cmd_buffer_config_l3 into anv_cmd_buffer.c This is the only remaining part of genX_l3.c and there's really no good reason for it to be in its own file. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	17968e2dfd	anv/cmd_buffer: Move emit_lri and emit_lrm higher up Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	42d03c204c	anv: Refactor pipeline l3 config setup Now that we're using gen_l3_config.c, we no longer have one set of l3 config functions per gen and we can simplify a bit. Also, we know that only compute uses SLM so we don't need to look for it in all of the stages. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	6448c0e324	anv: Leverage the shared L3$ config code When Jordan first implemented L3$ configuration for Vulkan, he copied+pasted from the GL driver because we had no good place to share it. Now that we have src/intel/common, we should be sharing these tables. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	49981891f7	intel: Pull the guts of gen7_l3_state.c into a shared helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	979d0aca62	intel: Rename brw_get_device_name/info to gen_get_device_name/info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:07 -07:00
Jason Ekstrand	527f371999	intel: s/brw_device_info/gen_device_info/ Generated by: sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.c sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/*/.h sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.c sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.cpp sed -i -e 's/brw_device_info/gen_device_info/g' */i965/.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:06 -07:00
Jason Ekstrand	55364ab5b7	intel: Add a new "common" library for more code sharing The first thing to go in this new library is brw_device_info. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-03 08:23:06 -07:00
Mauro Rossi	4218c32166	intel/blorp: fix typo in android makefile Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 08:22:53 -07:00
Timothy Arceri	1692228a38	nir: remove unused variable This was left over from `aad4f15506` Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-09-03 20:30:19 +10:00
Connor Abbott	356d101af3	nir: remove some fields from nir_shader_compiler_options I accidentally added these with `0dc4cab`. Oops!	2016-09-03 00:49:58 -04:00
Connor Abbott	c62b58c216	nir: fix bug with moves in nir_opt_remove_phis() In `144cbf8` ("nir: Make nir_opt_remove_phis see through moves."), Ken made nir_opt_remove_phis able to coalesce phi nodes whose sources are all moves with the same swizzle. However, he didn't add the logic necessary for handling the fact that the phi may now have multiple different sources, even though the sources point to the same thing. For example, if we had something like: if (...) a1 = b.yx; else a2 = b.yx; a = phi(a1, a2) ... = a then we would rewrite it to if (...) a1 = b.yx; else a2 = b.yx; ... = a1 by picking a random phi source, which in this case is invalid because the source doesn't dominate the phi. Instead, we need to change it to: if (...) a1 = b.yx; else a2 = b.yx; a3 = b.yx; ... = a3; Fixes 12 CTS tests: ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 00:37:48 -04:00
Connor Abbott	0dc4cabee2	nir: add nir_after_phis() cursor helper And re-implement nir_after_cf_node_and_phis() using it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-03 00:37:48 -04:00
Ilia Mirkin	64a69059ce	glsl: expose max atomic counter/buffer consts for tess in ES 3.2 Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds them in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	8122e30aec	mapi: don't forget to expose GetPointerv in GL ES 3.2 I left this out of my previous commit that went around enabling all of the other ES 3.2 entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	346de79ffd	main: add KHR_robustness to ES 3.2 extension requirements Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-03 00:26:36 -04:00
Ilia Mirkin	163a029eba	nv50,nvc0: respect render condition enable flag when clearing rt/zs This is a newly added flag. We always pass false into it from nv50_clear_texture, but other callers may want to respect the render condition. (And the functions were originally spec'd to respect it.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 00:01:07 -04:00
Karol Herbst	d0cf7a6beb	nvc0/ir: don't dual-issue ops that depend on or interfere with each other Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: rewrite to split up the helpers and move more logic to target] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-03 00:01:06 -04:00
Jason Ekstrand	aad4f15506	nir: Remove fake edges in the CF handling code When NIR was first introduced, Connor added this fake-edge hack to work around issues related to unreachable blocks. Thanks to GLSL IR's jump lowering code, the only unreachable code you can have is a block after an infinite loop. With SPIR-V, we didn't have the jump lowering code so we could also end up with the "if (...) { break; } else { continue; }" case which generates an unreachable block after the if. Because of this, most of NIR had to be fixed up for handling unreachable blocks. The only remaining case of not handling unreachable blocks was specifically the block-after-infinite-loop case in dead_cf which was fixed by the previous commit. We can now delete the fake edge hack. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-02 11:24:09 -07:00
Jason Ekstrand	9a4d76e534	nir/dead_cf: Don't crash on unreachable after-loop blocks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-09-02 11:24:09 -07:00
Samuel Pitoiset	ea7b475968	nvc0: reduce the initial code segment size to 512KB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:39 +02:00
Samuel Pitoiset	6557058827	nvc0: allow to resize the code segment dynamically When an application uses a ton of shaders, we need to evict them when the code segment is full, but this is not really a good solution if monster shaders are used because code eviction will happen a lot. To avoid this, it seems better to dynamically resize the code segment area after each eviction. The maximum size is arbitrarily fixed at 8MB, which should be enough. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:35 +02:00
Samuel Pitoiset	96e21ad763	nvc0: add a new bin for the code segment To keep the bins list from growing indefinitely when the code segment size is bumped, we need to separate that bin from the SCREEN one because it contains other resources like the uniform bo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:31 +02:00
Samuel Pitoiset	63ac80879e	nvc0: add nvc0_screen_resize_text_area() helper This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:28 +02:00
Samuel Pitoiset	3d928d9082	nvc0: re-upload currently bound shaders after code eviction This fixes a very old issue which happens when the code segment is full. A bunch of real applications like Tomb Raider, F1 2015 and Elemental hit that issue because they use a ton of shaders. In this case, all shaders are evicted (to free space) but all currently bound shaders also need to be re-uploaded and SP_START_ID has to be updated accordingly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:25 +02:00
Samuel Pitoiset	34883626d1	nvc0: refactor the program upload process This refactoring will help for fixing the "out of code space" eviction issue because we will need to reupload the code for all currently bound shaders but it's slightly different than uploading a new fresh code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-01 21:25:17 +02:00
Jordan Justen	49c24d8a24	i965: fix noop_scissor range issue on width/height If scissor X or Y was set to a negative value then the previous code might have indicated noop scissors when the scissor range actually was masking a portion of the framebuffer. Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into account, we can use these to test for a noop scissor. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-09-01 11:45:13 -07:00
Kenneth Graunke	9c562956f9	glsl: Only force varyings to be flat when varying packing. Varying packing would like to mark certain variables as flat. This works as long as both sides of the interfaces are changed accordingly. However, with SSO, we disable varying packing on the outermost stages. We also disable varying packing for certain tessellation stages. With SSO, we operate on the producer and consumer separately. Checks based on the consumer stage and variable are risky, and can easily lead to altering one half of the interface between stages, breaking SSO pipeline IO validation. Just stop monkeying around with interpolation modes unless required for varying packing. There's no point. This also disables it in unsafe SSO cases. Fixes CTS tests: *.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize Also fixes Piglit's spec/oes_geometry_shader/sso_validation: - user-defined-gs-input-not-in-block.shader_test - user-defined-gs-input-in-block.shader_test Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-01 11:24:17 -07:00
Kenneth Graunke	72b56e8b1a	glsl: Reject TCS/TES input arrays not sized to gl_MaxPatchVertices. We handled the unsized case, implicitly sizing arrays to the value of gl_MaxPatchVertices. But if a size was present, we failed to raise a compile error if it wasn't the value of gl_MaxPatchVertices. Fixes CTS tests: *.tessellation_shader.compilation_and_linking_errors.{tc,te}_invalid_array_size_used_for_input_blocks Piglit's tcs-input-read-nonconst-* tests have recently been fixed. This patch will break older copies of those tests, but the latest should continue working. Update to Piglit 75819c13af2ed5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-09-01 11:07:07 -07:00
Frank Binns	2f3154f464	wayland-drm: add missing NULL check Although malloc is unlikely to fail check its return value nevertheless. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:48:52 +01:00
Frank Binns	d5f65b8bf5	loader: fix sysfs uevent file parsing When trying to get a device name for an fd using sysfs, it would always fail as it was expecting key/value pairs to be delimited by '\0', which is not the case. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:48:34 +01:00
Frank Binns	d6f669ba83	egl: only store device name when Wayland support is built The device name is only needed for WL_bind_wayland_display so make this clear by only storing the device name when Wayland support is built. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-01 15:47:58 +01:00
Lionel Landwerlin	2dc6930a5a	isl: round format alignment to nearest power of 2 A few inline asserts in anv assume alignments are power of 2, but with formats like R8G8B8 we have odd alignments. v2: round up to power of 2 (Ilia) v3: reuse util_next_power_of_two() from gallium/aux/util/u_math.h (Ilia) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-01 11:36:09 +01:00
Thomas Hellstrom	fc6be40011	gallium/postprocess: Fix resource freeing The code was triggering asserts in DEBUG builds of the SVGA driver since the reference count of the resource was never decremented before destroy. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-09-01 07:59:49 +02:00
Ilia Mirkin	e3db415456	st/mesa: expose OES_geometry_shader and OES_texture_cube_map_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-31 20:12:55 -04:00
Eric Engestrom	3bd885d09c	Introduce .editorconfig A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files to try and enforce the formatting of the code, to which Michel Dänzer suggested [1] we start by importing the existing .dir-locals.el settings. The first draft was discussed in the RFC [2]. These .editorconfig are a first step, one that has the advantage of requiring little to no intervention from the devs once the settings files are in place, but the settings are very limited. This does have the advantage of applying while the code is being written. This doesn't replace the need for more comprehensive formatting tools such as clang-format & clang-tidy, but those reformat the code after the fact. [0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html [1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-31 17:06:54 -07:00
Eric Anholt	509e2dbc10	vc4: Add missing break statement. This opcode isn't used yet, so it didn't affect anything. Caught by Coverity, reported to me by imirkin.	2016-08-31 17:06:54 -07:00
Brian Paul	c87e8c8515	gallium/docs: clarify render_condition_enabled parameter to clear functions If false, it means do the clear unconditionally. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-31 15:51:06 -06:00
Jason Ekstrand	b8bff0823b	mesa: Add some more .gitignore	2016-08-31 13:45:27 -07:00
Matt Turner	90eaf01616	i965: Pass start_offset to brw_set_uip_jip(). Without this, we would pass over the instructions in the SIMD8 program (which is located earlier in the buffer) when brw_set_uip_jip() is called to handle the SIMD16 program. The assertion about compacted control flow was bogus: halt, cont, break cannot be compacted because they have both JIP and UIP. Instead, we should never see a compacted instruction in this code at all. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 13:11:27 -07:00
Kenneth Graunke	bea048752e	i965: Merge gen7_clip_state atom into gen6_clip_state atom. The original motivation was that gen6_clip_state ignored _NEW_POLYGON as it didn't care about early culling. The only other change was that Gen6 ignored BRW_NEW_TES_PROG_DATA as it doesn't have tessellation shaders, but listening to this is harmless as it'll never be signalled. Now that we've added _NEW_POLYGON for is_drawing_lines/points, we can merge the two as the distinction is meaningless. This actually fixes a bug, though: Gen8+ was using the gen6_clip_state atom because it doesn't care about early culling, but it also needs BRW_NEW_TES_PROG_DATA, which was missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 12:42:09 -07:00
Kenneth Graunke	4c116cbafb	i965: Use gs_prog_data in is_drawing_points/lines(). State upload code should use prog_data rather than poking at core Mesa shader data structures wherever possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 11:50:15 -07:00
Kenneth Graunke	cd19db4ee6	i965: Fix missing dirty bits related to is_drawing_points/lines. calculate_attr_overrides() uses is_drawing_points(), which depends on tessellation and geometry program state, as well as polygon state. v2: Add missing _NEW_POLYGON as well. Caught by Iago Toral. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-31 11:50:15 -07:00
Samuel Pitoiset	3df8615dcd	nvc0: remove an attempt at uploading all IMMD into a CB This has never been used because info->immd.bufSize is always 0 and anyways this is an experimental code which has never been completed. This gets rid of some unused code in the program validation process. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-31 19:05:16 +02:00
Samuel Pitoiset	b2f3d50ca7	nv50: remove unused nv50_program::immd_size field Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-31 19:05:13 +02:00
Ilia Mirkin	6118bcab4e	nv30: set usage to staging so that the buffer is allocated in GART The code a few lines below expects to migrate the bo in question to VRAM. Since we're filling the initial data via CPU, it's more efficient to create the temporary buffer in GART. There is no "push" method implemented, otherwise we'd use that instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-31 10:28:33 -04:00
Frank Binns	5505845945	egl/x11_dri3: provide an authentication function To support WL_bind_wayland_display an authentication function needs to be provided but this was not being done for this platform as it's not strictly necessary. However, as this isn't an optional function there's the potential for a segfault to occur if authentication is mistakenly performed. Protect against this by providing a function that prints an error. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-08-31 15:10:14 +02:00
Frank Binns	4c28c916ef	egl/x11_dri3: disable WL_bind_wayland_display for devices without render nodes Up until now, DRI3 was only used for devices that have render nodes, unless overridden via an environment variable, with it falling back to DRI2 otherwise. This limitation was there in order to support WL_bind_wayland_display as it requires client opened device node fds to be authenticated, which isn't possible when using DRI3. This is an unfortunate compromise as DRI3 provides security benefits over DRI2. Instead, allow DRI3 to be used for devices without render nodes but don't advertise WL_bind_wayland_display in this case. Applications that need this extension can still be run by disabling DRI3 support via the LIBGL_DRI3_DISABLE environment variable. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-08-31 15:09:12 +02:00
Jose Fonseca	55e417222f	scons: Fix MinGW cross compilation. The generated GLSL header files were only being built for the host platform, and not the target platform. Trivial.	2016-08-31 12:18:34 +01:00
Ilia Mirkin	8caf2cb0c0	nv30: only bail on color/depth bpp mismatch when surfaces are swizzled The actual restriction is a little weaker than I originally thought. See https://bugs.freedesktop.org/show_bug.cgi?id=92306#c17 for the suggestion. This also explains why things weren't always failing before, only sometimes. We will allocate a non-swizzled depth buffer for NPOT winsys buffer sizes, which they almost always are. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-31 01:17:55 -04:00
Kenneth Graunke	d82f8d9772	glsl: Handle patch qualifier on interface blocks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 22:09:36 -07:00
Ilia Mirkin	a0b1260fe0	i965: enable OES_primitive_bounding_box with the no-op implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 21:31:30 -04:00
Ilia Mirkin	bf47b2bf88	st/mesa: provide the null implementation of bounding box outputs in tcs Until hardware appears (in a gallium driver) that can make use of the TCS-outputted gl_BoundingBox, we just request that the variable gets assigned as a regular patch variable. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	891d7e3c9e	glsl: add gl_BoundingBox and associated varying slots Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	10663c648e	mesa: add support for GL_PRIMITIVE_BOUNDING_BOX storage and query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	3b81c998a2	mesa: add scaffolding for OES/EXT_primitive_bounding_box Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 20:25:15 -04:00
Ilia Mirkin	5ce0969df2	docs: add GL_OES_viewport_array to features Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 20:25:15 -04:00
Timothy Arceri	64a48efb9e	aubinator: fix if indentation and add brackets to multiline body Fixes misleading indentation warning in gcc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-31 10:19:45 +10:00
Francisco Jerez	6df215d97e	i965/fs: Assert that the number of color targets is one when dual-source blend is enabled. Requested by Anuj during review of `4a87e4ade7`, adding as follow-up since it led to assertion failures due to various GLSL bugs that should be fixed now.	2016-08-30 16:54:19 -07:00
Francisco Jerez	fd04d048ae	glsl: Fix gl_program::OutputsWritten computation for dual-source blending. In the fragment shader OutputsWritten is a bitset of FRAG_RESULT_* enumerants, which represent the location of each color output written by the shader. The secondary and primary color outputs of a given render target using dual-source blending have the same location, so the 'idx' computation below will give the wrong bit as result if the 'var->data.index' term is non-zero -- E.g. if the shader writes the primary and secondary colors of the FRAG_RESULT_COLOR output, ir_set_program_inouts will think that the shader writes both FRAG_RESULT_COLOR and FRAG_RESULT_SAMPLE_MASK, which is just bogus. That would cause the brw_wm_prog_key::nr_color_regions computation done in the i965 driver during fragment shader precompilation to be wrong, which currently leads to unnecessary recompilation of shaders that use dual-source blending, and triggers an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	965934f38a	glsl: Fix incorrect hard-coded location of the gl_SecondaryFragColorEXT built-in. gl_SecondaryFragColorEXT should have the same location as gl_FragColor for the secondary fragment color to be replicated to all fragment outputs. The incorrect location of gl_SecondaryFragColorEXT would cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0 as being written to, which isn't allowed by the spec and would ultimately lead to an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. This should also fix the code below for multiple dual-source-blended render targets, which no driver currently supports but we have plans to enable eventually in the i965 driver (the comment saying that no hardware will ever support it seems rather hilarious). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	342f945b13	st/glsl_to_tgsi: Use SecondaryOutputsWritten to determine dual-source fragment outputs. Currently the mesa state tracker relies on there being two bits set per dual-source output in the gl_program::OutputsWritten bitset, but that only worked due to a GLSL front-end bug that caused it to set the OutputsWritten bit for both location and location+1 even though at the GLSL level the primary and secondary color outputs used for dual-source blending have the same location. Fix it by extending outputMapping[] to 2*FRAG_RESULT_MAX elements in order to represent a mapping from a (location, index) pair to its TGSI output, which should also make it slightly easier to add support for dual-source blending in combination with multiple render targets in the long run. No Piglit regressions on llvmpipe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:19 -07:00
Francisco Jerez	cb4b38af41	glsl: Calculate bitset of secondary outputs written in ir_set_program_inouts. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 16:54:18 -07:00
Ian Romanick	c011d7d900	glsl: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	aee9ab7de7	glsl: Replace most assertions with unreachable() text data bss dec hex filename 7669233 277176 28624 7975033 79b079 i965_dri.so before generated code 7647081 277176 28624 7952881 7959f1 i965_dri.so before this commit 7669289 277176 28624 7975089 79b0b1 i965_dri.so with this commit Looking at the generated assembly, it appears that some of the changes made in the generated code prevent some loops from being unrolled. Removing the default cases (via unreachable()) allows these loops to unroll again. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	dd574be54c	glsl: Refactor handling of horizontal operations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	d6e73150a4	glsl: Use constant_template_horizontal instead of constant_template_horizontal_single_implementation for unops This changes the "shape" of all the pack and unpack operators, but they should function the same. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	822b5c5eb2	glsl: Eliminate constant_template2 constant_template_common can now handle the case where the result type is different from the input type by using type_signature_iter. This changes the "shape" of all the cast-style operators, but they should function the same. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	abc81f7883	glsl: Eliminate constant_template5 constant_template_common can now handle the case where the result type is different from the input type by using type_signature_iter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	53c54a6c73	glsl: Eliminate constant_template0 This template is mostly an artefact of the development of the original patch series and to minimize the differences between the original code and the generated code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	ddb4b53de3	glsl: Eliminate one of the templates for simpler operations The differences between these two templates were mostly an artefact of the development of the original patch series and to minimize the differences between the original code and the generated code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	ee3cdac785	glsl: Use the generated constant expression code Immediately previous to this patch, diff -wud src/glsl/ir_constant_expression.cpp \ src/glsl/ir_expression_operation_constant.h should be "minimal." v3: With much help from José Fonseca, fix the SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	f3fcfe001f	glsl: Generate code for constant ir_triop_csel expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:03 -07:00
Ian Romanick	2761190baa	glsl: Generate code for constant ir_triop_lrp expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	6e09c8715d	glsl: Generate code for constant ir_quadop_vector expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	f8e185a65f	glsl: Generate code for constant ir_quadop_bitfield_insert expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	4d8ac28b20	glsl: Generate code for constant ir_triop_vector_insert expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	9f1d7c5235	glsl: Generate code for constant ir_binop_vector_extract expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	d8dd49419a	glsl: Generate code for constant ir_binop_mul expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8954a019f7	glsl: Generate code for constant ir_triop_fma and ir_triop_bitfield_extract expressions ir_triop_bitfield_extract is a little weird because the second and third operands are always int, so they may differ in type from the first operand. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	da61c94db8	glsl: Generate code for constant ir_binop_dot expressions v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	13106e1041	glsl: Generate code for constant ir_binop_lshift and ir_binop_rshift expressions The code generated is quite different from what was previously used. I believe that it is still correct by the GLSL spec, and I believe, due to C rules about shifts, the behavior will be the same. Section 5.9 (Expressions) of the GLSL 4.50 spec says: The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression's base type. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	90da8bf547	glsl: Generate code for constant ir_binop_ldexp expressions ldexp is weird because its two operands have different types. Add support for directly specifying the exact signatures of all the possible variations of an operation. v2: Use tuple() instead of () for clarity. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	0f87c54d1c	glsl: Generate code for constant unary expressions that don't assign the destination These are operations like the pack functions that have separate functions that assign multiple outputs from a single input. v2: Correct the source and destination types. They were previously transposed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8cf9157786	glsl: Generate code for some constant binary expressions that are horizontal Only operations where the implementation is identical code regardless of type. The only such operations are ir_binop_all_equal and ir_binop_any_nequal. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	d5bfe6b9c4	glsl: Generate code for constant unary expressions that are horizontal Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	8f5357b1d6	glsl: Generate code for constant expressions that have an output type that differs from the input types v2: Remove extra int() cast in find_lsb. Suggested by Matt. 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	74e335c762	glsl: Generate code for constant binary expressions that combine vector and scalar operands v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:02 -07:00
Ian Romanick	f81b1c7fa7	glsl: Generate code for constant binary expressions that have one operand type Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	598929aee7	glsl: Generate code for constant unary expressions that have different implementations for each source type v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	aa9f4fc53e	glsl: Generate code for constant unary expressions that map one type to another ir_unop_i2b is omitted because its source can either be int or uint. That makes it special. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	3fcb6b85c0	glsl: Begin generating code for the most basic constant expressions Unary operations where all of the supported types use the same C expression to evaluate them. v2: 'for (a, b) in d' => 'for a, b in d'. Suggested by Dylan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	e31c72a331	glsl: Convert tuple into a class This makes things a little more clear now, and it will make future changes... possible. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	6ef27003ac	glsl: Compact a bunch of things onto one line Even though they are much too long for that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	0cef8c683e	glsl: Sort constant expression handling by IR operand enum value Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	8d54b5f756	glsl: Trivial whitespace and punctuation changes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	fd2dabbb9f	glsl: Sort GLSL type enums in switch-statements in enum order Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	13ef8c46b8	glsl: Always use correct float types in constant expression handling Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	ea05a72258	glsl: Extract ir_quadop_bitfield_insert implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	fe153309a8	glsl: Extract ir_triop_bitfield_extract implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	54ec6e1b8b	glsl: Extract ir_binop_ldexp implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	6d5fe1815c	glsl: Use find_msb_uint to implement ir_unop_find_lsb (X & -X) calculates a value with only the least significant bit of X set. Since there is only one bit set, the LSB is the MSB. v2: Remove extra int() cast. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:01 -07:00
Ian Romanick	5c24750a49	glsl: Extract ir_unop_find_msb implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	d75034b3a2	glsl: Extract ir_unop_bitfield_reverse implementation to a separate function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	4b0606e0a7	glsl: Use _mesa_bitcount to implement constant ir_unop_bit_count Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	f4af9f36e7	glsl: Delete spurious comment about mod not taking integer operands This hasn't been true since we added support for GLSL 1.30. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	d6ad3e2dd9	glsl: Delete spurious comment about updating ir_expression::get_num_operands This hasn't been necessary since `007f48815`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	dc41d998f2	glsl: Do not generate comments or extra whitespace in expression files The comments and whitespace can live in the Python code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	c6e8fd82ea	glsl: Just access the ir_expression_operation strings table directly The operator_string functions gave us some protection against a malformed table. Now that the table is generated from the same data that generates the enum, this is not a concern. Just cut out the middle man. text data bss dec hex filename 7531892 273992 28584 7834468 778b64 i965_dri-64bit-before.so 7531828 273992 28584 7834404 778b24 i965_dri-64bit-after.so Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	fb44f69779	glsl: Generate ir_expression_operation_strings.h from Python 'diff -ud' is clean. v2: Massive rebase. v3: With much help from José Fonseca, fix the SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	90781eee4d	glsl: Pull operator_strs out to its own file No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	140ec58a07	glsl: Generate the ir_last_* values This ensures that they remain correct if the list is rearranged or new opcodes are added. I checked a diff of before and after to ensure that each ir_last_ had the same value. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Ian Romanick	7d6af9e599	glsl: Generate ir_expression_operation.h from Python There are differences in where end-of-line comments are placed, but 'diff -wud' is clean. v2: Massive rebase. v3: With much help from José Fonseca, fix SCons build. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2016-08-30 16:28:00 -07:00
Jason Ekstrand	10f9901bce	anv: Rework pipeline caching The original pipeline cache that Kristian wrote was based on a now-false premise that the shaders can be stored in the pipeline cache. The Vulkan 1.0 spec explicitly states that the pipeline cache object is transient and you are allowed to delete it after using it to create a pipeline with no ill effects. As nice as Kristian's design was, it doesn't jibe with the expectation provided by the Vulkan spec. The new pipeline cache uses reference-counted anv_shader_bin objects that are backed by a large state pool. The cache itself is just a hash table mapping key hashes to anv_shader_bin objects. This has the added advantage of removing one more hand-rolled hash table from mesa. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476 Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	6899718470	anv: Add a struct for storing a compiled shader This new anv_shader_bin struct stores the compiled kernel (as an anv_state) as well as all of the metadata that is generated at shader compile time. The struct is very similar to the old cache_entry struct except that it is reference counted and stores the actual pipeline_bind_map. Similarly to cache_entry, much of the actual data is floating-size and stored after the main struct. Unlike cache_entry, which was stored in GPU-accessible memory, the storage for anv_shader_bin kernels comes from a state pool. The struct itself is reference-counted so that it can be used by multiple pipelines at a time without fear of allocation issues. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	13c09fdd0c	anv: Add pipeline_has_stage guards in a few places All of these worked before because they were depending on prog_data to be null. Soon, we won't be able to depend on a nice prog_data pointer and it's nice to be more explicit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	b259d86ad6	anv: Remove unused fields from anv_pipeline_bind_map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	d5945bec12	anv/pipeline: Properly handle OOM during shader compilation Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	a0f5c496e3	anv/allocator: Correctly set the number of buckets The range from ANV_MIN_STATE_SIZE_LOG2 to ANV_MAX_STATE_SIZE_LOG2 should be inclusive and we have asserts that ensure that you never try to allocate a state larger than (1 << ANV_MAX_STATE_SIZE_LOG2). However, without adding 1 to the difference, we allocate 1 too few buckets and so, even though we have an assert, anything landing in the last bucket will fail to allocate properly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	4200c2266e	anv/pipeline: Fix bind maps for fragment output arrays Found by inspection. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Jason Ekstrand	d316cec1c1	anv/descriptor_set: memset anv_descriptor_set_layout We hash this data structure so we can't afford to have uninitialized data even if it is just structure padding. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-30 15:08:23 -07:00
Eric Engestrom	d5899b3010	docs/helpwanted: fix GL3.txt/features.txt link Fixes: `f926cf5bd0` ("docs: Rename GL3.txt to features.txt") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> CC: Andreas Boll <andreas.boll.dev@gmail.com>	2016-08-30 14:38:57 -07:00
Eric Engestrom	aac91fffae	anv/wayland: fix assert typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:51 -07:00
Eric Engestrom	4e68bb620f	anv/meta: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:51 -07:00
Eric Engestrom	b0acebd41f	st/nine: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:46 -07:00
Eric Engestrom	e2627e34ba	glsl: fix unreachable() typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-30 13:47:42 -07:00
Eric Engestrom	352f0d9180	get_reviewer.pl: fix mesa check This script was broken for the last few days and I couldn't figure out why. Turns out it was checking for the existence of a file that got renamed, so rename it in here too. Fixes: `f926cf5bd0` ("docs: Rename GL3.txt to features.txt") CC: Ian Romanick <ian.d.romanick@intel.com> CC: Rob Clark <robclark@freedesktop.org> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-30 16:44:00 -04:00
Kenneth Graunke	6699403651	glsl: Initialize outputs[] array in lower_blend_equation_advanced. Caught by Coverity. Likely fixes real issues if an output component is not present. CID: 1372278 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-30 13:11:00 -07:00
Samuel Pitoiset	6820f75c91	nvc0: fix indentation in nvc0_screen_init() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:42:02 +02:00
Samuel Pitoiset	0fc3b7c88e	nvc0: check return value of nvc0_screen_resize_tls_area() While we are at it, make it static and change the return values policy to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:41:59 +02:00
Samuel Pitoiset	b489ac88f6	nvc0: make use of FAIL_SCREEN_INIT in nvc0_screen_create() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 18:41:57 +02:00
Samuel Pitoiset	e0a067ed48	nv50/ir: always emit the NDV bit for OP_QUADOP This silences a divergent error found with F1 2015. Basically, the NDV bit has to be set when a FSWZ instruction is inside divergent code, but it's not needed otherwise. The correct fix should be to set it only in divergent code situations. GM107 emitter already sets that bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-30 18:41:46 +02:00
Jason Ekstrand	9514c5a30f	intel/blorp: Inline get_vs_entry_size into emit_urb_config Topi asked to have the prefix removed because there's nothing gen7 about it. However, now that everything is in a single file, there is no good reason to have it split out into a helper function anyway. Let's just put the contents in emit_urb_config and call it a day. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-30 09:24:50 -07:00
Tim Rowley	175052507c	swr: [rasterizer] add archrast instrumentation Statistics measurement system	2016-08-30 10:32:36 -05:00
Emil Velikov	5de640a518	i915: Check return value of screen->image.loader->getBuffers Ported from the i965 commit `e7ab358e81`. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-30 14:50:47 +01:00
Emil Velikov	4f5f9575d0	egl/android: remove config post-processing No longer needed as of last commit, since we no longer add OPENGL to the ClientAPIs thus, RenderType and Conformant don't have the desktop GL bit set. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:50:28 +01:00
Emil Velikov	03eaa6c596	egl/dri2: check if the EGL API is valid before adding it to ClientAPIs In the rather unlikely case that the API is considered invalid, don't add it to the (supported) ClientAPIs bitmask. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> --- Strictly speaking we only need this in the Android case for OpenGL. Adding it everywhere doesn't hurt us since the compiler will const propagate and optimise/remove these.	2016-08-30 14:50:10 +01:00
Emil Velikov	4472b6e469	egl/android: annotate static const data as such Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:50:08 +01:00
Emil Velikov	7563c39641	egl: treat EGL_OPENGL_API as invalid on Android At the moment one can use OpenGL in eglBindAPI() only to clear the EGL_OPENGL_BIT from RenderableType and Conformant for _each_ config. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2016-08-30 14:49:24 +01:00
Ilia Mirkin	a165e5cb7c	nouveau: make color/depth bpp match for pre-nv10 chips This avoids generating fbconfigs whose winsys framebuffers will be incomplete (see nouveau_check_framebuffer_complete). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:21:42 -04:00
Ilia Mirkin	357d8261f1	nouveau: always enable at least one RC Experimentally, this is required for glxgears and others to display the proper colors. This is also what the code used to do before the referenced commit. Fixes: `c703658b39` (mesa: Drop _EnabledUnits.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-30 00:21:42 -04:00
Ilia Mirkin	91681302d0	nouveau: allow NV3x's to be used with nouveau_vieux NV34 and possibly other NV3x hardware has the capability of exposing the NV25 graph class. This allows forcing nouveau_vieux to be used instead of the gallium driver, primarily for testing purposes. (Among other things, NV2x only ever came as AGP or inside an Xbox, never PCI/PCIe). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:21:42 -04:00
Ilia Mirkin	ab0917311f	nvc0: undo overzealous enum usage Commit `7413625ad3` flipped a few functions too many to use pipe_shader_type. These functions actually take an integer that does not correspond 1:1 with the enum. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-30 00:17:54 -04:00
Brian Paul	ec16a5b091	svga: fix a texture readback bug Backing views/surfaces are used to handle the case when a resource is bound both as a render target and as a sampler source (such as when doing auto mipmap generation). This patch fixes a bug where mapping a resource (to do a glReadPixels) was reading the stale data in the original surface rather than the backing surface which was rendered to. We need to propagate the backing resource (which we rendered to) back to the original resource before we read from it. The problem was the svga_propagate_rendertargets() function was examining the wrong surface views. This fixes the "poc9" test described in VMware bug 1686661. Also tested with Piglit, Cinebench, Lightsmark, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 17:46:50 -06:00
Brian Paul	646afc6ff7	svga: move surface propagation code into new function Put new svga_propagate_rendertargets() function where all the other surface propagation code lives. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 17:46:50 -06:00
Brian Paul	b9b88516f8	mesa: fix format conversion bug in get_tex_rgba_uncompressed() We need to set the need_convert flag with each loop iteration, not just when the rgba pointer is null. Bug reported by Markus Müller <mueller@imfusion.de> on mesa-users list. Fixes new piglit arb_texture_float-get-tex3d test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 17:46:50 -06:00
Dave Airlie	f235dc08ac	radeonsi: add support for cull distances. (v1.1) This should be all that is required for cull distances to work on radeonsi. v1.1: whitespace cleanup, add docs fix clipdist_mask usage. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-30 09:35:56 +10:00
Timothy Arceri	5025e88703	spirv: replace assert with unreachable Fixes uninitialised warning for coord_components. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-30 09:29:26 +10:00
Jason Ekstrand	f4314d06e8	isl/state: Add some asserts about format capabilities This keeps invalid surface states from leaking through and potentially hanging the GPU. We shouldn't actually be hitting this on a regular basis, but a helpful assert is better than a hang. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	87214414fd	intel/blorp: Add a format parameter to blorp_fast_clear This allows us to use the actual render format as opposed to the texture format. I don't know that the hardware actually cares in the case of fast clears, but it certainly seems more correct. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	348509269e	i965: Move blorp into src/intel/blorp At this point, blorp is completely driver agnostic and can be safely moved into its own folder. Soon, we hope to start using it for doing blits in the Vulkan driver. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	8bd35d8bd2	i965/blorp: Remove the remaining brw prefixes from the blorp.h API Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	3e46f11409	i965/blorp: Use isl_format_get_depth_format for setting depth formats Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	555b22a446	i965: Move the type_size function declarations to brw_nir.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	007d8a6d04	i965: Move get_fast_clear_rect to blorp_clear.c This has been the only caller since we deleted the meta fast clear code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	c8ff36228d	i965: Roll brw_get_ccs_resolve_rect into blorp_ccs_resolve Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	12a2fe5389	i965/blorp: Get rid of most brw and mesa includes Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	87a1cb6979	i965: Move the hiz_op enum to blorp Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	db95a8108f	i965/blorp: Add a fast_clear_op enum Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	71dc2e0106	i965/blorp: Make blorp_address::buffer a void* The Vulkan driver doesn't use libdrm so we don't want to bake that in. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	2191f5cb7e	i965/blorp: Get rid of brw_context This commit switches all of blorp from taking a brw_context to taking a blorp_context and, where useful, a void *batch. In the GL driver, we only have one active batch at a time so the brw_context *is* the batch but in Vulkan, batch will point to the anv_cmd_buffer in which we are building instructions. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	99b9e9b86e	i965/blorp: Take a blorp_context in compile_nir_shader Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	a818a32244	i965/meta_util: Take an isl_device in get_fast_clear_rect Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	bc159ff0f7	i965/blorp: Add an "exec" function pointer to blorp_context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	cea360a708	i965/blorp: Remove some i965-isms from genX_blorp_exec.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	cf14b52478	i965/blorp: Move the guts of brw_blorp_exec into genX_blorp_exec.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	28ae664e3b	i965/blorp: Pull the guts of blorp_exec into a driver-agnostic header Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	9a842c61fe	i965/blorp/exec: Refactor to use a new blorp_batch struct This gets rid of brw_context throughout the core of the state setup code. Instead, it is replaced with blorp_batch which contains a pointer to the blorp_context and a void* that the driver can use for its own blorp data. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	4e7bddf8a3	i965/blorp: Add a helper for allocating binding tables and surface states Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	8a39069dfe	i965/blorp: Use BT_INDEX enums for setting up the binding table Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	1367af159e	i965/blorp: Shorten binding table index enum names Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	da2a078deb	i965/blorp/genX: Add a blorp_surface_reloc helper Previously, we passed the buffer address (as per the latest offset from the kernel) to ISL to use when it filled out the surface state. We then called drm_intel_bo_emit_reloc() to add the relocation to the list. The newly added blorp_surface_reloc helper adds the relocation to the list and then writes the buffer address directly into the surface state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	ac08bc8ac2	i965/blorp: Use blorp_address in brw_blorp_surface instead of bo+offset Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	33cc1f6bb4	i965/blorp: Pull emit_surface_state into genX_blorp_exec.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	6d2f8f8f5f	i965/blorp: Add driver mocs settings to the context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	9c380b639f	i965/blorp/genX: Move emit_urb_config into another helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	28991c9601	i965/blorp: Use gen6_upload_urb Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	7ecbb9bada	i965/gen6: Refactor gen6_upload_urb This splits it into two functions very similar to gen7_upload_urb. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	3e4b43d11d	i965/blorp/genX: Pull emit_3dstate_multisample into a helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	becd434d14	i965/blorp/genX: Add helpers for allocating various bits of state This pulls most of the brw-specific bits into helpers with generic names. Later, those will become the driver hooks for generic code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	600446ccc7	i965/blorp: Expose the shader cache through function pointers This sanitizes blorp's access to the i965 driver's shader cache by patching it through the blorp_context. When we start using blorp in Vulkan, we will simply have to implement such a caching interface in the Vulkan driver. Note: In my first attempt at this, I simplified it down to a single upload_shader entrypoint and implemented the caching inside of blorp. This doesn't work, however, because the i965 driver will, on occasion, dump its entire cache and start over. When this happens, blorp needs to be able to recompile its shaders and re-upload them. It's easiest to just expose the caching interface. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 12:17:34 -07:00
Jason Ekstrand	a14d1b63ce	i965/blorp: Add a blorp_context struct and init/finish funcs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-29 12:17:34 -07:00
Mauro Rossi	cd18bbeef3	android: intel: Flatten the makefile structure Android porting of commit `bebc1a1` "intel: Flatten the makefile structure" Automake approach was followed, by moving makefiles a level up, naming them Android.genxml.mk and Android.isl.mk, performing the necessary adjustments to the paths, adding src/intel/Android.mk and fixing mesa top level makefile. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 12:17:34 -07:00
Jan Vesely	083746bc48	clover: Use device cap to query pointer size instead of hardcoded 32bits Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97513 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-29 14:40:15 -04:00
Jan Vesely	c7af84968d	gallium: add cap to export device pointer size v2: document the new cap v3: fix 80 char limit in screen.rst Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-29 14:40:15 -04:00
Brian Paul	f5602c27ec	svga: s/unsigned/enum pipe_shader_type/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-29 12:40:45 -06:00
Jordan Justen	5e76baa2ad	i965/hsw: Enable ARB_ES3_1_compatibility extension Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-08-29 11:23:08 -07:00
Rhys Kidd	b1b7e921f8	r600g: Clean up defined magic numbers for TGSI opcodes Small code clean up that removes magic numbers where a TGSI opcode has been defined. No functional change expected as each opcode is unsupported on the respective hardware. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: James Harvey <lothmordor@gmail.com>	2016-08-29 11:03:20 -07:00
Rhys Kidd	d4cb3ee95c	r600g: Avoid duplicated initialization of TGSI_OPCODE_DFMA As reported by Clang, TGSI_OPCODE_DFMA (defined magic number 118) is currently initialized twice for Cayman and Evergreen. When Jan Vesely added double precision FMA opcode it did make sense to locate it immediately after TGSI_OPCODE_DMAD, although this is out of order. This change cleans up the prior magic number definition and ensures any later reordering of this struct will not create problems. Prior change was: commit `015e2e0fce` Author: Jan Vesely <jan.vesely@rutgers.edu> Date: Sat Jul 2 16:14:54 2016 -0400 r600g: Add double precision FMA ops Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782 Fixes: `54c4d525da` ("r600g: Enable FMA on chips that support it") Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: James Harvey <lothmordor@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: James Harvey <lothmordor@gmail.com>	2016-08-29 11:03:20 -07:00
Rhys Kidd	8ba1fd339c	i915g: Fix typo in i915_translate_instruction() Noticed this error in a debug message whilst reviewing https://bugs.freedesktop.org/show_bug.cgi?id=97477 This patch doesn't go towards fixing that bug, but at least may clarify future debug output. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-29 11:03:20 -07:00
Eric Anholt	60bed14d0f	vc4: Handle discards while in control flow. I missed this while adding loop support because the discard test inside a loop was crashing before, anyway. Fixes piglit glsl-fs-discard-04.	2016-08-29 11:03:11 -07:00
Eric Anholt	b9a74fbec7	vc4: Mark when we add discards while lowering blend state.	2016-08-29 10:57:04 -07:00
Eric Anholt	a99d70d105	nir: Update shader info when adding discards vc4 is about to start using the shader info field to set up discard handling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-29 10:56:59 -07:00
Tim Rowley	fa8f87132a	swr: [rasterizer core] fix GetRasterizerFunc selection Only rasterize scissor edges if one or more scissor/viewport rects are not hottile aligned. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:36 -05:00
Tim Rowley	8e41a65fc5	swr: [rasterizer core] whitespace cleanup Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:30 -05:00
Tim Rowley	cc7f655177	swr: [rasterizer jitter] reimplement SCATTERPS Implement SCATTERPS as a dynamic loop based on mask set bits instead of a static compile time loop. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:23 -05:00
Tim Rowley	c7e21183a1	swr: [rasterizer core] upper left rule for scissors Fixes upper left rule for scissors and viewport/scissor macrotile alignment. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:15 -05:00
Tim Rowley	e54df2c7e4	swr: [rasterizer scripts] undef DEFINE_KNOB after usage Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:10 -05:00
Tim Rowley	a4efbd14d3	swr: [rasterizer core] minor cleanup to thread initialization Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:42:04 -05:00
Tim Rowley	7472a8ee75	swr: [rasterizer core] remove KNOB_MAX_THREADS Use dynamic memory allocation for per-thread data Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:58 -05:00
Tim Rowley	9e4a482d46	swr: [rasterizer core] track guardbands per viewport rect Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:51 -05:00
Tim Rowley	b473bec878	swr: [rasterizer core] per-primitive viewports/scissors - use per-primitive viewports throughout the pipeline. - track whether all available scissor rects are tile aligned. Causes failures, so not taken into account when choosing rasterizer yet. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-29 12:41:16 -05:00
Tom Stellard	63ed11cde9	radeonsi: Don't use global variables for tess lds We were allocating global variables for the maximum LDS size which made the compiler think we were using all of LDS, which isn't the case. Reviewed-By: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-29 16:36:46 +00:00
Roland Scheidegger	f48ccb8c07	softpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds	2016-08-29 18:15:08 +02:00
Roland Scheidegger	c5d7624e1d	llvmpipe: (trivial) honor render_condition_enabled for clear_rt/clear_ds	2016-08-29 18:14:49 +02:00
Kai Wasserbäch	4c53267b8f	gallium: Use enum pipe_shader_type in set_shader_images() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:37 -06:00
Kai Wasserbäch	15fe288dea	gallium: Use enum pipe_shader_type in set_shader_buffers() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:33 -06:00
Kai Wasserbäch	532db3b788	gallium: Use enum pipe_shader_type in set_sampler_views() Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 09:07:25 -06:00
Kai Wasserbäch	7413625ad3	gallium: Use enum pipe_shader_type in bind_sampler_states() (v2) v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 08:45:48 -06:00
Marek Olšák	ed24d79ed7	gallium/radeon: clear dirty_level_mask when discarding CMASK This fixes: GL45-CTS.texture_barrier.* Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-29 14:23:58 +02:00
Marek Olšák	d301efb400	tgsi/scan: remember sampler view types Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-29 14:16:57 +02:00
Nayan Deshmukh	5f0ea3db16	st/vdpau: use temporary buffers while applying filters Use temporary buffers so that we don't read and write to the same surface at the same time. We don't need to use linear layout now. v2: rebase the patch against reverted change Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-29 11:23:56 +02:00
Christian König	77e4424106	st/vdpau: Revert "change the order in which filters are applied(v3)" This reverts commit `09dff7ae2e`. Turned out this can cause some artifacts in the output. Let's revert it for now until we have sorted out all issues. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>	2016-08-29 11:23:51 +02:00
Iago Toral Quiroga	9c9f45b824	i965/vec4: remove the generator hack for dual instanced GS This hack was introduced in commit `03ac2c7223`: i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs Specifically to fixup the code we emitted to deal with gl_PointSize inputs in dual instance mode, where we were emitting a MOV to copy the point size from .w (where the hardware delivers it) to .x (because code will expect this to be a float). This meant that we were emitting a MOV to an ATTR destination that could have a width of 4 (in dual instanced mode) so it was necessary to fix the execution size and regioning of the instruction. Fortunately, Ken fixed this in `67c5d00273`: i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. by using a WWWW swizzle instead of a MOV, and as the commit log in that patch states, we no longer emit instructions with ATTR destinations, so that makes the fixup code in the generator unnecessary. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-29 08:09:09 +02:00
Timothy Arceri	22cec6dc5e	glsl: initialise pointer to NULL Fixes uninitialised warning and Coverity defect. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-29 13:13:42 +10:00
Ilia Mirkin	6a5504de2f	Update Khronos-supplied headers to r33100 As retrieved from opengl.org and khronos.org. Maintained the APPLE hack in GL/glext.h manually. Added gl32.h. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Dave Airlie <airlied@redhat.com>	2016-08-28 21:41:47 -04:00
Ilia Mirkin	d49a231c33	mesa: add EXT_texture_cube_map_array support This is identical to OES_texture_cube_map_array support. dEQP has tests which use this extension. Also it is part of AEP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	4ec1c2bb7f	mesa: remove OES_shader_io_blocks enable This extension should just be available whenever ES 3.1 is available. With the new extension verification infrastructure, it will only be enable-able on a #version 310 es shader, rendering the original reason for having a separate enable moot. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	89e95d15f9	main: use KHR_blend_equation_advanced enable for ES 3.2 availability Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-28 21:38:55 -04:00
Ilia Mirkin	05b37e20de	main: add missing EXTRA_END in OES_sample_variables get check Fixes: `3002296cb6` (mesa: add GL_OES_shader_multisample_interpolation support) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-28 21:38:55 -04:00
Jose Fonseca	09dafb9630	scons: Take indirect gl_and_es_API.xml dependencies in consideration. Same as `26a8f76ba1`. Trivial.	2016-08-27 22:59:06 +01:00
Ilia Mirkin	5b18e5fd7b	docs: sort extensions in relnotes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-27 17:51:44 -04:00
Jason Ekstrand	fb89551047	isl: Allow multisampled array textures This probably isn't the only thing that needs to be done to get multisampled array textures working in Vulkan but I think this is all that ISL really needs and it does fix 8 of the new CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-26 19:00:02 -07:00
Ian Romanick	cf7be70aa7	mesa/version: OpenGL ES 3.2 depends on OES_texture_cube_map_array This has a separate enable from ARB_texture_cube_map_array. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	b387bc90c8	i965: Enable OES_texture_cube_map_array on Gen8+ These are the only platforms that currently expose OES_geometry_shader. Once OpenGL ES 3.1 and OES_geometry_shader are enabled on Gen7, this extension can be enabled there as well. Gen6 will never get OpenGL ES 3.1, so it will never get this extension... even though it has the desktop OpenGL extension. Alas. NOTE: This causes a failure on Gen8+ platforms in ES3-CTS.gtf.GL3Tests.texture_storage.texture_storage_texture_targets. The test only fails because it doesn't know that 0x9009 is a valid value when the extension exists. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	dc4f53b683	mesa: Add support for OES_texture_cube_map_array This has a separate enable flag because this extension also requires OES_geometry_shader. It is possible that some drivers may support OpenGL ES 3.1 and ARB_texture_cube_map but not support OES_geometry_shader. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	87fa462ffd	mesa: Add and use _mesa_has_texture_cube_map_array helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	66b988d09a	mesa: Use _mesa_has_ARB_texture_cube_map_array instead of open-coding it Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	daf1a61e11	mesa: Cosmetic changes in legal_texobj_target Use bool instead of GLboolean and constify ctx. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	d79c950eeb	mesa: Rearrange legal_texobj_target to look more like _mesa_legal_get_tex_level_parameter_target This makes it a bit easier to add support for more features in different APIs. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	ef5bad09c4	glsl: Add and use has_texture_cube_map_array helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	c879dbc4e4	glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10 All the GLSL 4.x keywords were added to the list of reserved keywords in GLSL ES 3.10. As far as I can tell, these are the only ones that were missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	8fb4af7789	glsl: Silence unused parameter warning glsl/lower_buffer_access.cpp:324:55: warning: unused parameter ‘var’ [-Wunused-parameter] ir_variable *var, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	63af53dcd3	i965: Enable GL_OES_geometry_shader on Gen8+ Gen7 can get this extension (and GL_OES_shader_io_blocks) as soon as the rest of OpenGL ES 3.1 is enabled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	259fc50545	glsl/linker: Fail linking on ES if uniform precision qualifiers don't match When GL_OES_geometry_shader is enabled, this fixes dEQP-GLES31.functional.shaders.linkage.geometry.uniform.rules.type_mismatch_1. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:15 -07:00
Ian Romanick	06201e4f1a	glsl: Allow invocations layout qualifier with GL_OES_geometry_shader Fixes dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_2d_array dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_2d_multisample_array dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_3d dEQP-GLES31.functional.geometry_shading.instanced.invocation_per_layer_cubemap dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_2d_array dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_2d_multisample_array dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_3d dEQP-GLES31.functional.geometry_shading.instanced.multiple_layers_per_invocation_cubemap dEQP-GLES31.functional.geometry_shading.query.geometry_shader_invocations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	3a0ae7b55c	glsl: Allow gl_InvocationID and gl_Layer with GL_OES_geometry_shader Fixes dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_array dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_multisample_array dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_3d dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_cubemap v2: Don't enable gl_ViewportIndex in GLSL ES 3.20. Noticed by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	1a72fbf9e6	mesa: Allow GL_EXT_geometry_shader and GL_EXT_geometry_point_size Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	658e90f9a8	mesa: Document reasons for allowing XFB drawing modes in GLES 3.1 w/GL_OES_geometry_shader Originally this patch added the checks to allow the draw calls with XFB, but commit `2dabd497` beat me to it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Ian Romanick	aa228eb1a6	mesa: Remove redundant _mesa_has_shader_subroutine The checks in _mesa_has_shader_subroutine are slightly different than _mesa_has_ARB_shader_subroutine, but they're not different in a way that matters. The only way to have ctx->Version >= 40 is if ctx->Extensions.ARB_shader_subroutine is set. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-26 15:03:14 -07:00
Ian Romanick	0115f356ee	nouveau: Enable EXT_texture_env_dot3 on NV10 and NV20 GL_DOT3_RGB_EXT and GL_DOT3_RGBA_EXT are nearly identical to GL_DOT3_RGB and GL_DOT3_RGBA. The only difference is the _EXT versions do not apply the post-scale. Just smash logscale to 0 so that RC_OUT_SCALE_1 is always used. NOTE: I have not actually tested this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-26 15:03:14 -07:00
Ian Romanick	a7d92c3c0b	nouveau: Fix non-1x post-scale factor with DOT3 combiner Fixes long standing bug on NV10 and NV20 where using a non-1x RGB or A post-scale with GL_DOT3_RGB or GL_DOT3_RGBA texture environment would not work. The old combiner math uses HALF_BIAS_NORMAL and HALF_BIAS_NEGATE. The GL_NV_register_combiners defines these as HALF_BIAS_NORMAL_NV max(0.0, e) - 0.5 HALF_BIAS_NEGATE_NV -max(0.0, e) + 0.5 In order to get the correct result from the dot-product, the intermediate dot-product must be multiplied by 4. This is a literal implementation of the GL_ARB_texture_env_dot3 spec. It also requires using the register combiner post-scale. As a result, the post-scale cannot be used for the post-scale set by the application. The new combiner math uses EXPAND_NORMAL and EXPAND_NEGATE. The GL_NV_register_combiners defines these as EXPAND_NORMAL_NV 2.0 * max(0.0, e) - 1.0 EXPAND_NEGATE_NV -2.0 * max(0.0, e) + 1.0 Since this fully expands the value to [-1, 1] range, the intermediate dot-product result is the desired value. This leaves the register combiner post-scale available for application use. NOTE: I have not actually tested this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-26 15:03:14 -07:00
Ian Romanick	f926cf5bd0	docs: Rename GL3.txt to features.txt Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-26 15:03:14 -07:00
Ian Romanick	8cd5c3cfe7	docs: Update GL3.txt for OpenGL 4.x on i965-ish hardware v2: Note that GL_KHR_blend_equation_advanced and GL_KHR_blend_equation_advanced_coherent are done. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-26 15:03:14 -07:00
Nicholas Bishop	d0a4c36dd6	docs: add links to clarify patch mailing section * Changed "Mesa mailing list" to "mesa-dev mailing list" to clarify which list patches should be sent to * Added an explicit link to https://lists.freedesktop.org/mailman/listinfo/mesa-dev to show where to subscribe to the list * Added a link to https://git-scm.com/docs/git-send-email to help new users of that command v2: add signed-off-by Signed-off-by: Nicholas Bishop <nicholasbishop@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-26 14:54:26 -07:00
Brian Paul	ea33df7b58	svga: minor whitespace, etc clean-ups in svga_pipe_misc.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	8433b43337	svga: move some code in svga_propagate_surface() Move computation of zslice, layer inside the conditional where they're used. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	1a10b37ac3	svga: simplify surface propagation code in svga_set_framebuffer_state() Rewrite the comment too. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	bb7f094b37	svga: add some comments in the svga_surface struct Give more info about backing resources/surfaces. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	dcf63339e7	svga: use new svga_check_sampler_framebuffer_resource_collision() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	ff500ed5a1	svga: add new svga_check_sampler_framebuffer_resource_collision() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	d3d20d650d	svga: remove assertions in svga_surface cast wrappers We don't do this for other cast wrappers. And this will simplify some code at call sites. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	c6e89fa215	svga: minor code simplification in svga_texture_transfer_unmap() Use the tex variable instead of using svga_texture() again. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	fe5a2704ec	svga: reformat some expressions in svga_texture_transfer_map() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	10ef6ddcf9	svga: remove duplicated variable in svga_texture_transfer_map() tex was already declared at the function body scope. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:19 -06:00
Brian Paul	09d2780b39	svga: move some assignments in svga_texture_transfer_map() Put near other assignments to the svga_transfer variable. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	4a52512666	svga: minor simplifications in svga_texture_transfer_map() Use local vars instead of jumping through a pointer. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	088dd8f45e	svga: minor reformatting of svga_texture() cast wrapper Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	e206f67261	svga: rewrite svga_buffer() cast wrapper To make it symmetric with the svga_texture() cast wrapper. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Brian Paul	c72dcd9a71	svga: remove local variable in create_backed_surface_view() To simplify the code a bit. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-26 14:20:18 -06:00
Kenneth Graunke	bc13e5f42a	docs: Add GL_KHR_blend_equation_advanced to relnotes.	2016-08-26 13:17:22 -07:00
Mario Kleiner	2cc880cba5	r600: increase performance for DRI PRIME offloading if 2nd GPU is Evergreen+ This is a direct port of Marek Olšák's patch "radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI" to r600. It uses SDMA for the detiling blit from renderoffload VRAM to GTT, as SDMA is much faster for tiled->linear blits from VRAM to GTT. Testing on a dual Radeon HD-5770 setup reduced the time for the render offload gpu to get its rendering into system RAM from approximately 16 msecs for simple rendering at 1920x1080 pixel 32 bpp to 5 msecs, a > 3x speedup! This was measured using ftrace to trace the time the radeon kms driver waited on the dmabuf fence of the renderoffload gpu to complete. All in all this brought the time for a flip down from 20 msecs to 9 msecs, so the prime setup can display at full 60 fps instead of barely 30 fps vsync'ed. The current r600 implementation supports SDMA on Evergreen and later, but not R600/R700 due to some bugs apparently present in their SDMA implementation. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-26 19:57:21 +02:00
Jordan Justen	7970238fcf	docs: Update stencil texturing & ES 3.1 status for i965 Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	93f5eb7ae7	i965: Enable OpenGLES 3.1 for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	116b6e12d4	i965: Enable ARB_texture_stencil8 for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	f20f616324	i965: Enable ARB_stencil_texturing for Haswell Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	751682434e	i965/gen7: Use R8_UINT stencil copy when sampling the stencil texture v2: * Check gen <= 7, rather than gen == 7. (Ian) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	8d78b096f8	i965/gen7: Copy stencil when sampling the stencil texture Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	7af51b8f03	i965: Add function to copy a stencil miptree to an R8_UINT miptree v2: * Cleanups suggested by Ian, Matt and Topi Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	c8194dc737	i965: Track that the stencil data was updated when using Tex*Image Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	101b56bab2	i965: Track that the stencil data was updated when rendering Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	7bd87c1e6e	i965: Track that the stencil data was updated when clearing Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	2a9c65a01d	i965/gen7: Add R8_UINT stencil miptree copy for sampling For gen < 8, we can't sample from the stencil buffer, which is required for the ARB_stencil_texturing extension. We'll make a copy of the stencil data into a new texture that we can sample using the R8_UINT surface type. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	91627d1956	i965: Fix assert with multisampling and cubemaps Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	b82bb98441	i965/hsw: Adjust uploading default color for stencil surfaces v2: * has_component (Ken); const bits_per_channel (Topi) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	30fee52036	i965/hsw: Don't advertise more than 64 threads for compute shaders thread_width_max in the GPGPU walker command limits us to a maximum of 64 threads. This fixes a crash on Haswell in the OpenGLES 3.1 conformance test suite which tests the advertised limits of the max invocation counts. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	861c9cbee3	main: Add MESA_VERBOSE=api support for glClearStencil Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Jordan Justen	9a1f950bef	main: Add MESA_VERBOSE=api support for glTexImage Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 10:09:22 -07:00
Charmaine Lee	0035f7f136	svga: add guest statistic gathering interface This file was supposed to be added with the previous "svga: add guest statistic gathering interface" patch but went MIA for some reason. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 08:04:02 -06:00
Marek Olšák	49c798e902	radeonsi: disable CE on SI + AMDGPU Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	281f1a5980	winsys/amdgpu: disable IB chaining on SI Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	a6869e7c06	winsys/amdgpu: finish up SI addrlib integration Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-26 15:50:10 +02:00
Ronie Salgado	97b55243fb	winsys/amdgpu: initial SI support Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-26 15:50:10 +02:00
Marek Olšák	971ef7518f	gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS If the kernel driver doesn't support it, it returns 0. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	7172906c0c	radeonsi: fix printing shaders and states on a VM fault This was missed while rewriting the PIPE_DUMP flags. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	5ee3cac138	radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI SDMA is much faster for tiled->linear blits from VRAM to GTT. I have Bonaire in my second PCIe slot. $ glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD TONGA ... $ DRI_PRIME=1 glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ... Without SDMA: $ DRI_PRIME=1 glxgears 8796 frames in 5.0 seconds = 1759.074 FPS 8899 frames in 5.0 seconds = 1779.672 FPS With SDMA: $ DRI_PRIME=1 glxgears 12765 frames in 5.0 seconds = 2552.788 FPS 12888 frames in 5.0 seconds = 2577.495 FPS The 1st GPU is irrelevant. The improvement should be much lower at 60 fps, but definitely measurable. SI will get this once we add SDMA blit support for it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	0241d8300f	radeonsi: enable SDMA on CIK It passes R600_DEBUG=testdma on Bonaire/radeon. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	bcfd49e511	gallium/radeon: increase priority for shader binaries Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	c3f716fe67	gallium/radeon: merge USER_SHADER and INTERNAL_SHADER priority flags there's no reason to separate these Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Miklós Máté	b9ac72b511	vbo: set draw_id Fixes conditional jump depending on uninitialized value in si_state_draw.c:593 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 07:34:22 -06:00
Neha Bhende	10f6e08549	svga: fix regression related to srgb This regression was caused by commit `3190c7ee97`. The regression stems from following the OpenGL 4.4 spec rules related to GL_FRAMEBUFFER_SRGB in Mesa. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	3b7341d547	svga: use local variable blit instead of pointer Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	b09e4ab13c	svga: s/INDEX_0D/INDEX_IMMEDIATE32/ Both are zero, but the latter is the right token.	2016-08-26 06:19:52 -06:00
Brian Paul	93779b87a1	svga: add comment about unsupported blend modes	2016-08-26 06:19:52 -06:00
Charmaine Lee	b1772651b7	svga: fix ordering of mksstats counter strings String for SVGA_STATS_COUNT_TEXREADBACK was swapped with the string for SVGA_STATS_COUNT_SURFACEWRITEFLUSH. Trivial fix.	2016-08-26 06:19:52 -06:00
Charmaine Lee	2781d60375	svga: avoid emitting redundant SetShaderResource command Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Charmaine Lee	5313b294e6	svga: add a cleanup function to clean up sampler state This patch adds a cleanup function to clean up sampler state at context destruction time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	e292f38c6c	svga: loosen the condition to flush in get_query_result_vgpu10() Fixes piglit spec/ext_transform_feedback/overflow-edge-cases segfaults because the query's fence pointer was null. Tested with Piglit, Sauerbraten, ETQW. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Brian Paul	99d8fe20ab	svga: fix vgpu10 query fencing We don't want to flush the command buffer or sync on the fence when ending a query (that kind of defeats the whole purpose of async queries). Do that instead in get_query_result(). Tested with Piglit, arbocclude, Sauerbraten game, Nobel Clinician Viewer, ETQW. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Charmaine Lee	3f51a3f6ac	svga: avoid emitting redundant DXSetSamplers command This patch avoids emitting redundant DXSetSamplers commands. Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	6a43148e20	svga: enable ARB_clear_texture extension in the driver. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:52 -06:00
Neha Bhende	2111795d51	svga: define svga_clear() in svga_init_clear_functions() Put all the clearing related functions in svga_init_clear_functions() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	40557ae07c	svga: add svga_init_clear_functions() define svga_init_clear_functions() and svga_clear_texture as svga->pipe.clear_texture. This is part of ARB_clear_texture extension Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	52d88b67be	svga: add new function svga_clear_texture() This function can be used to clear a texture. It is part of the ARB_clear_texture extension, which allows a texture to be cleared with given color values. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Neha Bhende	1da538f85b	svga: add new begin_blit() Saving all blitter states will be done in begin_blit() so that begin_blit() can be used before performing any blit operation. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	a5fd54f8bf	svga: add opt to the list of valid build types For opt build, add VMX86_STATS to the list of cpp defines. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	2e1cfcc431	svga: add guest statistic gathering interface With this patch, guest statistic gathering interface is added to svga winsys interface that can be used to gather svga driver statistic. The winsys module can then share the statistic info with the VMX host via the mksstats interface. The statistic enums used in the svga driver are defined in svga_stats_count and svga_stats_time in svga_winsys.h Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Charmaine Lee	4791991808	svga: fix indirect non-indexable temp access If the shader has indirect access to non-indexable temporaries, convert these non-indexable temporaries to indexable temporary array. This works around a bug in the GLSL->TGSI translator. Fixes glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test on DX11Renderer. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-26 06:19:51 -06:00
Brian Paul	d221a6545c	gallium/hud: move signo declaration inside PIPE_OS_UNIX block To silence unused var warning with MSVC, MinGW. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-26 06:19:51 -06:00
Chris Wilson	f92a87a140	i965: Embrace "unlimited" GTT mmap support From about kernel 4.9, GTT mmaps are virtually unlimited. A new parameter, I915_PARAM_MMAP_GTT_VERSION, is added to advertise the feature so query it and use it to avoid limiting tiled allocations to only fit within the mappable aperture. A couple of caveats: - fence support is still limited by stride to 262144 and the stride needs to be a multiple of tile_width (as before, and same limitation as the current 3D pipeline in hardware) - the max_gtt_map_object_size forcing untiled may be hiding a few bugs in handling of large objects, though none were spotted in piglits. See kernel commit 4cc6907501ed ("drm/i915: Add I915_PARAM_MMAP_GTT_VERSION to advertise unlimited mmaps"). v2: Include some commentary on mmap virtual space vs CPU addressable space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-08-26 09:09:34 +01:00
Tobias Klausmann	bc5be5323f	mesa/main: Fix missing return in non void function This was found by obs: I: Program returns random data in a function E: Mesa no-return-in-nonvoid-function main/program_resource.c:109 v2: Remove the ! on the string (Ian Romanick) Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-26 08:46:03 +02:00
Kenneth Graunke	219a451497	i965: Implement GL_KHR_blend_equation_advanced_coherent on Gen9+. We always use a coherent read, and ignore the "opt out" enable flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	1bf9b2a600	mesa: Implement GL_KHR_blend_equation_advanced_coherent. This adds the extension enable (so drivers can advertise it) and the extra boolean state flag, GL_BLEND_ADVANCED_COHERENT_KHR, which can be set to request coherent blending. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	c2b10cabed	i965: Enable GL_KHR_blend_equation_advanced on G45 and later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	40241d40d0	i965: Disable hardware blending if advanced blending is in use. We'll do blending in the shader in this case, so just disable the hardware blending. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	8ab50f5dd1	glsl: Add a lowering pass to handle advanced blending modes. Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and need to emulate it in the pixel shader. This lowering pass implements all the necessary math for advanced blending. It fetches the existing framebuffer value using the MESA_shader_framebuffer_fetch built-in variables, and the previous commit's state var uniform to select which equation to use. This is done at the GLSL IR level to make it easy for all drivers to implement the GL_KHR_blend_equation_advanced extension and share code. Drivers need to hook up MESA_shader_framebuffer_fetch functionality: 1. Hook up the fb_fetch_output variable 2. Implement BlendBarrier() Then to get KHR_blend_equation_advanced, they simply need to: 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled 4. Call this lowering pass. Very little driver specific code should be required. v2: Handle multiple output variables per render target (which may exist due to ARB_enhanced_layouts), and array variables (even with one render target, we might have out vec4 color[1]), and non-vec4 variables (it's easier than finding spec text to justify not handling it). Thanks to Francisco Jerez for the feedback. v3: Lower main returns so that we have a single exit point where we can add our blending epilogue (caught by Francisco Jerez). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	e299661166	compiler: Add a new STATE_VAR_ADVANCED_BLENDING_MODE built-in uniform. This will be used for emulating GL_KHR_blend_equation_advanced features in shader code. We'll pass in the blending mode that's in use, and use that in (effectively) a switch statement in the shader. v2: Use the new _AdvancedBlendMode field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	acf57fcf7f	mesa: Add draw time validation for advanced blending modes. v2: Add null checks (requested by Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:10 -07:00
Kenneth Graunke	75ae338d14	mesa: Restyle _mesa_check_blend_func_error(). I'm about to add more error conditions to this function, so I wanted to move the current spec citation above the code that checks it. Indenting it required reformatting, so I tried to move it to our newer style. While there, I also decided to drop some GL type usage, and drop the unnecessary "_mesa_" prefix on a static function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	0745e039a2	mesa: Track the current advanced blending mode. This will be useful for a number of things: - Checking the current advanced blending mode against the shader's blend_support_* qualifiers. - Disabling hardware blending when emulating advanced blending. - Uploading the current advanced blending mode as a state var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	74837e3e91	mesa: Allow advanced blending enums in glBlendEquation[i]. Don't allow them in glBlendEquationSeparate[i], though, as required by the spec. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	80df3c030e	glsl: Merge blend_support qualifiers when linking. Since each qualifier represents a blending mode the shader can be used with, we take the union of all possible modes when linking. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	4b6819b407	glsl: process blend_support_* qualifiers v2 (Ken): Add a BLEND_NONE enum value (no qualifiers in use). v3 (Ken): Rename gl_blend_support_qualifier to gl_advanced_blend_mode. v4 (Ken): Mark map[] as static const (Ilia). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	e682f94594	glsl: add basic KHR_blend_equation_advanced infrastructure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	3b0406457a	mesa: add KHR_blend_equation_advanced enable and extension string Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Ilia Mirkin	a8ae1bc767	glapi: add KHR_blend_equation_advanced dispatch v2 (Ken): Fix enum values, drop _mesa_BlendBarrierKHR stub as Curro has already implemented it. v3 (Ken): Rework for _mesa_BlendBarrierKHR -> _mesa_BlendBarrier rename. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	1a1f4496c6	mesa: Rename _mesa_BlendBarrierMESA to _mesa_BlendBarrier. Note that _mesa_BlendBarrierMESA is not currently hooked up in the glapi XML, so we can just rename it. We'll hook it up for the KHR_blend_equation_advanced extension shortly. We may as well use the ES 3.2 core name with no suffixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-25 19:22:09 -07:00
Kenneth Graunke	c2fd6b0f5d	i965: Safely iterate the predecessors of the end block. We want to insert code in each of the predecessors of the end block. This code includes a nir_if, which would split the block, altering the set. To avoid that, I emitted a dead constant at the end of each block before splitting it, so that the set of predecessors remained unchanged. This was admittedly ugly. Connor suggested instead saving a copy of the set, so we can iterate it safely. This is also a little ugly, but a much better plan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	3203fe3d50	nir: Use nir_shader_get_entrypoint in TCS quad workaround code. We want to insert the code at the end of the program. Looping over all the functions (of which there was only one) was the old way of doing this, but now we have nir_shader_get_entrypoint(), so let's use it. Suggested by Connor Abbott. v2: Update for nir_shader_get_entrypoint API change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	93bfa1d7a2	nir: Change nir_shader_get_entrypoint to return an impl. Jason suggested adding an assert(function->impl) here. All callers of this function actually want ->impl, so I decided just to change the API. We also change the nir_lower_io_to_temporaries API here. All but one caller passed nir_shader_get_entrypoint(), and with the previous commit, it now uses a nir_function_impl internally. Folding this change in avoids the need to change it and change it back. v2: Fix one call I missed in ir3_compiler (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:24 -07:00
Kenneth Graunke	8479b03c58	nir: Make nir_lower_io_to_temporaries store an impl internally. This changes the pass internals to work with a nir_function_impl directly rather than a nir_function. The next patch will change the API. v2: Rebase after framebuffer fetch landed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 19:18:11 -07:00
Francisco Jerez	da85b5a9f1	i965: Expose shader framebuffer fetch extensions on Gen9+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	4135fc22ff	i965/fs: Hook up coherent framebuffer reads to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	be12a1f36e	i965/fs: Remove special casing of framebuffer writes in scheduler code. The reason why it was safe for the scheduler to ignore the side effects of framebuffer write instructions was that its side effects couldn't have had any influence on any other instruction in the program, because we weren't doing framebuffer reads, and framebuffer writes were always non-overlapping. We need actual memory dependency analysis in order to determine whether a side-effectful instruction can be reordered with respect to other instructions in the program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:09 -07:00
Francisco Jerez	3daa0fae4b	i965/fs: Don't CSE render target messages with different target index. We weren't checking the fs_inst::target field when comparing whether two instructions are equal. For FB writes it doesn't matter because they aren't CSE-able anyway, but this would have become a problem with FB reads which are expression-like instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	db123df747	i965/fs: Define logical framebuffer read opcode and lower it to physical reads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	f2f75b0cf0	i965/fs: Define framebuffer read virtual opcode. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	71d639f69e	i965/disasm: Fix RC message type strings on Gen7+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	26ac16fe2f	i965/eu: Add codegen support for the Gen9+ render target read message. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	29eb8059fd	i965/eu: Take into account the target cache argument in brw_set_dp_read_message. brw_set_dp_read_message() was setting the data cache as send message SFID on Gen7+ hardware, ignoring the target cache specified by the caller. Some of the callers were passing a bogus target cache value as argument relying on brw_set_dp_read_message not to take it into account. Fix them too. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	8a2f19a777	i965: Flip the non-coherent framebuffer fetch extension bit on G45-Gen8 hardware. This is not enabled on the original Gen4 part because it lacks surface state tile offsets so it may not be possible to sample from arbitrary non-zero layers of the framebuffer depending on the miptree layout (it should be possible to work around this by allocating a scratch surface and doing the same hack currently used for render targets, but meh...). On Gen9+ even though it should mostly work (feel free to force-enable it in order to compare the coherent and non-coherent paths in terms of performance), there are some corner cases like 1D array layered framebuffers that cannot be handled easily by the non-coherent path because of the incompatible layout in memory of 1D and 2D miptrees (it should be possible to work around this too by doing state-dependent recompiles, but it's hard to care enough since Gen9 has native support for coherent render target reads...) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	ecc4800383	i965: Implement glBlendBarrier. This is a no-op if the platform supports coherent framebuffer fetch. If it doesn't, we just need to flush the render cache and invalidate the texture cache in order for previous rendering to be visible to framebuffer fetch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	786108e7b2	i965: Upload surface state for non-coherent framebuffer fetch. This iterates over the list of attached render buffers and binds appropriate surface state structures to the binding table block allocated for shader framebuffer read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:08 -07:00
Francisco Jerez	dc96968dbf	i965: Implement support for overriding the texture target in brw_emit_surface_state. This allows the caller to bind a miptree using a texture target other than the one it was created with. The code should work even if the memory layouts of the specified and original targets don't match, as long as the caller only intends to access a single slice of the miptree structure. This will be exploited by the next commit in order to support non-coherent framebuffer fetch of a single layer of a 3D texture (since some generations lack the minimum array element control for 3D textures bound to the sampler unit), and multiple layers of a 1D array texture (since binding it as an actual 1D array texture would require state-dependent recompiles because the same shader couldn't simultaneously work for 1D and 2D array textures due to the different texel fetch coordinate ordering). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	49ea2bd175	i965: Massage argument list of brw_emit_surface_state(). This commit does three different things in a single pass in order to keep the amount of churn low: Remove the for_gather boolean argument which was unused, pass the isl_view argument by value rather than by reference since I'll have to modify it from within the function, and add a target argument to allow callers to bind textures using a target other than the original. The prototype of the function now looks like: void brw_emit_surface_state(struct brw_context *brw, struct intel_mipmap_tree *mt, GLenum target, struct isl_view view, uint32_t mocs, uint32_t *surf_offset, int surf_index, unsigned read_domains, unsigned write_domains); Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	74e4baec59	i965: Add missing has_surface_tile_offset flag to the Gen8+ device info structures. This surface state control has been supported by all hardware generations since G45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	0fe732e66f	i965: Return the correct layout from get_isl_dim_layout for pre-ILK cube textures. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	5759eb458b	i965: Factor out isl_surf_dim/isl_dim_layout calculation into functions. The logic to calculate the right layout and dimensionality for a given GL texture target is going to be useful elsewhere, factor it out from intel_miptree_get_isl_surf(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	99fb167839	i965: Resolve color for non-coherent FB fetch at UpdateState time. This is required because the sampler unit used to fetch from the framebuffer is unable to interpret non-color-compressed fast-cleared single-sample texture data. Roughly the same limitation applies for surfaces bound to texture or image units, but unlike texture sampling, non-coherent framebuffer fetch is by definition non-coherent with previous rendering, so the brw_render_cache_set_check_flush() call can be omitted except after resolve. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	071665c161	i965: Return whether the miptree was resolved from intel_miptree_resolve_color(). This will allow optimizing out the cache flush in some cases when resolving wasn't necessary. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	f24e393bd5	i965/fs: Translate nir_intrinsic_load_output on a fragment output. This gets the non-coherent framebuffer fetch path hooked up to the NIR front-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:07 -07:00
Francisco Jerez	b00a236d6a	i965/fs: Allocate fragment output temporaries on demand. This gets rid of the duplication of logic between nir_setup_outputs() and get_frag_output() by allocating fragment output temporaries lazily whenever get_frag_output() is called. This makes nir_setup_outputs() a no-op for the fragment shader stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	7dac882073	i965/fs: Rework representation of fragment output locations in NIR. The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of a given output within the nir_output array is dependent on the type and size of all previously allocated outputs. Instead this defines the driver location of an output to be the pair formed by its GLSL-assigned location and index (I've borrowed the bitfield macros from brw_defines.h in order to represent the pair of integers as a single scalar value that can be assigned to nir_variable_data::driver_location). nir_assign_var_locations is no longer useful for fragment outputs. Because fragment outputs are now allocated independently rather than within the nir_outputs array, the get_frag_output() helper becomes necessary in order to obtain the right temporary register for a given location-index pair. The type_size helper passed to nir_lower_io is now type_size_dvec4 rather than type_size_vec4_times_4 so that output array offsets are provided in terms of whole array elements rather than in terms of scalar components (dvec4 is the largest vector type supported by the GLSL so this will cause all individual fragment outputs to have a size of one regardless of the type). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	4e990b67ce	i965: Fix undefined signed overflow in INTEL_MASK for bitfields of 31 bits. Most likely we had only ever used this macro on bitfields of less than 31 bits -- That's going to change shortly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	f3cb2c34f2	i965/fs: Special-case nir_intrinsic_store_output for the fragment shader. I'm about to change how fragment shader output locations are represented, so the generic nir_intrinsic_store_output implementation that assumes that outputs are just contiguous elements in the big nir_outputs array won't work anymore. This somewhat simplified implementation of nir_intrinsic_store_output for fragment shaders should be functionally equivalent to the current fall-back one. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	af0cc743e6	i965/fs: Implement non-coherent framebuffer fetch using the sampler unit. v2: Memoize sample ID, misc codestyle changes. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	fe6abb5755	i965/fs: Emit interpolation setup if non-coherent framebuffer fetch is in use. This will be required for the next commit since the non-coherent path makes use of the fragment coordinates implicitly, so they need to be calculated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	98d61ee083	i965/fs: Force per-sample dispatch if the shader reads from a multisample FBO. The result of a framebuffer fetch from a multisample FBO is inherently per-sample, so the spec requires at least those sections of the shader that depend on the framebuffer fetch result to be executed once per sample. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	08705badfe	i965: Allocate space in the binding table for non-coherent FB fetch. Unfortunately due to the inconsistent meaning of some surface state structure fields, we cannot re-use the same binding table entries for sampling from and rendering into the same set of render buffers, so we need to allocate a separate binding table block specifically for render target reads if the non-coherent path is in use. The slight noise is due to the change of brw_assign_common_binding_table_offsets to return the next available binding table index rather than void. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	40b23ad57e	i965/fs: Add brw_wm_prog_key bit specifying whether FB reads should be coherent. Some of the following changes in this series are specific to the non-coherent path, so I need some way to tell whether the coherent or non-coherent path is in use. The flag defaults to the value of the gl_extensions::MESA_shader_framebuffer_fetch enable so that it can be overridden easily on hardware that supports both framebuffer fetch extensions in order to test the non-coherent path, like: MESA_EXTENSION_OVERRIDE=-GL_EXT_shader_framebuffer_fetch (Of course trying to force-enable the coherent framebuffer fetch extension on hardware without native support won't work and will lead to assertion failures). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:06 -07:00
Francisco Jerez	4a87e4ade7	i965/fs: Get rid of fs_visitor::do_dual_src. This boolean flag was being used for two different things: - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can just set it based on whether the dual_src_output register is valid, which will be the case if the shader writes the secondary blending color. - To decide whether to call emit_single_fb_write() once, or in a loop that would iterate only once, which seems pretty useless. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:36:00 -07:00
Francisco Jerez	aee3d8f0d9	nir: Handle FB fetch outputs correctly in nir_lower_io_to_temporaries. This requires emitting a series of copies at the top of the program from each output variable to the corresponding temporary. The initial copy can be skipped for non-framebuffer fetch outputs whose initial value is undefined, and the final copy needs to be skipped for read-only outputs (i.e. gl_LastFragData), since it would be illegal to emit a store output intrinsic for it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:33:29 -07:00
Francisco Jerez	97ac3eba58	nir: Pass through fb_fetch_output and OutputsRead from GLSL IR. The NIR representation of framebuffer fetch is the same as the GLSL IR's until interface variables are lowered away, at which point it will be translated to load output intrinsics. The GLSL-to-NIR pass just needs to copy the bits over to the NIR program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-25 18:33:29 -07:00
Eric Anholt	00c72acba5	vc4: Add support for fddx/fddy Based vaguely on a patch by jonasarrow on github.	2016-08-25 17:24:11 -07:00
Eric Anholt	e763e19808	vc4: Add register allocation support for MUL output rotation. We need the source to be in r0-r3, so make a new register class for it. It will be up to the surrounding passes to make sure that the r0-r3 allocation of its source won't conflict with any other class requirements on that temp.	2016-08-25 17:24:11 -07:00
Eric Anholt	8ce6526178	vc4: Add support for MUL output rotation. Extracted from a patch by jonasarrow on github.	2016-08-25 17:24:11 -07:00
Eric Anholt	074f1f3c0c	vc4: Add support for the 2-bit LOAD_IMM variants. Extracted and fixed up from a patch by jonasarrow on github. This ended up not getting used for ddx/ddy, but seems like it might still be useful.	2016-08-25 17:24:11 -07:00
Eric Anholt	3da4e38f48	vc4: Add QPU scheduling to handle MUL rotate sources. We need MUL rotates to do ddx/ddy support.	2016-08-25 17:24:11 -07:00
Eric Anholt	b0b99a7952	vc4: Add disassembly for constant MUL rotates	2016-08-25 17:24:11 -07:00
Eric Anholt	b160708e03	vc4: Add real validation for MUL rotation. Caught problems in the upcoming DDX/DDY implementation.	2016-08-25 17:24:11 -07:00
Eric Anholt	31da39ddc9	vc4: Add a QIR value for the QPU element register. This will be used in the ddx/ddy support for "Am I the top half?" or "Am I the left half?" checks.	2016-08-25 17:24:11 -07:00
Chad Versace	5b03975889	i965: Respect miptree offsets in intel_readpixels_tiled_memcpy() Respect intel_miptree_slice::x_offset,y_offset and intel_mipmap_tree::offset. All three may be non-zero when glReadPixels is called on an EGLImage created from the non-base slice of a miptree. Patch 2/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Change-Id: I4b397b27e55a743a7094d29fb0a6a4b6b34352b0	2016-08-25 16:52:00 -07:00
Chad Versace	c82f99e883	i965: Fix miptree layout for EGLImage-based renderbuffers When glEGLImageTargetRenderbufferStorageOES() was given an EGLImage created from the non-base slice of a miptree, intel_image_target_renderbuffer_storage() forgot to apply the intra-tile offsets __DRIimage::tile_x,tile_y to the miptree layout. This patch fixes the problem with a quick hack suitable for cherry-picking. A proper fix requires more thorough plumbing in intel_miptree_create_layout() and brw_tex_layout(). Patch 1/2 that fixes test 'dEQP-EGL.functional.image.create.gles2_cubemap_*'. Reported-by: Haixia Shi <hshi@chromium.org> Diagnosed-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Change-Id: I8a64b0048a1ee9e714ebb3f33fffd8334036450b	2016-08-25 16:52:00 -07:00
Jason Ekstrand	bebc1a1d99	intel: Flatten the makefile structure This pulls isl and genxml into a single make file so that they can properly build in parallel. This isn't terribly important now as genxml just generates sources which happens serially first anyway but it will be more important as we add more stuff to src/intel. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 15:29:48 -07:00
Jason Ekstrand	c19fc5e019	isl/tests: Use a longer path for isl.h The tests assumed that isl would be in the include path but that usually isn't the case. Instead, we usually have src/intel and you need to add an "isl/" prefix. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 15:29:47 -07:00
Jason Ekstrand	8bdf605214	intel/isl/gen9: Only use the magic 1D alignment for GEN9_1D surfaces If the surface has a layout of GEN4_2D then we need to compute a normal 2D alignment and not use the magic linear 1D alignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:11:15 -07:00
Jason Ekstrand	cda1a5dc0e	intel/isl: Pass the dim_layout into choose_alignment_el Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:10:43 -07:00
Jason Ekstrand	f68cfb05fa	intel/isl: Use DIM_LAYOUT_GEN4_2D for tiled 1-D surfaces on SKL The Sky Lake 1D layout is only used if the surface is linear. For tiled surfaces such as depth and stencil the old gen4 2D layout is used. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-08-25 14:09:44 -07:00
Jason Ekstrand	78715c7211	nir/phi_builder: Don't recurse in value_get_block_def In some programs, we can have very deep dominance trees and the recursion can cause us to risk stack overflows. Instead, we replace the recursion with a pair of loops, one at the start and one at the end. This is functionally equivalent to what we had before and it's actually a bit easier to read in the new form without the recursion. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-25 14:08:07 -07:00
Chad Versace	3eddf5219e	.mailmap: Update my address again I joined Google's Chrome OS graphics team.	2016-08-25 13:55:52 -07:00
Matt Turner	e53130cc27	nir: Walk blocks in source code order in lower_vars_to_ssa. Prior to this commit rename_variables_block() is recursively called, performing a depth-first traversal of the control flow graph. The function uses a non-trivial amount of stack space for local variables, which puts us in danger of smashing the stack, given a sufficiently deep dominance tree. XCOM: Enemy Within contains a shader with such a dominance tree (1574 nir_blocks in total, depth of at least 143). Jason tells me that he believes that any walk over the nir_blocks that respects dominance is sufficient (a DFS might have been necessary prior to the introduction of nir_phi_builder). In fact, the introduction of nir_phi_builder made the problem worse: rename_variables_block() walks to the bottom of the dominance tree before calling nir_phi_builder_value_get_block_def(), which walks back to the top of the dominance tree... In any case, this patch ensures we avoid that problem as well. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97225 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-08-25 13:45:39 -07:00
Marek Olšák	a491b9e945	radeonsi: don't use allocas for arrays with LLVM 3.8 It crashes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413	2016-08-25 21:19:17 +02:00
Marek Olšák	fe91ae06d3	gallium/radeon: unify and simplify checking for an empty gfx IB We can take advantage of the fact that multi_fence does the obvious thing with NULL fences. This fixes unflushed fences that can get stuck due to empty IBs.	2016-08-25 21:19:17 +02:00
Matt Turner	e6673e7ac2	mesa: Drop sed of now dead Plo files. gen6/7/8_blorp.c were removed in commits `c8bc1ae96a`, `e198983c61`, and `16a9fcbbb6` respectively.	2016-08-25 11:20:54 -07:00
Kenneth Graunke	6cf8708ce5	meta: Always do GenerateMipmaps in linear colorspace. When generating mipmaps for sRGB textures, force both decode and encode, so the filtering is done in linear colorspace, regardless of settings. Fixes a WebGL conformance test in Chrome: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/misc/tex-srgb-mipmap.html?webglVersion=2 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97322 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-25 11:07:01 -07:00
Eric Engestrom	ed871af91c	configure.ac: raise Mako required version to 0.8.0 It seems [0] old versions of Mako are no longer supported. Emil mentioned it might need v0.8.0 [1] for isl_format_layout [2], although I didn't get a confirmation that it's really the minimum. Let's raise it to that to avoid getting other bugs. We might lower it a bit again later if it turns out we can. [0] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122772.html [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/122775.html [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123278.html Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dave Airlie <Airlied@redhat.com>	2016-08-25 16:51:27 +01:00
Brian Paul	2a2dc416b6	swrast: fix incorrectly positioned putImage() in swrast driver Some front buffer rendering was in the wrong position. This included scissored clears, glDrawPixels and glCopyPixels. The problem was the y coordinate passed to putImage() didn't match the y coordinate passed to getImage(). We fix this by setting xrb->map_y to the inverted coordinate in swrast_map_renderbuffer() which is used later by the putImage() call. Also pass xrb->map_y to getImage() to be symmetric. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97426 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-25 07:19:35 -06:00
Marek Olšák	3ff0b67e1b	radeonsi: disable SDMA texture copying on Carrizo Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-25 14:51:08 +02:00
Marek Olšák	1276316d67	gallium/noop: use 3-space indentation Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-25 14:09:48 +02:00
Marek Olšák	9daaa6f5a6	gallium: add a pipe_context parameter to resource_get_handle radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL interop and this is the only way to make it coherent with the current context. It can optionally be set to NULL. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-25 14:09:48 +02:00
Nicolai Hähnle	b662c70aea	st/mesa: fix sRGB BlitFramebuffer regression Broken since: `3190c7ee97` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97285 Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-25 13:21:05 +02:00
Michel Dänzer	1e3218bc5b	loader/dri3: Overhaul dri3_update_num_back Always use 3 buffers when flipping. With only 2 buffers, we have to wait for a flip to complete (which takes non-0 time even with asynchronous flips) before we can start working on the next frame. We were previously only using 2 buffers for flipping if the X server supports asynchronous flips, even when we're not using asynchronous flips. This could result in bad performance (the referenced bug report is an extreme case, where the inter-frame stalls were preventing the GPU from reaching its maximum clocks). I couldn't measure any performance boost using 4 buffers with flipping. Performance actually seemed to go down slightly, but that might have been just noise. Without flipping, a single back buffer is enough for swap interval 0, but we need to use 2 back buffers when the swap interval is non-0, otherwise we have to wait for the swap interval to pass before we can start working on the next frame. This condition was previously reversed. Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260 Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-25 17:40:24 +09:00
Jason Ekstrand	2301705dee	anv: Include the pipeline layout in the shader hash The pipeline layout affects shader compilation because it is what determines binding table locations as well as whether or not a particular buffer has dynamic offsets. Since this affects the generated shader, it needs to be in the hash. This fixes a bunch of CTS tests now that the CTS is using a pipeline cache. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-24 20:42:05 -07:00
Jason Ekstrand	05f36435ef	anv: Add a --disable-vulkan-icd-full-driver-path option This option makes installed Vulkan ICD files contain only a driver library name and not a path. This is intended for distros to help them work around multi-arch issues. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-08-25 10:32:31 +10:00
Francisco Jerez	c8f5bd2c99	i965/fs: Don't consider the stencil output to be a color output. This would cause gl_FragStencilRef to be counted as a color output incorrectly during the precompile phase, which leads to unnecessary recompilation on master and could trigger an assertion failure in fs_visitor::emit_fb_writes() on my i965-fb-fetch branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	2018371692	glsl: Keep track of the set of fragment outputs read by a GL program. This is the set of shader outputs whose initial value is provided to the shader by some external means when the shader is executed, rather than computed by the shader itself. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	711213fb72	glsl: Don't consider read-only fragment outputs to be written to. Since they cannot be written. This prevents adding fragment outputs to the OutputsWritten set that are only read from via the gl_LastFragData array but never written to. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	913ae618c6	glsl/linker: Allow fragment output overlap for gl_LastFragData. gl_LastFragData overlaps gl_FragData by definition. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	6b3d23dcc0	glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier. v2: No need to check the GLSL version. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	5e1d34394e	glsl: Don't attempt to do dead varying elimination on gl_LastFragData arrays. Apparently this pass can only handle elimination of a single built-in fragment output array, so the presence of gl_LastFragData (which it wouldn't split correctly anyway) could prevent it from splitting the actual gl_FragData array. Just match gl_FragData by name since it's the only built-in it can handle. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	6b33eab959	glsl: Define a gl_LastFragData built-in for older GLSL versions. The EXT_shader_framebuffer_fetch extension defines alternative language for GLES2 shaders where user-defined fragment outputs are not allowed. Instead of using inout user-defined fragment outputs the shader is expected to read from the gl_LastFragData built-in array. In addition this allows using the same language on desktop GLSL versions prior to 4.2 that support the deprecated gl_FragData built-in in preparation for the MESA_shader_framebuffer_fetch desktop GL extension. Both legacy and user-defined inout outputs have a common representation at the GLSL IR level, so it shouldn't make any difference for optimization passes and back-ends whether the application is using gl_LastFragData or user-defined outputs, all they'll see is a variable dereference of a fragment output at a certain interface location with the fb_fetch_output bit set to one. v2: Don't define the built-in variable on GLSL versions for which gl_FragData exists but is deprecated. (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:31 -07:00
Francisco Jerez	19e929a177	glsl: Handle the inout qualifier in fragment shader output declarations. According to the EXT_shader_framebuffer_fetch extension the inout qualifier can be used on ESSL 3.0+ shaders to declare a special kind of fragment output that gets implicitly initialized with the previous framebuffer contents at the current fragment coordinates. In addition we allow using the same language to define FB fetch outputs in GLSL 1.3+ shaders in preparation for the desktop MESA_shader_framebuffer_fetch extensions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	b49d8f20f4	glsl: Add support for representing framebuffer fetch in the GLSL IR. The GLSL IR representation of framebuffer fetch amounts to a single bit in the ir_variable object applicable to fragment shader outputs. The flag indicates that the variable will be implicitly initialized to the previous contents of the render buffer at the same fragment coordinates and sample index. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	d7cd7b9c49	glsl: Add parser state enables for the framebuffer fetch extensions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	303fb5881c	mesa: Add blend barrier entry point and driver hook. Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent variant of KHR_blend_equation_advanced will use this driver hook to request coherency between framebuffer reads and writes. This intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer since the extension isn't exposed to applications yet, see [1] for more details. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	6a976bbf84	mesa: Move shader memory barrier functions into barrier.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	83d2f9db29	mesa: Rename "texturebarrier" source files to "barrier". In preparation for collecting all pipeline barrier GL entry points into a single source file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	642aa58577	mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT. This can currently only give true as result since the only way you can expose EXT_shader_framebuffer_fetch right now is by flipping the MESA_shader_framebuffer_fetch bit, but that could potentially change in the future, see [1] for an explanation. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	115a27357c	mesa: Add extension enables for framebuffer fetch extensions. This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+ contexts if desired. Note that this adds boolean flags for two MESA extensions, but only the EXT GLES-only extension is exposed for the moment, see the cover letter of this series [1] for the rationale. [1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Francisco Jerez	acb12a1228	glapi: Add XML for GL_EXT_shader_framebuffer_fetch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-24 13:28:30 -07:00
Samuel Pitoiset	a227b0a4f1	nvc0: invalidate textures/samplers on GK104+ Like Fermi, textures and samplers are aliased between 3D and compute, especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate these resources when switching between the two pipelines. This fixes a GPU hang with Elemental (and most likely with other UE4 demos). Tested on GK107 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: <mesa-stable@lists.freedesktop.org>	2016-08-24 22:26:36 +02:00
Rhys Kidd	c9c989763a	gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization Duplicate line is currently on 1535. Identified by Clang, when run through Eric Anholt's Travis harness. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-24 11:54:50 -07:00
Eric Anholt	78ab62b1e9	travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	084678ccbb	travis: Enable vc4 in libdrm to satisfy vc4 test build dependency. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	80a872f3f0	travis: Update to the Ubuntu Trusty image. This will hopefully fix wget from x.org (no real reason explained in Travis CI bug reports), and may also mean that we can enable LLVM driver builds. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Eric Anholt	ecbc76cf6e	travis: Parse configure.ac to pick an updated LIBDRM_VERSION. Travis has been broken a couple of times by configure.ac updates. To make it useful, auto-update the version necessary. This could potentially be used for other dependencies, too, but those get bumped less frequently. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2016-08-24 11:54:50 -07:00
Lionel Landwerlin	91987c51e3	anv: meta_blit2d: adapt texel fetch pitch for fake w-tiled We need to compute detiling coordinates using the physical size of W tiling (128x32) rather than the logical size (64x64). v2: Correct comment (Jason) Fixes dEQP-VK.api.copy_and_blit.image_to_image_stencil Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97448 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-24 11:29:23 -07:00
Eric Anholt	87a88f2daa	vc4: Fix GPU hangs with >16 varying values. Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.	2016-08-24 10:43:22 -07:00
Leo Liu	5277f25480	vl/rbsp: fix another three byte not detected This happens when three byte "00 00 03" is partly loaded to vlc->buffer, thus at the bottom of buffer with valid bits is "00" or "00 00" and left like "00 03" or "03" in the data, so that it will not be detected by three byte emulation check. The reason for that is the escaped bit was set to 0 from the rbsp init. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-08-24 11:17:16 -04:00
Marek Olšák	2c13abb491	radeonsi: fix VM faults due to NULL internal const buffers on CIK They are harmless, but the interrupts do decrease performance. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039 Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-08-24 15:39:57 +02:00
Tomasz Figa	577f85e2bb	gallium/winsys/kms: Look up the GEM handle after importing a prime FD drmPrimeFDToHandle() will return the same GEM handle every time the same buffer is imported, even from a different prime FD. Since GEM handles are not reference counted, we need to make sure that each GEM handle is referenced only by one display target struct, by looking it up in kms_sw->bo_list first and bumping the refcount of the found dt on hit and falling back to creating a new dt only on miss. v2: Split into separate function. Use helper function for lookup. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
Tomasz Figa	0465c72d46	gallium/winsys/kms: Move display target handle lookup to separate function As a preparation to use the lookup in more than one place, move the code that looks up given KMS/GEM handle to a separate function. This change should not introduce any functional changes. v2: Split into separate patch. Move lookup code into separate function. v3 [Emil Velikov]: Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan) Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
Tomasz Figa	e71b78ebf9	gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2) Currently kms_sw_displaytarget_add_from_prime() allocates the struct and fills in only some of the fields, resulting in a half-baked struct that needs to be further completed by the caller. To make this a bit more consistent, pass width, height and stride to this function and fill in everything there, so that caller can take the returned struct as is. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:23 +01:00
Tomasz Figa	0aa6a818ef	gallium/winsys/kms: Fix double refcount when importing from prime FD (v2) Currently the code creates a display target struct with refcount field initialized to 1 and then the caller again increments it, leading to a leaked reference. Let's remove the unnecessary increment. v2: Split from one big patch into four fixing one thing at a time. Signed-off-by: Tomasz Figa <tfiga@chromium.org> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 14:39:22 +01:00
Alejandro Piñeiro	b4959e17f1	shaderapi: don't generate not linked error on GetProgramStage in general Neither ARB_shader_subroutine nor the GL core spec lists any error when the program is not linked. We left an error generation for the uniform location, in order to be consistent with other methods from the spec that generate them. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-24 14:57:13 +02:00
Eric Engestrom	9411eb67ec	gallium/cso: avoid unnecessary null dereference The label `out:` calls `destroy()` which dereferences `ctx`. This is unnecessary as there is nothing to destroy. Immediately return instead. CovID: 1258255 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-24 11:35:05 +01:00
Eric Engestrom	2f86582b92	.gitignore: Ignore tags generated by `make tags` Signed-off-by: Eric Engestrom <eric@engestrom.ch> [Emil Velikov: rebase] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:33:48 +01:00
Eric Engestrom	f6b9fb6e4c	st/xvmc: fix a couple 'unused-but-set-variable' warnings Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:32:00 +01:00
Eric Engestrom	49dad1aafd	egl: turn a couple asserts static (compile-time) Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:30:15 +01:00
Eric Engestrom	8af1b540c5	i915: remove unnecessary `if` if (x) return true; else return false; can be simplified as: return x; since `x` is already a boolean expression. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:17:05 +01:00
Eric Engestrom	253274351f	i965: remove unnecessary `if` if (x) return true; else return false; can be simplified as: return x; since both `x` are already boolean expressions. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 11:17:05 +01:00
Alejandro Piñeiro	07fe2d565b	program_resource: subroutine active uniforms should return NumSubroutineUniforms Before this commit, GetProgramInterfaceiv for pname ACTIVE_RESOURCES and all the <shader>_SUBROUTINE_UNIFORM programInterface were returning the count of resources on the shader program using that interface, instead of the num of uniform resources. This would get a wrong value (for example) if the shader has an array of subroutine uniforms. Note that this means that in order to get a proper value, the shader needs to be linked, something that is not explicitly mentioned on ARB_program_interface_query spec, but comes from the general definition of active uniform. If the program is not linked we return 0. v2: don't generate an error if the program is not linked, returning 0 active uniforms instead, plus extra spec references (Tapani Palli) Fixes GL44-CTS.program_interface_query.subroutines-compute Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-24 11:33:04 +02:00
Stencel, Joanna	690ead4a13	egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface. Segfault occurs when destroying EGL surface attached to already destroyed Wayland window. The fix is to set to NULL the pointer of surface's native window when wl_egl_destroy_window() is called. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Stencel, Joanna <joanna.stencel@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-24 10:18:13 +01:00
Kai Wasserbäch	f033d97155	st/va: Remove unused variable coded_size from vlVaEndPicture() Removes the following GCC warning: ../../../../../src/gallium/state_trackers/va/picture.c:542:17: warning: unused variable 'coded_size' [-Wunused-variable] unsigned int coded_size; ^~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-24 10:35:53 +02:00
Kai Wasserbäch	83d08d4cab	st/va: Remove else case in vlVaEndPicture() made superfluous by `c59628d11b` Commit `c59628d11b` made the else statement and duplication of the context->decoder->end_frame() call superfluous. Cc: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-24 10:35:20 +02:00
Eric Engestrom	cd340052ad	st/va: add missing mutex_unlock Fixes: `c59628d11b` ("st/va: enable dual instances encode by sync surface") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-24 10:33:07 +02:00
Kenneth Graunke	e7530bfcd6	aubinator: Style fixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-23 21:19:58 -07:00
Sirisha Gandikota	56ba9656bb	aubinator: Fix the tool to correctly decode the DWords Several fixes have been added as part of this as listed below: 1) Fix the mask and add disassembler handling for STATE_DS, STATE_HS as the mask returned wrong values of the fields. 2) Fix the GEN_TYPE_ADDRESS/GEN_TYPE_OFFSET decoding - the address/ offset were handled the same way as the other fields and that gives the wrong values for the address/offset. 3) Decode nested/recursive structures - Many packets contain nested structures, ex: 3DSTATE_SO_BUFFER, STATE_BASE_ADDRESS, etc contain MOC structures. Previously, the aubinator printed 1 if there was a MOC structure. Now we decode the entire structure and print out its fields. 4) Print out the DWord address along with its hex value - For a better clarity of information, it is helpful to print both the address and hex value of the DWord along with the DWord count. Since the DWord0 contains the instruction code and the instruction length, it is unnecessary to print the decoded values for DWord0. This information is already available from the DWord hex value. 5) Decode the <group> and the corresponding fields in the group- The <group> tag can have fields of several types including structures. A group can contain one or more number of fields and this has to be correctly decoded. Previously, aubinator did not decode the groups or the fields/structures inside them. Now we decode the <group> in the instructions and structures where the fields in it repeat for any number of times specified. v2: Fix the formatting (per Matt) Make the start and end pos calculation to extract fields from a DWord more appropriate by moving %32 away from mask() method Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <ben@bwidawsk.net>	2016-08-23 21:19:55 -07:00
Kristian Høgsberg Kristensen	3e218ad7f8	aubinator: Add a new tool called Aubinator to the src/intel/tools folder. The Aubinator tool is designed to help the driver developers in debugging the driver functionality by decoding the data in the .aub files. Primary Authors of this tool are Damien Lespiau <damien.lespiau at intel.com> and Kristian Høgsberg Kristensen <krh at bitplanet.net>. v2: Review comments are incorporated by Sirisha Gandikota as below: 1) Make Makefile.am more crisp, reuse intel_aub.h from libdrm (per Emil) 2) Aubinator will use platform name instead of GEN number (per Matt) 3) Disassembler gets created based on pciid rather than GEN number (per Matt) 4) Other formatting comments (per Ken, Matt and Emil) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <ben@bwidawsk.net>	2016-08-23 21:19:33 -07:00
Kenneth Graunke	eb1a0ddfd5	glsl: Mark tessellation qualifier maps static const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-23 21:15:59 -07:00
Jason Ekstrand	70bc891c42	isl/formats: Integer formats are not filterable In `ca2a8e5628`, we updated the format table to add more formats (most of which are new on SKL) but accidentally marked some integer formats as filterable. You can't filter an integer format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 16:51:34 -07:00
Ilia Mirkin	361678edd7	st/dri: respect driver's request to avoid mixed color/depth bit configs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:53 -04:00
Ilia Mirkin	9515d651f9	gallium: add a cap to expose whether driver supports mixed color/zs bits Some hardware can't render to color/depth buffers of mixed bitness. When that happens a fallback has to happen, but this allows the driver to express that this isn't an optimal scenario. The purpose of this is to remove such fbconfigs from the GLX/EGL config list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:49 -04:00
Ilia Mirkin	528390021f	dri: add a way to request that modes have matching color/zs depths Some GPUs, notably nv3x/nv4x can't render to mismatched color/zs framebuffer depths. Fallbacks can be done by the driver, with shadow surfaces, but no reason to encourage applications to select non-matching glx visuals. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-23 18:30:30 -04:00
Ilia Mirkin	092f994a03	nv50/ir: make sure cfg iterator always hits all blocks In some very specially-crafted cases, we could attempt to visit a node that has already been visited, and then run out of bb's to visit, while there were still cross blocks on the list. Make sure that those get moved over in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-23 18:30:12 -04:00
Jason Ekstrand	7bdccd104b	anv/clear: Clear E5B9G9R9 images as R32_UINT We can't actually clear these images normally because we can't render to them. Instead, we have to manually unpack the rgb9e5 color value on the CPU and clear it as R32_UINT. We still have a bit of work to do to clear non-power-of-two images, but this should get all of the power-of-two clears working on at least Haswell. This fixes three of the new Vulkan CTS tests in the dEQP-VK.api.image_clearing.clear_color_image.* group. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:25 -07:00
Jason Ekstrand	afa7ca0f77	anv/clear: Make cmd_clear_image take an actual VkClearValue Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	cf3cf2ecfc	anv/blit2d: Add support for RGB destinations This fixes 104 of the new image_clearing and copy_and_blit Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	16ddda8452	anv/blit2d: Add a format parameter to bind_dst and create_iview Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	954c0bfb20	anv/image: Don't create invalid render target surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	ca2a8e5628	isl/formats: Update the table with more samplable formats There were a lot of formats where support was added on Haswell or later but we never updated the format table. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	aba9e25b70	isl/formats: Report ETC as being samplable on Bay Trail Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	f6967ddd32	i965/surface_formats: Don't advertise 8 or 16-bit RGB formats We have implicitly been not advertising these formats since we had them turned off in the format capabilities table. We are about to update that table and this prevents a change in behavior. The only change in behavior created by this patch is that we no longer advertise support for R16G16B16_FLOAT which means that it's now renderable which seems like a bonus. Maybe someday we'll want to change things to start supporting 16-bit RGB formats natively but, at the moment, there's no need. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-23 11:45:24 -07:00
Jason Ekstrand	fb90291dd5	anv/formats: Don't use an RGBX format if it isn't renderable The whole point of using RGBX is so that we can render to it so if it isn't renderable, that kind-of defeats the purpose. Some formats (one example is R32G32B32X32_SFLOAT) exist in the format table but aren't actually renderable. Eventually, we'd like to get away from RGBX entirely, but this fixes hangs on BDW today. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-23 11:45:24 -07:00
Nicolas Boichat	4f3f8bb59d	egl/dri2: dri2_initialize: Do not reference-count TestOnly display In the case where dri2_initialize is called with a TestOnly display, the display is not actually initialized, so dri2_egl_display always fails, and we cannot do any reference counting. Fixes piglit spec@egl_khr_create_context@verify gl flavor (reproducible with LIBGL_ALWAYS_SOFTWARE=1). Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-23 18:08:17 +01:00
Jan Ziak	6687037f1f	vbo: fix format string compiler warning for 32-bit machines Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-23 07:31:28 -06:00
Dongwon Kim	c6e97aaf75	egl/dri2: remove error checks on return values from mtx_lock and cnd_wait This removes unnecessary error checks on the return values of mtx_lock and cnd_wait calls, as in all other places in the Mesa source, since there is no chance that any of these functions returns an error code in the current implementation. This patch also removes a redundant _eglError call that follows the EGL_FALSE check at the bottom of dri2_client_wait_sync. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-23 12:00:45 +01:00
Dave Airlie	96ea753d9e	i965: report bound buffer size not underlying buffer size for image size (v2) This seems to make sense, the image is bound to a subset of the buffer so the image size should be from the bound size not the underlying object. This fixes: GL44-CTS.shader_image_size.advanced-nonMS-fs-int v2: get minimum of the two values, same as we write to the hw. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-23 13:39:15 +10:00
Jason Ekstrand	34ff4fbba6	anv: Throw INCOMPATIBLE_DRIVER for non-fatal initialization errors The only reason we should throw INITIALIZATION_FAILED is if we have found useable intel hardware but have failed to bring it up for some reason. Otherwise, we should just throw INCOMPATIBLE_DRIVER which will turn into successfully advertising 0 physical devices Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-22 18:49:49 -07:00
Dave Airlie	26187f3890	st/glsl_to_tgsi: fix st_src_reg_for_double constant. This needs to set the src swizzle so it doesn't access the .zw members ever when we are just emitting a 0 constant here. This fixes: vert-conversion-explicit-dvec3-bvec3.shader_test and a bunch of other fp64 tests on softpipe and radeonsi. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-23 11:14:03 +10:00
Dave Airlie	0bce055d9e	mesa/subroutines: drop the old subroutine index uploads. We used to upload the indices when they changed, now we rely on the drivers calling the correct hook to have the values updated from the context storage. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:46 +10:00
Dave Airlie	6a332a389a	st/mesa: use the new subroutine index upload API. This plugs the new API into the gallium state tracker. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	4adad99cfb	i965: use new subroutine index uploader. This plugs the subroutine index updates into the i965 backend, where it loads constants. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	ea783667e4	mesa: add api to write subroutine indices to the program storage. This writes the subroutine indices to the program storage for a stage. This API is intended to be used by drivers to update the uniform storage before uploading to the hw. This isn't the most thread safe effort, but it will be significantly more multi-context safe. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Dave Airlie	4566aaaa5b	mesa/subroutines: start adding per-context subroutine index support (v1.1) One piece of ARB_shader_subroutine I ignored was the fact that it needs to store the subroutine index data per context and not per shader program. There is one CTS test that tests this: GL45-CTS.shader_subroutine.multiple_contexts However the test only does a write to context and readback, it never renders using the values, so this is enough to fix the test however not enough to do what the spec says. So with this patch the info is now stored per context, but it gets updated into the program at UseProgram and when the values are inserted into the context, which won't help if multiple contexts are in use in multiple threads. v1.1: cleanups and nit-picks (Andres) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2016-08-23 11:03:45 +10:00
Matt Turner	27d20ee264	vbo: Make #if 0'd debugging code compile.	2016-08-22 16:31:50 -07:00
Timothy Arceri	8ee909ee42	nir: avoid segfault when ssa src not found Without this the following line will segfault and we don't get to see the results of the validate_assert() above. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-08-23 09:06:29 +10:00
Eric Anholt	47e3cc7557	vc4: Tell state_tracker that we would prefer NIR. Before this series, the code generation path was: GLSL IR -> TGSI -> NIR -> NIR clone -> QIR -> QPU Now it's (generally) GLSL IR -> NIR -> NIR clone -> QIR -> QPU	2016-08-22 12:11:08 -07:00
Eric Anholt	d08f09c24e	st/nir: Trim out unused VS input variables. If we're going to skip setting up vertex input data in them, we should probably not leave them as vertex inputs with a driver_location that happens to alias to something else. Fixes a regression in glsl-mat-attribute on vc4 when enabling GTN. v2: Change commit message shortlog, lower the new globals away before handing off to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-22 12:11:05 -07:00
Eric Anholt	3ef1853f7d	nir: Fix crash in nir_lower_drawpixels. Generally you'd see the gl_Color reference first and get some cursor set. However, in piglit draw-pixel-with-texture we're now seeing the TexCoord dereferenced first. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-22 11:52:27 -07:00
Eric Anholt	0a8ff1681b	nir: Fix a comment typo in nir_lower_drawpixels. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-22 11:52:26 -07:00
Eric Anholt	f4d143f0d9	vc4: Use proper type sizes for uniforms.	2016-08-22 11:52:26 -07:00
Eric Anholt	bdb54cdc16	vc4: Add VARYING_SLOT_PNTC support. We end up with this when doing GLSL-to-NIR.	2016-08-22 11:52:26 -07:00
Eric Anholt	3c1ea6e651	vc4: Fix vc4_nir_lower_io for non-vec4 I/O. To support GLSL-to-NIR, we need to be able to support actual float/vec2/vec3 varyings.	2016-08-22 11:52:26 -07:00
Eric Anholt	e8378fee0c	nir: Define system values for vc4's blending-lowering arguments. In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I was calling the "state uniforms" that I was putting into the NIR fighting with its other lowering passes. Instead of using magic uniform base numbers in the backend, follow the lead of load_user_clip_plane and just define system values for them. v2: Fix unintended change to channel_num, drop unspecified const_index value on blend_const_color_r_float. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-22 11:52:26 -07:00
Lionel Landwerlin	475ce61d1a	anv: GetDeviceImageFormatProperties: fix TRANSFER formats We let the user believe we support some transfer formats which we don't. This can lead to crashes when actually trying to use those formats, for example on dEQP-VK.api.copy_and_blit.image_to_image.* tests. Allow all formats we can render to or sample from, as meta implements transfers using attachments. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-22 10:41:30 -07:00
Marek Olšák	0328b20050	gallium/hud: round max_value to print nicely rounded numbers next to graphs This improves readability a lot. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	0f1befe926	gallium/hud: generalize code for drawing numbers next to graphs Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	a33eb48d61	gallium/hud: draw numbers with 3 decimal places if those aren't 0 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	b9c9551c09	gallium/hud: use sRGB for nicer AA lines Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6ffde82083	gallium/hud: use AA lines for graphs this looks a lot better (with the next patch) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Marek Olšák	6902f9e82a	gallium/hud: don't enable blending for all objects Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-22 16:01:35 +02:00
Tapani Pälli	0abebec012	util: add assert that key cannot be NULL on insertion Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-22 07:37:55 +03:00
Tapani Pälli	68233801ae	glsl: fix key used for hashing switch statement cases Implementation previously used value itself as the key, however after hash implementation change by `ee02a5e` we cannot use 0 as key. v2: use constant pointer as the key and implement comparison for contents (Eric Anholt) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309	2016-08-22 07:36:33 +03:00
Mauro Rossi	a5f445640e	android: i965: add per-gen libmesa_i965_gen{8,9} static Needed to fix android build after commit `16a9fcb` which enabled genxml for gen{8,9} state setup This is the last patch needed, android build tested successfully.	2016-08-20 16:18:31 -07:00
Mauro Rossi	9dc70a71f8	android: i965: add per-gen libmesa_i965_gen{7,75} static libraries Needed to fix android build after commit `e198983` which enabled genxml for gen{7,75} state setup Android build fix for gen{8,9} will follow as incremental patch, build tested successfully with all per-gen patches applied.	2016-08-20 16:18:28 -07:00
Mauro Rossi	7478ddad29	android: i965: add per-gen libmesa_i965_gen6 static library Needed to fix android build after commit `c8bc1ae` where new per-gen genX_blorp.c source replaced gen6_blorp.c for gen6 Android build fixes for gen{7,75} and gen{8,9} will follow as incremental patches, build tested successfully with all per-gen patches applied.	2016-08-20 16:18:26 -07:00
Kenneth Graunke	7db81d9a87	glsl: Rename link_fs_input_layout_qualifiers to "inout". We're going to handle output qualifiers here too, and calling it "inout" seems to be the going convention. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-20 13:52:25 -07:00
Matt Turner	7e3e1bed03	i965/cfg: Factor common code out of switch statement. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-20 11:40:42 -07:00
Jason Ekstrand	a2ae67aa47	anv: Give the installed intel_icd.json file an absolute path Not providing a path allows the ICD to work on multi-arch systems but breaks it if you install anywhere other than /usr/lib. Given that users may be installing locally in .local or similar, we probably do want to provide a filename. Distros can carry a revert of this commit if they want an intel_icd.json file without the path. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad@kiwitree.net>	2016-08-20 00:50:03 -07:00
Daniel Scharrer	16ef7ab5c1	mesa: Fix fixed function spot lighting on newer hardware (again) This was first fixed in commit `b3f9c5c` and then broken again in commit `fe2d2c7`, which removed the abs modifier from input registers. v2: Don't change the size of struct ureg. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Daniel Scharrer <daniel@constexpr.org>	2016-08-19 20:46:53 -07:00
Matt Turner	a9033d1dc1	i965: Remove comment within a comment.	2016-08-19 20:44:37 -07:00
Roland Scheidegger	0849621891	llvmpipe: fix issues with depth clamp We only did depth clamp when the value was written from the fs. This is very wrong both for d3d10 and GL, and only passed the corresponding piglit test due to pure luck (it no longer does with the enhanced test). Also, interpolation clamped values to 1.0 always, which can legitimately happen if depth clip is disabled, so fix that as well (untested). There is one unresolved issue left, d3d10 always does depth clamping, whereas GL does not (but does [0,1] clamp instead for fs depth outputs) - this information isn't in any gallium state object, leave it as-is for now (though it looks like llvmpipe misses the [0,1] clamp as well). This (with the previous patch) fixes piglit depth-clamp-range test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-20 04:05:33 +02:00
Roland Scheidegger	b0a647f284	llvmpipe: fix depth clamping wrt reversed near/far values This wasn't handled before (the result was that no matter what value got clamped, it always ended up as the near value in this case) (if clamping actually happened). Fix this by using the util helper for that (the math is otherwise "mostly" the same, mostly because there could actually be differences due to float rounding, but I don't even know which one would be more correct). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-08-20 04:05:33 +02:00
Matt Turner	a73116ecc6	i965/sched: Simplify work done by add_barrier_deps(). Scheduling barriers are implemented by placing a dependence on every node before and after the barrier. This is unnecessary as we can limit the number of nodes we place dependencies on to those between us and the next barrier in each direction. Runtime of dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 is reduced from ~25 minutes to a little more than three. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94681 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Matt Turner	e7c376adfd	i965/vec4: Ignore swizzle of VGRF for use by var_range_end(). var_range_end(v, n) loops over the n components of variable number v and finds the maximum value, giving the last use of any component of v. Therefore it expects v to correspond to the variable associated with the .x channel of the VGRF. var_from_reg() however returns the variable for the first channel of the VGRF, post-swizzle. So, if the last register had a swizzle with y, z, or w in the swizzle component, we would read out of bounds. For any other register, we would read liveness information from the next register. The fix is to convert the src_reg to a dst_reg in order to call the dst_reg version of var_from_reg() that doesn't consider the swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Matt Turner	3ef31122d0	i965/vec4: Print spills:fills. Allows shader-db to work on vec4 programs (has been broken since shader-db commit 646df5ca98b2 from April!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 16:52:25 -07:00
Ilia Mirkin	89f00f749f	a4xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Ilia Mirkin	cd8e30452f	a4xx: only disable depth clipping, not all clipping, when requested The previous bit disables the whole clipper, including the regular viewport-related clipping that would go on. The two new bits disable near and far clipping (separately, as verified with the depth-clamp-range piglit). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Eric Anholt	5adee83806	vc4: Switch store_output to using nir_lower_io_to_scalar / component.	2016-08-19 13:11:36 -07:00
Eric Anholt	f8fecc396a	vc4: Use the intrinsic's first_component for vattr VPM index. Avoids another multiplication by 4 of the base in the NIR.	2016-08-19 13:11:36 -07:00
Eric Anholt	cbf8c19410	vc4: Convert to using nir_lower_io_scalar for FS inputs. The scalarizing of FS inputs can be done in a non-driver-dependent manner, so extract it out of the driver.	2016-08-19 13:11:36 -07:00
Eric Anholt	c30b22c421	vc4: Switch to using the intrinsic accessors. The const_index[] values have always felt magic, and this documents them a bit better.	2016-08-19 13:11:36 -07:00
Eric Anholt	9f1411d1ec	nir: Add an IO scalarizing pass using the intrinsic's first_component. vc4 wants to have per-scalar IO load/stores so that dead code elimination can happen on a more granular basis, which it has been doing in the backend using a multiplication by 4 of the intrinsic's driver_location. We can represent it properly in the NIR using the first_component field, though. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c35f979220	nir: Add nir_builder support for individual system value loads. The previous nir_load_system_value(b, nir_intrinsic_load_whatever, 0) was rather verbose, when system values should be easy to generate. The index is left out because only one system value had an index included in it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	24728637e2	nir: Move the undef of nir_intrinsics.h macros to the .h. I wanted to include this from nir_builder as well, so it also needed the undefs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c078c41520	ttn: Use nir_load_front_face instead of the TGSI-style input. This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR more optimization to work on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	3f607f9e4f	nir: Use the system-value front face for twoside lowering. GLSL-to-NIR generates system value usage, and vc4/freedreno would both like the system value instead of the varying, so switch this pass over to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	ed92241d78	ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn. This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all their source paths. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-19 13:11:36 -07:00
Eric Anholt	d80d03b830	vc4: Dump the TGSI before trying to convert it to NIR. In the case of debugging a crash in TTN, this is nice to have.	2016-08-19 13:11:36 -07:00
Boyuan Zhang	c0be51f270	radeon/vce: set flag based on dual instance enablement Set the flag on when dual instance encoding is supported, otherwise set it to off. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-19 10:36:44 -04:00
Boyuan Zhang	c59628d11b	st/va: enable dual instances encode by sync surface This patch improves the performance of Vaapi Encode by enabling dual instances encoding. flush function is not called after each end_frame call. radeon/vce will do flush whenever 2 frames are submitted for encoding. Implement sync surface function to flush only if the frame hasn't been flushed yet. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-19 10:36:44 -04:00
Jason Ekstrand	93d2b5c576	i965/blorp: Remove no longer used state setup helpers Now that we're using genxml for everything, we no longer need the hand-rolled state emit helpers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	16a9fcbbb6	i965/blorp: Use genxml for gen8-9 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e198983c61	i965/blorp: Use genxml for gen7 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	344841fcba	i965/blorp: Add genxml-based vertex setup helpers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	7b035fd0c9	i965/blorp: Add a helper for emitting surface states The new helper emits surface states and the binding table in one go. It's nice to have it pulled out of the main blorp_exec function. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	48f13545dd	i965/blorp: Add genxml-based sampler state emit function Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb655c4fc2	i965/blorp: Add genxml-based dynamic state emit functions Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	c8bc1ae96a	i965: Move gen6_blorp.c to a file that gets recompiled per-gen At the moment, it's only used for gen6 but that will change soon. We use the genX prefix for recompiled things in the Vulkan driver. It isn't great, but it seems to have worked ok. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eea6a66222	i965/blorp/gen6: Use genxml packing structs for state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	b5c20a98c1	i965/blorp: Stop setting point and line rasterization rules Blorp never uses points or lines and the default values of 0 are perfectly fine. Explicitly setting them is just noise. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	5e2dd7a381	i965/blorp/gen8: Move viewport setup to after wm state This matches gen6 and gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	802f0f8596	i965/blorp/gen6-7: Move multisample setup to right after samplers This mimics gen8 blorp Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	75304fdbd8	i965/blorp/gen6-7: Move surfaces and samplers closer together This mimics what we do on gen8. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8b0426ddd4	i965/blorp/gen7-8: Emit depth stencil state with CC and BLEND All three go together on SNB so let's keep them together for gen7+ as well. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	38c1909c0a	i965/blorp/gen6: Move constant disables higher up This is what gen7-8 do and it's a bit cleaner. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e0bc2cb145	i965/blorp: Don't clear an empty region Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e4d6ffbbf6	i965/blorp: Move the non-static blorp state setup helpers to another file We're about to start replacing blorp state setup code with packing structs and we want to feel free to delete files as we go. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	50768a3879	i965/blorp: Make gen6 VS and GS disable helpers static Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	949a892026	i965: Roll intel_reg.h into brw_defines.h More than half of the stuff in intel_reg.h had nothing whatsoever to do with registers and really belongs in brw_defines.h anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8455f9430f	i965: Stop including brw_defines.h in brw_state.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	4c3acf94da	i965/state: Move is_drawing_lines/points to gen6_clip_state.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	04f3594cd5	genxml/gen9: Make 3DSTATE_SBE::AttributeActiveComponentFormat an array Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	bfdff28d68	genxml: Add a uint MOCS field to VERTEX_BUFFER_STATE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	373613fa4b	genxml: Make a couple of VERTEX_BUFFER_STATE fields boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	29f1f945a6	genxml: Make VERTEX_ELEMENT_STATE::Valid a bool Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb2589cba6	genxml/gen6: Make SAMPLER_STATE look a bit more like gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	2a84e40dae	genxml: Add a uint MOCS field to DEPTH_BUFFER packets This is easier than dealing with structs all the time Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3f1022b029	genxml/gen6: Make "Depth Clear Value" a uint The actual data stored is in float, UNORM24, or UNORM16 depending on the actual depth format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be62e7645e	genxml/gen6: Add the 3D_Prim_Topo_Type enum Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	cca95a7bd6	genxml/gen6: Fix the length of 3DSTATE_WM Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3ddb6f6e2a	genxml/gen6: Add a Surface Base Address field to HIER_DEPTH_BUFFER Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be52e16dbc	genxml/gen6: Add uint MOCS fields for most things Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Kenneth Graunke	7d0554f341	nir: Rely on the fact that bcsel takes a well formed boolean. According to Connor, it's safe to assume that the first operand of bcsel, as well as the operand of b2f and b2i, must be well formed booleans. https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html With the previous improvements to a@bool handling, this now has no change in shader-db instruction counts on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-19 02:05:23 -07:00
Francisco Jerez	7ceb42ccc5	i965/sched: Change the scheduling heuristics to favor early program termination. This uses the unblocked time of the exit assigned to each available node to attempt to unblock exit nodes as early as possible, potentially reducing the runtime of the shader when an exit branch is taken. There is a natural trade-off between terminating the program as early as possible and reducing the worst-case latency of the program as a whole (since this will typically move exit-unblocking nodes closer to its dependencies potentially causing additional stalls of the execution pipeline), but in practice the bandwidth and ALU cycle savings from terminating the program earlier tend to outweigh the slight increase in worst-case program execution latency, so it makes sense to prefer nodes likely to unblock an earlier exit regardless of the latency benefits of other available nodes. I haven't observed any benchmark regressions from this change after testing on VLV, HSW, BDW, BSW and SKL. The FPS of the GfxBench Manhattan benchmark increases by 10%-20% and the FPS of Unigine Valley improves by roughly 5% depending on the platform and settings. The change to the register pressure-sensitive heuristic is rather conservative and gives precedence to the existing heuristic in order to avoid increasing register pressure and causing spill count and SIMD width regressions in shader-db. It may make sense to revisit this with additional performance data. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	4147ca75d5	i965/sched: Assign a preferred exit node to each node of the dependency graph. This adds a bit of metadata to schedule_node that will be used to compare available nodes in the scheduling heuristic code based on which of them unblocks the earliest successor exit node. Note that assigning exit nodes wouldn't be necessary in a bottom-up scheduler because we could achieve the same effect by scheduling the exit nodes themselves appropriately. No shader-db changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	b295d7ca32	i965/sched: Calculate the critical path of scheduling nodes non-recursively. The critical path of each node is calculated by induction based on the critical paths of its children, which can be done in a post-order depth-first traversal of the dependency graph. The current code implements graph traversal by iterating over all nodes of the graph and then recursing into its children -- But it turns out that recursion is unnecessary because the lexical order of instructions in the block is already a good enough reverse post-order of the dependency graph (if it weren't a reverse post-order some instruction would have been located before one of its dependencies in the original ordering of the basic block, which is impossible), so we just need to walk the instruction list in reverse to achieve the same result more efficiently. No shader-db changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	b2b621a0ec	i965/fs: Switch to per-subspan discard jumps. ANY4H is more efficient than ANY8H and ANY16H because it makes sure that whenever a whole subspan hits a discard statement it gets disabled by the EU until the end of the program, regardless of whether the discard condition is uniform across all channels of the SIMD8-16 thread. OTOH ANY8H/ANY16H would cause the rest of the program to be executed for all channels if only one of the channels hadn't taken the discard branch, potentially increasing the bandwidth and ALU usage of the program unnecessarily. This change increases the FPS by over 3x of a simple micro-benchmark that discards a bunch of fragments and then does a single costly texturing operation. I've just re-verified the FPS change on HSW and SKL, but I expect all platforms from Gen6 up to get a similar benefit. Note that we could potentially be more aggressive and use the NORMAL predicate to discard individual channels, but that would need to happen post-scheduling because the scheduler currently doesn't care to reorder HALT instructions with respect to other instructions, and the NORMAL predicate would cause the results of subsequent derivative computations to become undefined -- If the scheduler didn't reorder HALT instructions it would actually be safe to switch to NORMAL because the behavior of derivative computations after a non-uniform discard statement is undefined by the GLSL spec, but that would make the optimization implemented by one of the following commits somewhat more difficult. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	01b321f242	i965/fs: Drop bogus writemasking disable bit from HALT instructions. This may have been the reason people ran into problems with non-uniform HALT instructions and ended up using the inefficient ANY16H/ANY8H predicates instead of ANY4H or NORMAL in order to prevent non-uniform discard. The HALT instruction is able to handle non-uniform execution masks just fine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:04:59 -07:00
Ilia Mirkin	27e59ed477	mesa: avoid valgrind warning due to opaque only being set sometimes Valgrind complains with a "Conditional jump or move depends on uninitialised value(s)" warning due to opaque being conditionally initialized. However in the punchthrough_alpha == true case, it is always initialized, so just flip the condition around to silence the warning. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-08-18 22:48:55 -04:00
Ilia Mirkin	59bb821180	vbo: remove unnecessary max_basevertex computation The max basevertex is already computed and added into max_index by the caller, _tnl_draw_prims. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-18 20:26:34 -04:00
Ilia Mirkin	659dc10d32	vbo: add basevertex when looking up elements for vbo splitting Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97351 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-18 20:26:22 -04:00
Marek Olšák	07ccec002b	radeonsi: initialize and finalize the LLVM function pass manager Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-08-18 21:36:03 +02:00
Emil Velikov	d61d259518	isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 15:06:19 +01:00
Emil Velikov	ebd5dc8826	anv: remove dummy VK_DEBUG_MARKER_EXT entry points The vkCmdDbgMarker{Begin,End} symbols are exported, yet the json does not advertise that the driver supports the extension. Furthermore the functions are empty stubs. Remove those until we get a proper implementation and json notation. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 15:05:32 +01:00
Emil Velikov	49394e8d77	anv: do not export the Vulkan API With version 1 of the Loader interface there is an internal/private symbol (vk_icdGetInstanceProcAddr) which is used to retrieve all the API from the Vulkan entrypoints from the ICD. Implying that exposing the Vulkan API is not recommended. Version 2 goes a step further explicitly forbidding the ICD from exposing Vulkan symbols (and adding a negotiation API) As a reference: - Nvidia 367.35 Missing negotiation API - version 1. Exposes only vk_icdGetInstanceProcAddr. - AMD 16.30.3.306809 Have negotiation API - version 2, Exposes vk_icdGetInstanceProcAddr. Exposes a couple of Vulkan entry points - seems to be in violation with the spec. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 14:55:42 +01:00
Emil Velikov	1cdb6ca40b	anv: automake: build with -Bsymbolic Explicitly suggested in the Loader interface version 2 section, but it's a good idea either way. It essentially ensures that our symbols are not interposed. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:33 +01:00
Emil Velikov	40e4fff563	anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility Hide the internal symbols and annotate the vk_icdGetInstanceProcAddr as public since the loader needs it (since v1 of the loader interface). v2: Add VISIBILITY_CFLAGS to AM_CFLAGS (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:30 +01:00
Emil Velikov	b0d56f2f4f	anv: remove internal 'validate' layer Presently the layer has only a single entry point. As mentioned by Jason the function does not validate anything that isn't checked elsewhere, thus we can drop the whole thing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-18 14:53:24 +01:00
Kenneth Graunke	3a9e6102b4	nir/search: Extend 'a@bool' to handle a couple of system values. load_front_face and load_helper_invocation produce booleans. On Broadwell: total instructions in shared programs: 11638956 -> 11638011 (-0.01%) instructions in affected programs: 115093 -> 114148 (-0.82%) helped: 628 HURT: 14 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:27:27 -07:00
Kenneth Graunke	e8543feba7	nir/search: Fold src_is_bool()/alu_instr_is_bool() into src_is_type(). I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave differently. Having the logic spread out over three functions makes it harder to decide where to put new logic, as well. So, combine them all. It's a bit simpler because there's now only one recursive function rather than a pair of mutually recursive functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:27:15 -07:00
Kenneth Graunke	241870fe5b	nir/search: Introduce a src_is_type() helper for 'a@type' handling. Currently, 'a@type' can only match if 'a' is produced by an ALU instruction. This is rather limited - there are other cases we can easily detect which we should handle. Extending the code in-place would be fairly messy, so we introduce a new src_is_type() helper. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 01:26:47 -07:00
Kenneth Graunke	d14dd727f4	i965: Fix barrier count shift in scalar TCS backend. The "Barrier Count" field goes in 14:9 of m0.2. The vec4 backend correctly shifts by 9, but the scalar backend only shifted by 8. It's not like this changed - I think I just made a typo when writing the original scalar TCS backend code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:47:00 -07:00
Kenneth Graunke	159f037755	i965: Fix execution size of scalar TCS barrier setup code. Previously, the scalar TCS backend was generating: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q }; shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x0000000bUD { align1 WE_all 1Q }; or(8) g17.2<1>UD g17.2<8,8,1>UD 0x00008200UD { align1 WE_all 1Q }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; This is rubbish - g17.2<8,8,1>UD spans two registers, and is an illegal region. Not to mention it clobbers 8 channels of data when we only wanted to touch m0.2. Instead, we want: mov(8) g17<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; and(1) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all }; shl(1) g17.2<1>UD g17.2<0,1,0>UD 0x0000000bUD { align1 WE_all }; or(1) g17.2<1>UD g17.2<0,1,0>UD 0x00008200UD { align1 WE_all }; send(8) null<1>UW g17<8,8,1>UD gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q }; Using component() accomplishes this. Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls on Skylake. Probably fixes other barrier issues on Gen8+. v2: Use a group(1, 0) builder so inst->exec_size is set correctly (thanks to Francisco Jerez for catching that it was incorrect). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-08-18 00:47:00 -07:00
Kenneth Graunke	9e778837ff	i965: Implement the WaPreventHSTessLevelsInterference workaround. Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases: - vertex_spacing - tessellation_shader_point_mode.points_verification - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:46:55 -07:00
Kenneth Graunke	d8971128ac	nir/builder: Add bany_inequal and bany helpers. The first simply picks the bany_inequal[234] opcodes based on the SSA def's number of components. The latter implicitly compares with zero to achieve the same semantics of GLSL's any(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-18 00:46:04 -07:00
Kenneth Graunke	01e99cba04	mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case. GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers: 0.0, if E == 0 and M == 0, 2^-14 * (M / 32), if E == 0 and M != 0, 2^(E-15) * (1 + M/32), if 0 < E < 31, INF, if E == 31 and M == 0, or NaN, if E == 31 and M != 0, In the second case (E == 0 and M != 0), we were multiplying the mantissa by 2^-20, when we should have been multiplying by 2^-19 (which is 2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32). The previous section defines the formula for 11-bit numbers, which is: 2^-14 * (M / 64), if E == 0 and M != 0, In other words, we had accidentally copy and pasted the 11-bit code to the 10-bit case, and neglected to change the exponent. Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles when run with surface dimensions of 1536x1152 or 1920x1080. Cc: mesa-stable@lists.freedesktop.org References: https://code.google.com/p/chrome-os-partner/issues/detail?id=56244 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2016-08-17 17:26:11 -07:00
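The formulas quoted in 01e99cba04 can be checked with a small decoder. This is a Python sketch, not Mesa's C implementation: the helper name _decode_packed_float is hypothetical, and the result is a Python float (a double) rather than a 32-bit float, which is harmless here because every decoded value is exactly representable.

```python
import math


def _decode_packed_float(val, mantissa_bits):
    """Decode an unsigned packed float with a 5-bit exponent (bias 15)."""
    exponent = (val >> mantissa_bits) & 0x1F
    mantissa = val & ((1 << mantissa_bits) - 1)
    scale = 1 << mantissa_bits  # 32 for 10-bit, 64 for 11-bit
    if exponent == 0:
        # Zero or denormal: 2^-14 * (M / scale)
        return math.ldexp(mantissa / scale, -14)
    if exponent == 31:
        return math.inf if mantissa == 0 else math.nan
    # Normal number: 2^(E-15) * (1 + M / scale)
    return math.ldexp(1.0 + mantissa / scale, exponent - 15)


def uf10_to_f32(val):
    """10-bit format: 5-bit exponent, 5-bit mantissa."""
    return _decode_packed_float(val, 5)


def uf11_to_f32(val):
    """11-bit format: 5-bit exponent, 6-bit mantissa."""
    return _decode_packed_float(val, 6)
```

With E == 0 and M == 1 the 10-bit decoder yields 2^-19 while the 11-bit decoder yields 2^-20, which is the factor-of-two difference the commit corrects.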
Tim Rowley	0ff57446e3	swr: [rasterizer core] only use Viewport/Scissors during SwrDraw* operations Add explicit rects for: - SwrClearRenderTarget - SwrDiscardRect - SwrInvalidateTiles - SwrStoreTiles Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	6209dbf5a4	swr: [rasterizer common] reorder SWR_FORMAT_INFO Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	2a25ce7472	swr: [rasterizer core] make dirtytile list point directly to macrotilequeues Speeds up high geometry HPC workloads. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	550503e776	swr: [rasterizer core] portability - remove use of INT64 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	d70f96fd67	swr: [rasterizer core] viewport transform disabled fix When viewport transform is disabled (ie. screen space coords are passed in directly), the W component should be interpreted as RHW. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	812b45d049	swr: [rasterizer core] clamp scissor rects to current tile rect Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	93fb768c7e	swr: [rasterizer core] align stats structures Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	9a25987b4a	swr: [rasterizer core] use AVX2 permute to simplify PaTriList Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	c7c1a03f90	swr: [rasterizer core] move some global variables to SWR_CONTEXT Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:55 -05:00
Tim Rowley	b8c4717567	swr: [rasterizer core] change scale on VP matrix element gathers Was 1, which led to pulling denorms for non-zero indices. Changed to sizeof(float). Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:54 -05:00
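Why a gather scale of 1 pulls denormals can be seen with a byte-level model. This is a Python sketch under assumptions: gather_float and the three-element buffer are hypothetical stand-ins for the viewport matrix gather, and the buffer is laid out as little-endian 32-bit floats.

```python
import struct


def gather_float(buf, index, scale):
    """Read a little-endian float32 at byte offset index * scale."""
    return struct.unpack_from("<f", buf, index * scale)[0]


# Hypothetical matrix elements, packed as consecutive float32 values.
matrix = struct.pack("<3f", 1.0, 2.0, 3.0)

# scale == sizeof(float): index 1 lands on the second element.
correct = gather_float(matrix, 1, 4)

# scale == 1: index 1 lands one byte into the first element, so the
# read straddles two floats and decodes to a tiny denormal value.
misaligned = gather_float(matrix, 1, 1)
```

Index 0 reads the same bytes under either scale, which is consistent with the commit's note that only non-zero indices were affected.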
Tim Rowley	d816c5d6ad	swr: [rasterizer] implementing native AVX-512 simd16 intrinsics Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-17 17:08:49 -05:00
Jason Ekstrand	342756a100	i965/blorp: Use nir_alu_type for the texture data type This lets us remove the brw_reg.h include Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ce2a9831cc	i965: brw_blorp_blit.cpp -> blorp_blit.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	934adf1c30	i965: brw_blorp_clear.cpp -> blorp_clear.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f5fbcc3683	i965: Split brw_blorp.c/h into multiple files This mega-commit pulls most of the i965-specific bits of blorp into the brw_blorp.c/h files which now contain nothing but i965 wrappers around "core blorp" calls. The "core blorp" api is moved into blorp.h and the internal blorp data structures are moved into blorp_priv.h. The new file blorp.c is created to house "core blorp" internals which are pulled from the old brw_blorp.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	075cc874bb	i965/blorp: Factor the guts of blorp_hiz_exec into a helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	9d22fd934a	i965/blorp: Break the guts of do_single_blorp_clear into two helpers The helpers are completely miptree-unaware and each fairly cleanly do a single thing. This does come at the downside of not doing proper debug reporting on whether or not we're doing replicated clears. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7cddca39c0	i965/meta_util: Convert get_fast_clear_rect to take an isl_surf Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	376ce1d26e	i965/blorp/clear: Move isl_surf setup higher in the function Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	583f040fda	i965/blorp: Refactor fast-clear logic a bit This pulls the mcs allocation into the if statement where we initially determine that we are doing a fast clear and moves the programming of wm_inputs and figuring out the fast clear rect into its own if statement. The next commit will put code in between the two. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	457a408932	i965/blorp/clear: Stop stomping the destination format The blorp_surface_info_init call above should set the format for us and stomping it later does nothing whatsoever. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	a6c2091da6	i965/meta_util: Only modify the input parameters in get_fast_clear_rect We had another inline copy of brw_meta_get_buffer_rect embedded in get_fast_clear_rect for no good reason. This lets us get rid of the gl_framebuffer parameter to get_fast_clear_rect. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f748e15735	i965/blorp: Stop calling brw_meta_get_buffer_rect We already have an inlined version of the function slightly higher up in do_single_blorp_clear and all calling it does is stomp the values with the same thing. We might as well just get rid of it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	18aad17ce2	i965/blorp: Pull the guts of resolve_color into a miptree-agnostic helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dff74b83e1	i965/meta_util: Convert get_resolve_rect to use ISL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	8fccdf85ba	i965/blorp: Make the guts of brw_blorp_blit_miptrees miptree-unaware Now that we have the brw_blorp_surf struct, we can start to make bits of blorp completely miptree-unaware. To start things off, we split the guts of brw_blorp_blit_miptrees into a brw_blorp_blit function which knows nothing about miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	75deae9c90	i965/blorp: Add a new brw_blorp_surf intermediate struct At the moment, this seems to make all of the interfaces messier rather than cleaner. However, it does provide a representation of a surface that simultaneously contains everything and is completely unaware of miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	57664c869f	i965/blorp: Use the isl_surf for more params setup The isl_surf munging doesn't happen until fairly late in the blorp_blit function. We can use the isl_surf for the vast majority if not all of our params setup. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d8644f3eb6	i965/blorp: Do gen6 stencil offsets up-front This keeps all of the nastiness of gen6 stencil on the i965 side of the API line and lets us delete that nasty hand-rolled ISL-based offset path that we were using for ALL_SLICES_AT_EACH_LOD. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	406c503396	i965/blorp: Set up HiZ surfaces up-front Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	4d86b3fa2d	i965/blorp: Set up most aux surfaces up-front This commit also adds support for an offset for aux surfaces. In GL, this only gets used for HiZ on SNB at the moment. However, in Vulkan, all aux surfaces are at a non-zero offset and that is likely to happen in GL eventually. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d540864730	i965/blorp: Stop using the miptree in state setup for tex/rt surfaces This commit moves us from a miptree model to a surf+bo+offset model. In the GL driver, miptrees are almost always at the start of the bo so the offset is zero but we don't want to always make that assumption. In the short term, gen6 stencil and HiZ will be at an offset but, in the long term, any Vulkan surface is liable to be at a non-zero offset. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	8b02cd44d7	i965/blorp/blit: Move format work-arounds before surface_info_init Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	20c06d2b79	i965/miptree: Add real support for HiZ The previous HiZ support was bogus because all of get_aux_isl_surf looked at mt->mcs_mt directly. For HiZ buffers, you need to look at either mt->hiz_buf or mt->hiz_buf->mt. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dc880c99b6	isl/state: Only set clear color if aux is used Otherwise, the clear color will get ignored. This prevents assertion errors if clear color is set to something invalid and aux is not used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	2684e48321	i965/miptree: Use the isl helpers for creating aux surfaces In order for the calculations of things such as fast clear rectangles to work, we need more details of the auxiliary surface to be correct. In particular, we need to be able to trust the width and height fields. (These are not necessarily what you want coming out of the miptree.) The only values state setup really cares about are the row and array pitch and those we can safely stomp from the miptree. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d9df82f2ff	isl: Add helpers for creating different types of aux surfaces Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c44d99653	i965/miptree: Use mcs_mt->qpitch for aux surfaces At one point, we were doing this correctly. It must have gotten lost in one of the many rebases. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	67ea60db0b	i965/miptree: Allow get_aux_isl_surf when there is no aux surface Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	dd46c8da31	i965/miptree: Support depth in get_isl_clear_color Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	6155d4ef56	isl/state: Add an assertion for IVB multisample array textures Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c75b315e1	isl: Add a #define for DEV_IS_BAYTRAIL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	56746d04d5	i965/blorp: Remove unused fields from blorp_surface_info The only reason why we need layer or level is that we need the z-offset for 3-D surfaces. Let's just have the one field for that. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1495b6315e	i965/blorp: Simplify depth buffer state setup a bit The data comes in via ISL in a format that's almost directly usable by the hardware so we can avoid some of the conversion headache. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	d814353365	i965/blorp: Use the generic surface state path for gen8 textures Now that the generic blorp path uses base level/layer, there's no need to make gen8 special. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ed432fd681	isl: Add asserts for gen8+ X/YOffset rules Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	96fa98c18e	i965/blorp: Only do offset hacks for fake W-tiling and IMS Since the dawn of time, blorp has used offsets directly to get at different mip levels and array slices of surfaces. This isn't really necessary since we can just use the base level/layer provided in the surface state. While it may have simplified blorp's original design, we haven't been using the blorp path for surface state on gen8 thanks to render compression and there's really no good need for it most of the time. This commit restricts such surface munging to the cases of fake W-tiling and fake interleaved multisampling. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	9f9abc8214	i965/blorp: Add a z_offset field to blorp_surface_info The layer field is in terms of physical layers which isn't quite what the sampler will want for 2-D MS array textures. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	a9a6df807e	i965/blorp: Pass the Z component into all texture operations Multisample array surfaces on IVB don't support the minimum array element surface attribute so it needs to come through the sampler message. We may as well just pass it through everything. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7abcdfbe13	i965/blorp: Rework hiz rect alignment calculations At the moment, the minify operation does nothing because params.depth.view.base_level is always zero. However, as soon as we start using actual base miplevels and array slices, we are going to need the minification. Also, we only need to align the surface dimensions in the case where we are operating on miplevel 0. Previously, it didn't matter because it aligned on miplevel 0 and, for all other miplevels, the miptree code guaranteed that the level was already aligned. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	871893cda2	i965/blorp: Map 1-D render targets with DIM_LAYOUT_GEN4_2D as 2D on gen9 The sampling hardware can handle them ok. It just looks at the tiling to determine whether it's the new gen9 1-D layout or the old one. The render hardware isn't so smart. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ecd9789368	i965/miptree: Fill out the isl_surf::usage field Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	560a92c4fd	isl: Take the slice0_extent shortcut for interleaved MSAA The shortcut works just fine for MSAA and the comment even says so. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1e02611276	isl: Remove duplicate px->sa conversions In all three cases, we start with width and height taken from isl_surf::phys_slice0_extent_sa which is already in samples. There is no need to do the conversion and doing so gives us an incorrect value. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	603d5f7638	i965/blorp: Use the isl_view from the blorp_surface_info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	c097160463	i965/blorp: Get rid of brw_blorp_surface_info::width/height Instead, we manually mutate the surface size as needed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	2095f932ef	i965/blorp: Move surface offset calculations into a helper The helper does a full transformation on the surface to turn it into a new 2-D single-layer single-level surface representing the original layer and level in memory. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	90ab43d1bb	i965/blorp: Use ISL to compute image offsets For the moment, we still call the old miptree function; we just assert that the two are equal. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ba88a9622d	isl: Add functions for computing surface offsets in samples Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	f6c75df083	isl: Fix get_image_offset_sa_gen4_2d for multisample surfaces The function takes a logical array layer but was assuming it was a physical array layer. While we're here, we also make it not assert-fail on gen9 3-D surfaces. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	7997f4f95b	i965/blorp: Add an isl_view to blorp_surface_info Eventually, this will be the actual view that gets passed into isl to create the surface state. For now, we just use it for the format and the swizzle. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	e046a46460	i965/blorp: Move intratile offset calculations out of surface state setup Previously we multiplied full x/y offsets, resolved tile aligned buffer offset and intra tile offset based on that. Now we let ISL take into account the msaa setting and we only multiply the resolved intra tile offsets. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	27a58615d3	i965/blorp: Refactor interleaved multisample destination handling We put all of the code for fake IMS together. This requires moving a bit of the program key setup code further down so that it gets the right values out of the final surface. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	3c25caa318	i965/blorp: Get rid of brw_blorp_surface_info::array_layout Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	09879eff30	i965/blorp: Use isl_msaa_layout instead of intel_msaa_layout We also remove brw_blorp_surface_info::msaa_layout. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	e2a1bdb3c5	i965/blorp: Use the ISL aux_layout for deciding whether to do an MCS fetch Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	28b0ad890c	i965/blorp: Get rid of brw_blorp_surface_info::num_samples Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	aa6c058ac4	i965/blorp: Make sample count asserts a bit more lazy Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	aa4117a9e4	i965/blorp: Get rid of brw_blorp_surface_info::map_stencil_as_y_tiled Now that we're carrying around the isl_surf, we can just modify it directly instead of passing an extra bit around. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	801189e199	i965/blorp: Remove compute_tile_offsets We have a handy little function in ISL that does exactly the same thing. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	b82de88008	i965/blorp: Create the isl_surf up-front Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	ffeb5f67ac	i965/blorp/clear: Initialize surface info after allocating an MCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	1666d029aa	isl/state: Use a valid alignment for 1-D textures The alignment we use doesn't matter (see the comment) but it should at least be an alignment we can represent with the enums. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	0aa0b39769	i965/miptree: Remove the stencil_as_y_tiled parameter from get_tile_masks It's only used to stomp the tiling to Y and it's only used by blorp so there's no reason why blorp can't do it itself. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Jason Ekstrand	573f6ffd04	isl: Fix the parameter names for get_intratile_offset It's been in elements for a while but, for whatever reason, the parameter names in the header file never got updated. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-17 14:46:22 -07:00
Brian Paul	5de29aeef0	util: try to use SSE instructions with MSVC and 32-bit gcc The lrint() and lrintf() functions are pretty slow and make some texture transfers very inefficient. This patch makes a better effort at using those intrinsics for 32-bit gcc and MSVC. Note, this patch doesn't address the use of SSE4.1 with MSVC. v2: get rid of the ROUND_WITH_SSE symbol, per Matt. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 12:53:20 -06:00
Brian Paul	18e6e0796a	svga: fix src/dst typo in can_blit_via_copy_region_vgpu10() The function was always returning false because of this typo. Retested with piglit. There's some sRGB-related blit failures, but that seems unrelated. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2016-08-17 12:53:20 -06:00
Brian Paul	55417140cd	svga: initialize a variable to silence a gcc warning Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-17 12:53:20 -06:00
Ian Romanick	607ab6d3bf	glsl: Pull enum ir_expression_operation out to its own file No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase... I guess a lot can change in a year. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Ian Romanick	de71bc9eb6	glsl: Make the generated sources build rules more like NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Francesco Ansanelli	120c9c6380	mesa/st: use llabs instead of abs for long args (v2) v2: long is 32 bits on Windows (Marek) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 14:16:29 +02:00
Marek Olšák	57a8991020	radeonsi: fix up buffer descriptor upper-bound checking st/mesa does this too, so we're safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	325379096f	gallium: change pipe_image_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	7cd256ce7e	gallium: change pipe_sampler_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
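The example in 325379096f and 7cd256ce7e (binding RGBA32F at a 4-byte offset) shows why element indices were not enough. The sketch below is Python for illustration only: both function names are hypothetical, and the element-to-byte conversion is the obvious one rather than anything taken from the gallium headers.

```python
RGBA32F_BYTES = 16  # four 32-bit float channels


def elements_to_range(first_element, last_element, elem_size):
    """Convert an inclusive element range to a byte (offset, size) pair."""
    offset = first_element * elem_size
    size = (last_element - first_element + 1) * elem_size
    return offset, size


def range_to_elements(offset, size, elem_size):
    """Convert a byte range back to element indices, or None if it
    cannot be expressed in whole elements."""
    if offset % elem_size != 0 or size % elem_size != 0 or size == 0:
        return None
    first_element = offset // elem_size
    last_element = first_element + size // elem_size - 1
    return first_element, last_element
```

Every element range maps to a byte range, but a byte offset of 4 with 16-byte elements has no element-index equivalent, so range_to_elements(4, 64, RGBA32F_BYTES) returns None.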
Marek Olšák	1ac23a9359	gallium/radeon: assign the highest priority to scratch; make rings second just FYI, the kernel receives priority/4 Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:29 +02:00
Marek Olšák	9009516501	gallium/winsys: re-number winsys priority flags free 60..63, move CP_DMA up Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	95020c6dfd	gallium/radeon: mark shader rings as highest-priority buffers and rename the enum Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e2bb24f213	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	a6b5845a0d	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental This is just a workaround. The problem is described in the code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541 v2: say that it's only between the current context and aux_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-17 12:24:35 +02:00
Marek Olšák	9812a50ae6	radeonsi: simplify CB_TARGET_MASK logic we can now rely on CB_COLORn_INFO to disable empty slots. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	2d2b384066	radeonsi: don't set CB_COLOR1_INFO for dual src blending Vulkan doesn't do this. The reason may be that CB_COLOR1_INFO.SOURCE_FORMAT from NI was moved to SPI_SHADER_COL_FORMAT for SI. I asked CB guys about this 2 days ago and they still haven't replied. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e722b90bc9	radeonsi: eliminate PS OUT[1] if dual src blending is off and CB1 is not bound All VP DX9 ports benefit from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	3de8ffe836	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Nicolai Hähnle	c5798d6314	gallium/radeon: use lp_build_alloca_undef Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:25 +02:00
Nicolai Hähnle	41001ca4bd	gallivm: add lp_build_alloca_undef Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	17e88e276c	gallivm: add create_builder_at_entry helper function Reduces code duplication. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	f4204ba53d	gallium/radeon: protect against out of bounds temporary array accesses They can lead to VM faults and worse, which goes against the GL robustness promises. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	ea283779be	gallium/radeon: add radeon_llvm_bound_index for bounds checking Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	8916d1e2fa	gallium/radeon: reduce alloca of temporaries based on usagemask v2: take actual writemasks into account Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	6bba956073	gallium/radeon: use tgsi_scan_arrays for temp arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	7c2295d7ef	gallium/radeon: allocate temps array info in radeon_llvm_context_init Also, prepare for using tgsi_array_info. This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	850c8dcc9c	gallium/radeon: always do the full store in store_value_to_array Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense. This also allows us to get rid of get_alloca_for_array. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	4b150931c9	gallium/radeon: extract common getelementptr logic into get_pointer_into_array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	dfbb8ea284	gallium/radeon: pass indirect register info into get_alloca_for_array To have the same signature as get_array_range. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	b76aabffa2	gallium/radeon: extract common lookup code into get_temp_array function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	fa84296a5a	gallium/radeon: clarify the comment on the array alloca heuristic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	92b66b38c9	gallium/radeon: more descriptive names for LLVM temporaries in debug builds Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eacfc86d83	gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	87fa7cea23	gallium/radeon: simplify radeon_llvm_emit_fetch for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eb50cbf3bd	gallium/radeon: clean up emit_declaration for temporaries In the alloca'd array case, no longer create redundant and unused allocas for the individual elements; create getelementptrs instead. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	cb9ed66cc5	st_glsl_to_tgsi: use calloc the way it's meant to be used Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	67c0f077a2	tgsi/scan: add tgsi_scan_arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:21 +02:00
Ian Romanick	2ec3a3e151	glsl: Add missing ir_quadop_vector constant evaluation for Boolean types Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cf58e3f522	glsl: Fix typo in ir_unop_f2u implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	8b123b08cb	glsl: Fix typo in ir_unop_b2i implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cd8764737e	glsl: Don't support integer types for operations that can't handle them ir_unop_fract already forbade integer types in ir_validate. ir_unop_rcp, ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	437e612bd7	glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cceb50e14e	nir/algebraic: Optimize common array indexing sequence Some shaders include code that looks like: uniform int i; uniform vec4 bones[...]; foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]); CSE would do some work on this: x = i * 3 foo(bones[x], bones[x + 1], bones[x + 2]); The compiler may then add '<< 4 + base' to the index calculations. This results in expressions like x = i * 3 foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]); Just rearranging the math to produce (i * 48) + 16 saves an instruction, and it allows CSE to do more work. x = i * 48; foo(bones[x], bones[x + 16], bones[x + 32]); So, ~6 instructions becomes ~3. Some individual shader-db results look pretty bad. However, I have a really, really hard time believing the change in estimated cycles in, for example, 3dmmes-taiji/51.shader_test after looking that change in the generated code. G45 total instructions in shared programs: 4020840 -> 4010070 (-0.27%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 98829000 -> 98784990 (-0.04%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Ironlake total instructions in shared programs: 6418887 -> 6408117 (-0.17%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 143504542 -> 143460532 (-0.03%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Sandy Bridge total instructions in shared programs: 8357887 -> 8339251 (-0.22%) instructions in affected programs: 432715 -> 414079 (-4.31%) helped: 2795 HURT: 0 total cycles in shared programs: 118284184 -> 118207412 (-0.06%) cycles in affected programs: 6114626 -> 6037854 (-1.26%) helped: 2478 HURT: 317 Ivy Bridge total instructions in shared programs: 7669390 -> 7653822 (-0.20%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68381982 
-> 68263684 (-0.17%) cycles in affected programs: 1972658 -> 1854360 (-6.00%) helped: 2458 HURT: 307 Haswell total instructions in shared programs: 7082636 -> 7067068 (-0.22%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68282020 -> 68164158 (-0.17%) cycles in affected programs: 1891820 -> 1773958 (-6.23%) helped: 2459 HURT: 261 Broadwell total instructions in shared programs: 9002466 -> 8985875 (-0.18%) instructions in affected programs: 658784 -> 642193 (-2.52%) helped: 2795 HURT: 5 total cycles in shared programs: 78503092 -> 78450404 (-0.07%) cycles in affected programs: 2873304 -> 2820616 (-1.83%) helped: 2275 HURT: 415 Skylake total instructions in shared programs: 9156978 -> 9140387 (-0.18%) instructions in affected programs: 682625 -> 666034 (-2.43%) helped: 2795 HURT: 5 total cycles in shared programs: 75591392 -> 75550574 (-0.05%) cycles in affected programs: 3192120 -> 3151302 (-1.28%) helped: 2271 HURT: 425 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-17 10:52:38 +01:00
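The index rearrangement in the nir/algebraic commit above relies on the identity `(i * 3 + k) << 4 == i * 48 + k * 16`. The following is a minimal C sketch of that arithmetic only, not the NIR algebraic pattern itself; the function names `offset_before` and `offset_after` are hypothetical.

```c
#include <stdint.h>

/* Offset as computed before the rewrite: the element index (i * 3 + k)
 * is scaled by 16 with a shift. Hypothetical helper for illustration. */
static uint32_t offset_before(uint32_t i, uint32_t k)
{
    return (i * 3u + k) << 4;
}

/* Offset after the rewrite: the multiply by 3 and the shift by 4 are
 * folded into a single multiply by 48, and k turns into a constant
 * addend of k * 16. Hypothetical helper for illustration. */
static uint32_t offset_after(uint32_t i, uint32_t k)
{
    return i * 48u + k * 16u;
}
```

With the folded form, `i * 48` is a single shared value, so the three accesses in the commit message only differ by the constants 0, 16 and 32.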
Michel Dänzer	4ac640e3d2	glx: Don't use current context in __glXSendError There's no guarantee that there is one, and we don't need one anyway. Fixes piglit tests: glx@glx-fbconfig-bad glx@glx_ext_import_context@import context, multi process glx@glx_ext_import_context@import context, single process Fixes: `2e3f067458` ("glx: fix error code when there is no context bound") Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-17 17:16:34 +09:00
Ilia Mirkin	e988999791	nv50/ir: fix bb positions after exit instructions It's fairly rare that the BB layout puts BBs after the exit block, which is likely the reason these issues lingered for so long. This fixes a fraction of issues with the giant pixmark piano shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-16 21:56:16 -04:00
Ilia Mirkin	0b5f40b881	nv50/ir: properly clear upper bits of a bitset fill Found by inspection. In practice, val is always == 0, so this never got triggered. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-16 21:56:16 -04:00
Francisco Jerez	4d436c011f	i965/fs: Estimate maximum sampler message execution size more accurately. The current logic used to determine the execution size of sampler messages was based on special-casing several argument and opcode combinations, which unsurprisingly missed the possibility that some messages could exceed the payload size limit or not depending on the number of coordinate components present. In particular: - The TXL, TXB and TEX messages (the latter on non-FS stages only) would attempt to use SIMD16 on Gen7+ hardware even if a shadow reference was present and the texture was a cubemap array, causing it to overflow the maximum supported sampler payload size and crash. - The TG4_OFFSET message with shadow comparison was falling back to SIMD8 regardless of the number of coordinate components, which is unnecessary when two coordinates or less are present. Both cases have been handled incorrectly ever since cubemap arrays and texture gather were respectively enabled (the current logic used by the SIMD lowering pass is almost unchanged from the previous no16 fall-back logic used pre-SIMD lowering times). Fixes the following GL4.5 conformance test on Gen7-8 (the bug also affects Gen9+ in principle, but SKL passes the test by luck because it manages to use the TXL_LZ message instead of TXL): GL45-CTS.texture_cube_map_array.sampling Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97267 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	61a02fb74c	i965/fs: Return zero from fs_inst::components_read for non-present sources. This makes it easier for the caller to find out how many scalar components are actually read by the instruction. As a bonus we no longer need to special-case BAD_FILE in the implementation of fs_inst::regs_read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	0c754d1c42	i965/fs: Lower TEX to TXL during NIR translation. This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Rob Clark	5def00875d	freedreno/a3xx: fix generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 19:26:03 -04:00
Brian Paul	df2dcf6200	st/mesa: use pipe var instead of st->pipe in st_create_context_priv() As is done in most other places in the function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	038b1b11fe	gallium: remove unused u_clear.h file Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	22b8288b33	gallium/i915: inline the util_clear() code into i915_clear_blitter() This is the only place the util_clear() function was used. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:32 -06:00
Brian Paul	66debeae9d	gallium/util: minor reformatting in u_box.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:32 -06:00
Brian Paul	b6c81a780f	svga: remove unused var in svga_mark_surfaces_dirty() Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:28:22 -06:00
Brian Paul	1e5eb79d9a	svga: avoid a calloc in svga_buffer_transfer_map() Just initialize the two other pipe_transfer fields explicitly. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:53 -06:00
Brian Paul	f934117bbb	svga: don't call os_get_time() when not needed by Gallium HUD The calls to os_get_time() were showing up higher than expected in profiles. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:53 -06:00
Brian Paul	dcf2126f90	svga: remove unneeded memset() call in draw_vgpu10() All three fields of the vbuffer_attrs[] array are assigned in the following loop. The remaining elements of the array are not used. Tested with full Piglit run, Heaven 4.0, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	ced0dd0e95	svga: reduce looping in svga_mark_surfaces_dirty() We don't need to loop over the max number of color buffers, just the current number (which is usually one). Tested with full Piglit run, Heaven 4.0, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	88efaf9878	svga: minor clean-ups in define_rasterizer_object() Add const qualifiers, new comment. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	ce9c05a593	svga: remove incorrect buffer invalidation code Fixes regression with team_fortress_2 trace. This change has been in our in-house tree for some time. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	06b23f747d	svga: additional comments for svga_hw_draw_state members And re-order a few fields. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	7c5eda6f4e	svga: use the sws local var to simplify some code Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Brian Paul	7b821941f6	svga: minor whitespace and code clean-ups Signed-off-by: Brian Paul <brianp@vmware.com>	2016-08-16 08:24:52 -06:00
Rob Clark	27f12dd8fd	freedreno/a4xx: use generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	f77e59e76c	freedreno/a3xx: use generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	a8e6734a83	freedreno: support for using generic clear path Since clears are more or less just normal draws, there isn't that much benefit in having hand-rolled clear path. Add support to use u_blitter instead if gen specific backend doesn't implement ctx->clear(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 09:21:13 -04:00
Rob Clark	142dd7b9c0	gallium/u_blitter: split out a helper for common clear state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	2b2f436c69	gallium/u_blitter: add helper to save FS const buffer state Not (currently) state that is overridden by u_blitter itself, but drivers with custom blit/clear which are reusing part of the u_blitter infrastructure will use it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Rob Clark	433e12fea8	gallium/u_blitter: export some functions Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 09:21:13 -04:00
Nicolas Boichat	78e3cea419	egl/dri2: dri2_make_current: Release previous context's display eglMakeCurrent can also be used to change the active display. In that case, we need to decrement ref_count of the previous display (possibly destroying it), and increment it on the next display. Also, old_dsurf/old_rsurf cannot be non-NULL if old_ctx is NULL, so we only need to test if old_ctx is non-NULL. v2: Save the old display before destroying the context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97214 Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Tested-by: Alexandr Zelinsky <mexahotabop@w1l.ru> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>	2016-08-16 17:30:35 +09:00
Nayan Deshmukh	09dff7ae2e	st/vdpau: change the order in which filters are applied (v3) Apply the median and matrix filter before the compositing. We apply the deinterlacing first to avoid the extra overhead in processing the past and the future surfaces in deinterlacing. v2: apply the filters on all the surfaces (Christian) v3: use get_sampler_view_planes() instead of get_sampler_view_components() and iterate over VL_MAX_SURFACES (Christian) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-16 10:07:35 +02:00
Kenneth Graunke	1f47f78fc3	glcpp: Update tests for new #undef of built-in macro rules. Ian recently changed the preprocessor to allow this in most GLSL versions, but not GLSL ES 3.00+. This patch converts the existing test that expects a failure to a #version 300 es shader, and adds a #version 110 shader to make sure that it's allowed. Fixes 'make check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2016-08-15 22:55:34 -07:00
Dave Airlie	c2f2252037	anv: fix writemask on blit fragment shader. I'm not sure if anything even uses this, but I found this on radv, so just fix it on anv for consistency. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-08-16 10:29:44 +10:00
Nicolas Boichat	c0580f6a38	egl/android: Set dpy->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	a9e8fb7397	egl/drm: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	0e67d86540	egl/surfaceless: Set disp->DriverData to NULL on error Avoid use-after-free on error. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
Nicolas Boichat	48fd952f28	egl/wayland: Set disp->DriverData to NULL on error Avoid use-after-free, fix spec@egl_khr_fence_sync@conformance. Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:30 +01:00
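The "Set DriverData to NULL on error" commits above all follow one pattern: when initialization fails after the private data has been attached to the display, the error path frees it and also clears the pointer, so later code cannot dereference freed memory. A minimal C sketch of that pattern follows; `struct display`, `driver_data` and `init_display` are hypothetical stand-ins, not the actual EGL types or functions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a display object that owns driver-private data. */
struct display {
    void *driver_data;
};

/* Hypothetical initializer. When `fail` is non-zero it simulates an error
 * after the private data was attached: the data is freed and the pointer
 * is reset to NULL so no dangling pointer is left behind. */
static int init_display(struct display *disp, int fail)
{
    void *priv = calloc(1, 64);
    if (!priv)
        return -1;

    disp->driver_data = priv;

    if (fail) {
        free(priv);
        disp->driver_data = NULL; /* avoid use-after-free by later readers */
        return -1;
    }
    return 0;
}
```

A caller that sees the failure can then test `driver_data` against NULL instead of reading through a stale pointer.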
Jan Ziak	769ac1ec78	egl/x11: avoid using freed memory if dri2 init fails Found with valgrind: ==4841== Invalid read of size 4 ==4841== at 0x56BDC80: dri2_initialize (egl_dri2.c:783) ==4841== by 0x56BAFE5: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB15E: _eglMatchDriver (egldriver.c:295) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Address 0x6a05824 is 148 bytes inside a block of size 480 free'd ==4841== at 0x4C2B680: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2AAE: dri2_initialize_x11_swrast (platform_x11.c:1233) ==4841== by 0x56C2AAE: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main ==4841== Block was alloc'd at ==4841== at 0x4C2A868: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==4841== by 0x56C2A47: dri2_initialize_x11_swrast (platform_x11.c:1171) ==4841== by 0x56C2A47: dri2_initialize_x11 (platform_x11.c:1493) ==4841== by 0x56BDCEB: dri2_initialize (egl_dri2.c:805) ==4841== by 0x56BAFAF: _eglMatchAndInitialize (egldriver.c:261) ==4841== by 0x56BB0C9: _eglMatchDriver (egldriver.c:292) ==4841== by 0x56B58C9: eglInitialize (eglapi.c:480) ==4841== by 0x4F537DC: _glfwInitEGL (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F4BEFB: _glfwPlatformInit (in /usr/lib64/libglfw.so.3.2) ==4841== by 0x4F46F40: glfwInit (in 
/usr/lib64/libglfw.so.3.2) ==4841== by 0x402E59: main Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Fixes: `9ee683f877` (egl/dri2: Add reference count for dri2_egl_display) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:29 +01:00
Emil Velikov	6b4b2a4dd6	anv: add genX_multisample.h to the sources list(s). Otherwise it won't end up in the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 19:00:29 +01:00
Kevin Strasser	71258e9462	anv/x11: Add support for Xlib platform Some applications continue to use the Xlib client library and expect that VK_KHR_xlib_surface will be available in the driver. Service these applications by converting the Display pointer to xcb_connection_t and use the existing xcb code in the driver. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-08-15 09:47:06 -07:00
Tapani Pälli	5d9b50e596	glx: apple specific occurrences of dummyContext check Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2016-08-15 09:24:10 +03:00
Bernard Kilarski	2e3f067458	glx: fix error code when there is no context bound v2: change all related NULL checks to check against dummyContext v3: really check for dummyContext only when ctx was from __glXGetCurrentContext v4: cover more checks, add dummyBuffer, dummyVtable (Emil) Signed-off-by: Bernard Kilarski <bernard.r.kilarski@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-08-15 09:24:10 +03:00
Mathias Fröhlich	312ece9cd7	mesa: Remove duplicate include. In api_validate.c stdbool.h was included twice. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	84984b9986	vbo: Remove always true return from vbo_bind_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	72f1566f90	mesa: Move check for vbo mapping into api_validate.c. Instead of checking for mapped buffers in vbo_bind_arrays do this check in api_validate.c. This additionally enables printing the draw call's name into the error string. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	b7b0c51f1f	mesa: Move _mesa_all_buffers_are_unmapped to arrayobj.c. Move the function to check if all vao buffers are unmapped into the vao implementation file. Rename the function to _mesa_all_buffers_are_unmapped. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Mathias Fröhlich	c17cf1c8f5	vbo: Array draw must not care about glBegin/glEnd vbo mapping. In array draw do not check if the vertex buffer object that is used to implement immediate mode glBegin/glEnd is mapped. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-15 07:10:39 +02:00
Ilia Mirkin	5c1ccd8053	nv50,nvc0: fix depth range when halfz is enabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-08-14 17:41:49 -04:00
Ilia Mirkin	c85b7f0e87	gallium/util: add helper to compute zmin/zmax for a viewport state Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-08-14 17:41:33 -04:00
Ilia Mirkin	68b64f32e8	vbo: allow DrawElementsBaseVertex in display lists Looks like it was missed originally. The multi version is there already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97331 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-14 12:06:51 -04:00
Rob Clark	561fd226d4	freedreno/a3xx+a4xx: move common VBOs to fd_context These are the same for a3xx and later. (a2xx could probably use them too, but due to limited hw support and ancient downstream kernels, it isn't so easy to test.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 13:59:03 -04:00
francians@gmail.com	a49fb4ab2d	freedreno/a2xx: add missing casts to silence notices Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 09:37:41 -04:00
Rob Clark	78ba262d00	freedreno/ir3: fix issue with emit_tex() For various tex fetch instructions, coord's get fixed up in different ways. But modifying the array returned from get_src() has side-effects if the same SSA src is used again.. the later instruction will see the previous fixups. Fix this, and const'ify things to prevent this sort of mistake in the future. Noticed by Varad when adding support for txf_ms. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-13 09:33:47 -04:00
Ilia Mirkin	a32c87f74b	glsl: emit a specific error when ast_*_assign changes type For regular ast_add, we can implicitly change either a or b's type. However in an assignment situation, the type of the lvalue is fixed. So if the implicit conversion logic decides to change it, it means that the rhs's type could not be converted to the lhs type. Emit a specific error for this rather than the rather mysterious "is not an lvalue" error that results from having a i2f or other operation as the lvalue. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96729 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-12 22:45:20 -04:00
Ilia Mirkin	d816a51b81	st/mesa: provide GL_OES_copy_image support by caching the original ETC data The additional provision of GL_OES_copy_image is that it work for ETC. However many desktop GPUs don't have native ETC support, so st/mesa does the decoding by hand. Instead of discarding the compressed data, keep it around in CPU memory. Use it when performing image copies. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-08-12 20:21:08 -04:00
Ilia Mirkin	7727e6f67c	st/mesa: refactor duplicated etc fallback checks Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-12 20:21:08 -04:00
Ilia Mirkin	1baae00089	glsl: look for frag data bindings with [0] tacked onto the end for arrays The GL spec is very unclear on this point. Apparently this is discussed without resolution in the closed Khronos bugtracker at https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The recommendation is to allow dropping the [0] for looking up the bindings. The approach taken in this patch is to instead tack on [0]'s for each arrayness level of the output's type, and doing the lookup again. That way, for out vec4 foo[2][2][2] we will end up looking for bindings for foo, foo[0], foo[0][0], and foo[0][0][0], in that order of preference. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-12 20:21:08 -04:00
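The lookup order described in the frag data binding commit above (foo, foo[0], foo[0][0], foo[0][0][0] for a three-level array) can be sketched as a helper that builds the candidate name for a given arrayness level. This is only an illustration of the naming scheme; `binding_candidate` is a hypothetical name and the linker's real lookup code is not shown in the log.

```c
#include <stddef.h>
#include <string.h>

/* Build the binding name to try at a given arrayness level: level 0 is
 * the bare output name, and each further level appends one "[0]".
 * Returns 0 on success, -1 if `buf` is too small. Hypothetical helper. */
static int binding_candidate(const char *name, unsigned level,
                             char *buf, size_t size)
{
    size_t len = strlen(name);

    if (len + 3u * level + 1u > size)
        return -1;

    memcpy(buf, name, len);
    for (unsigned l = 0; l < level; l++) {
        memcpy(buf + len, "[0]", 3);
        len += 3;
    }
    buf[len] = '\0';
    return 0;
}
```

For `out vec4 foo[2][2][2]`, calling the helper with levels 0 through 3 yields the four names in the order of preference given in the commit message.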
Lionel Landwerlin	0294dd00cc	anv: pipeline: gen7: fix assert in debug mode SampleMask is only 8bits long on gen7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97278 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-08-12 17:03:48 -07:00
Haixia Shi	8c56ff643b	mesa: change state query return value for RGB565 The GL_BGR and GL_UNSIGNED_SHORT_5_6_5_REV are not defined anywhere in OpenGL ES 3.2 (or earlier) specification, and there are no known extensions in the Khronos registry that would add these enums as valid responses for glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE) and glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT) queries. Note that this patch does not change the bit layout returned by the query. As defined by the GL spec, the bit layout of GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV are identical. TEST=dEQP-GLES3.functional.state_query.integers.* Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Stéphane Marchesin <marcheu@chromium.org> Change-Id: I81bbc8ccdc7e125edaeae443baf6fa8fdefcc6b6	2016-08-12 15:34:09 -07:00
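The RGB565 commit above states that GL_RGB + GL_UNSIGNED_SHORT_5_6_5 and GL_BGR + GL_UNSIGNED_SHORT_5_6_5_REV share the same bit layout. A small C sketch of the two packings makes that visible: the non-REV type places the first component in the top bits, while the REV type places the first component in the bottom bits. The function names are hypothetical and the sketch covers only the packing arithmetic, not Mesa's readback code.

```c
#include <stdint.h>

/* GL_UNSIGNED_SHORT_5_6_5: first component in bits 15..11,
 * second in bits 10..5, third in bits 4..0. */
static uint16_t pack_565(unsigned c0, unsigned c1, unsigned c2)
{
    return (uint16_t)(((c0 & 0x1fu) << 11) | ((c1 & 0x3fu) << 5) | (c2 & 0x1fu));
}

/* GL_UNSIGNED_SHORT_5_6_5_REV: first component in bits 4..0,
 * second in bits 10..5, third in bits 15..11. */
static uint16_t pack_565_rev(unsigned c0, unsigned c1, unsigned c2)
{
    return (uint16_t)((c0 & 0x1fu) | ((c1 & 0x3fu) << 5) | ((c2 & 0x1fu) << 11));
}

/* GL_RGB supplies components in the order R, G, B. */
static uint16_t rgb_565(unsigned r, unsigned g, unsigned b)
{
    return pack_565(r, g, b);
}

/* GL_BGR supplies components in the order B, G, R. */
static uint16_t bgr_565_rev(unsigned r, unsigned g, unsigned b)
{
    return pack_565_rev(b, g, r);
}
```

Both wrappers end up with red in bits 15..11, green in bits 10..5 and blue in bits 4..0, which is why changing the reported format/type pair does not change the bits returned.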
Anuj Phogat	0bf531aee6	anv/device: Add limits for InterpolationOffset Fixes the vulkan cts regression in test dEQP-VK.api.info.device.properties Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	7f6136d7db	i965: Change 8X MSAA sample mapping This is required following the change in 8X sample positions. Fixes the recently modified multisample-scaled-blit piglit tests. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	fb1bc5007d	i965: Change 8x multisample positions There are no standard sample positions defined in OpenGL and OpenGL ES specs. Implementations have the freedom to pick the positions which give plausible results. But the Vulkan 1.0 spec does define standard sample positions for different sample counts. Defined positions in Vulkan for all the sample counts except 8X match with the positions we set in i965. We have an upcoming plan to share the blorp code between OpenGL and Vulkan driver in near future. Keeping the 8X sample positions same on both the drivers will help us move in that direction. Here is an argument by Neil Roberts (from commit `20250e85`) against any advantage of current 8X sample positions over the new ones: "The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason. arb_gpu_shader5-interpolateAtSample-different Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern." Observed no regressions in jenkins testing. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Anuj Phogat	1fe36d849c	anv: Use macro to avoid code duplication for sample positions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-12 10:45:02 -07:00
Marek Olšák	317e136ef0	st/mesa: BufferData should flag NewDriverState because NewDriverState is filtered depending on active shader states, while st->dirty isn't. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	085aa7f91e	st/mesa: don't update atomic, SSBO, UBO and TBO states that have no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	ac032d800e	st/mesa: _NEW_TEXTURE & CONSTANTS shouldn't flag states that aren't used Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	c323d5b809	st/mesa: when changing shaders, only dirty states that are affected by them This reduces the amount of state processing that has no effect. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:50:01 +02:00
Marek Olšák	8c1775c14c	st/mesa: determine states used or affected by shaders at compile time At compile time, each shader determines which ST_NEW flags should be set at shader bind time. This just sets the new field for all shaders. The next commit will use it. v2: small code unification Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-12 18:49:24 +02:00
Marek Olšák	a7d33315a7	st/mesa: remove TES/TCS/GS state dirtying optimization This will be replaced with a better mechanism. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Marek Olšák	0be30ea1a8	st/mesa: don't update clip state on VS changes if it has no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Marek Olšák	412bd7360c	st/mesa: don't update clip state if it has no effect Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-12 18:47:24 +02:00
Chad Versace	dd93cbc894	mesa: Document that _mesa_enum_to_string() returns non-null (v2) It always returns non-null, even if the number is an invalid enum. Cc: Haixia Shi <hshi@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Change-Id: I26e8843c96130be972e66f48a49e362442e1bf97	2016-08-12 09:09:55 -07:00
Kenneth Graunke	f9f462936a	glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00. Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant" to be specified on both inputs and outputs, and match when linking. New languages only allow outputs to be qualified as "invariant" and remove the "invariant must match" restriction when linking varyings (because no input can have that qualifier). Commit `426a50e208` introduced the new behavior for ES 3.00. It also removed the "must match" restriction for ES 1.00 shaders, which I believe is incorrect. This patch adds that back, as well as making 4.30+ follow the new rules. Thanks to Qiankun Miao for noticing this discrepancy. Fixes a WebGL 2.0 conformance test when run in Chromium: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2 Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:53 -07:00
Kenneth Graunke	0ed316360f	glsl: Tidy stream handling in merge_qualifier(). The previous commit fixed xfb_buffer handling, which was largely copy and pasted from the stream handling. The difference is that stream was set in input_layout_mask, so it worked. However, that's totally rubbish: stream is only valid on geometry shader outputs. Presumably this was to hack around inout. Instead, apply the solution I used in the previous fix. Really, we just need to separate shader interface and parameter qualifier handling so this isn't a mess, but this patch at least tidies it slightly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:48 -07:00
Kenneth Graunke	dffa371665	glsl: Fix inout qualifier handling in GLSL 4.40. inout variables have q.in and q.out set. We were trying to set xfb_buffer = 1 for shader output variables (and inadvertently setting it on inout parameters, too). But input_layout_mask doesn't have xfb_buffer set, so it was seen as an invalid input qualifier. This meant that all 'inout' parameters were broken. Caught by running a WebGL conformance test in Chromium: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2 Fixes Piglit's tests/spec/glsl-4.40/compiler/inout-parameter-qualifier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-11 23:56:40 -07:00
Miklós Máté	17f1c49b9a	swrast: fix active attribs with atifragshader Only include the ones that can be used by the shader. This fixes texture coordinates, which were completely wrong, because WPOS was included in the list of attribs. It also increases performance noticeably. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-11 08:29:23 -06:00
Indrajit Das	8074c6b6ea	st/omx/dec/h264: pass default scaling lists in raster format Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-08-11 16:02:28 +02:00
Jose Fonseca	06b63f1f43	appveyor: Force Visual Studio 2013 image. It seems the default build image is now Visual Studio 2015, and Visual Studio 2013 is not installed.	2016-08-11 14:39:39 +01:00
Jose Fonseca	16627fc87d	appveyor: Install pywin32 extensions. AppVeyor build images seem to have been upgraded to Python 2.7.12, but no longer have pywin32 pre-installed.	2016-08-11 14:39:39 +01:00
Timothy Arceri	33b3815773	glsl/tests: fix segfault in uniform initializer test Caused by `549222f5` Tested-by: Aaron Watry <awatry@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97286	2016-08-11 14:57:18 +10:00
Ian Romanick	50b49d242d	glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says: It is an error to undefine or to redefine a built-in (pre-defined) macro name. The GLSL ES 1.00 spec does not contain this text. Section 3.3 (Preprocessor) of the GLSL 1.30 spec says: #define and #undef functionality are defined as is standard for C++ preprocessors for macro definitions both with and without macro parameters. At least as far as I can tell, GCC allows '#undef __FILE__'. Furthermore, there are desktop OpenGL conformance tests that expect '#undef __VERSION__' and '#undef GL_core_profile' to work. Fixes: GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-10 16:42:02 -07:00
Ian Romanick	eda6349346	glcpp: Track the actual version instead of just the version_resolved flag Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-08-10 16:42:02 -07:00
Timothy Arceri	30e5ff7067	glsl: remove remaining tabs in link_uniform_initializers.cpp Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:38 +10:00
Timothy Arceri	549222f5f8	glsl: use UniformHash to find storage location There is no need to be looping over all the uniforms. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:30 +10:00
Timothy Arceri	82e153daff	glsl: remove dead builtins before assigning varying locations Builtins already have locations assigned so this shouldn't change anything. We want to call it earlier so we can transform GLSL IR to NIR earlier. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:21 +10:00
Timothy Arceri	588702cc41	glsl: split out varying and uniform linking code Here a new function link_varyings_and_uniforms() is created; this should make it easier to follow the code in link_shader(), which was getting very large. Note that the end of the new function contains a for loop with some lowering calls that currently don't seem related to varyings or uniforms, but they are a dependency for converting to NIR earlier, so we move things here now to keep things easy to follow. Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-11 08:33:12 +10:00
Jason Ekstrand	4c3a6b07e2	i965/vec4: Make opt_vector_float reset at the top of each block The pass isn't really control-flow aware and you can get into a case where it tries to combine instructions from different blocks. This can actually lead to an assertion failure when removing unneeded instructions if part of the vector is set in one block and part in another. This prevents regressions in the next commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-10 15:19:55 -07:00
Eric Anholt	ac6966360f	mesa: Use a temporary set to track whether we've added a resource yet. Saves another .1s on servo.trace. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	ee02a5e330	prog_hash_table: Convert to using util/hash_table.h. Improves glretrace -b servo.trace (a trace of Mozilla's servo rendering engine booting, rendering a page, and exiting) from 1.8s to 1.1s. It uses a large uniform array of structs, making a huge number of separate program resources, and the fixed-size hash table was killing it. Given how many times we've improved performance by swapping the hash table to util/hash_table.h, just do it once and for all. This just rebases the old hash table API on top of util/, for minimal diff. Cleaning things up is left for later, particularly because I want to fix up the new hash table API a little bit. v2: Add UNUSED to the now-unused parameter. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	91945f9e91	prog_hash_table: Convert compare funcs to match util/hash_table.h. I'm going to replace this hash table with util/hash_table.h, and the first step is to compare things the same way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Eric Anholt	60f1b436b9	nir: Drop an unused program/hash_table.h include. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-10 12:27:22 -07:00
Tim Rowley	6198160250	swr: [rasterizer core] unused variable warning fixes Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:48 -05:00
Tim Rowley	9aa75e5d46	swr: [rasterizer jitter] add core string to JitManager Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:42 -05:00
Tim Rowley	b311bdf92d	swr: [rasterizer core] fix OOB check of viewport indices Use correct comparison intrinsic for OOB check of viewport indices. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:36 -05:00
Tim Rowley	2eae02f77c	swr: [rasterizer common] add linux definition for InterlockedAdd64 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:22 -05:00
Tim Rowley	e8b35a2321	swr: [rasterizer jitter] add VMASKSTOREPS intrinsic Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:16 -05:00
Tim Rowley	3393279fc9	swr: [rasterizer jitter] add mask support for odd format fetch Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:10 -05:00
Tim Rowley	92621ac5d5	swr: [rasterizer core] routing of viewport indexes through frontend Viewport transform performed based on per-prim viewport index if available. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:09:00 -05:00
Tim Rowley	4e8763cb09	swr: [rasterizer core] split FE and BE stats Separated FE stats out into its own structure. There are 17 FE vs 3 BE stat fields. Since there is only one FE thread per DC then we don't have to loop over all threads and sum up FE stats over all the worker threads. This also reduces size of DC since we only need to store one copy of the FE stats and not one per worker. Finally, we can use the new FE callback mechanism to update these. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:51 -05:00
Tim Rowley	f833b694cd	swr: [rasterizer core] remove all old stats code Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:45 -05:00
Tim Rowley	ad153189ec	swr: [rasterizer core] viewport array support Change viewport matrix storage from AOS to SOA to support viewport arrays. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:40 -05:00
Tim Rowley	d86e2487a0	swr: [rasterizer jitter] fetch support for offsetting VertexID Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:33 -05:00
Tim Rowley	6625fd08db	swr: [rasterizer core] fundamentally change how stats work Add a per draw stats callback to update driver stats. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:23 -05:00
Tim Rowley	047493c198	swr: [rasterizer core] add rasterizerSampleCount to PS context Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:17 -05:00
Tim Rowley	a83beb936e	swr: [rasterizer core] remove cygwin threads.cpp stubs Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:11 -05:00
Tim Rowley	29e1c4a8a9	swr: [rasterizer core] allow override of KNOB thread settings - Remove HYPERTHREADED_FE support - Add threading info as optional data passed to SwrCreateContext. If supplied this data will override any KNOB thread settings. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:08:05 -05:00
Tim Rowley	e0c10306f5	swr: [rasterizer core] add SwrWaitForIdleFE This is a blocking call that waits until all FE work is complete. This is useful for waiting for FE work to complete such as for streamout. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:59 -05:00
Tim Rowley	8dfaf249cc	swr: [rasterizer core] change threadsDone to be a 32-bit value. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:53 -05:00
Tim Rowley	6624e01114	swr: [rasterizer core] update trivial accept test conditions enable/disable raster tile trivial accept test based on scissor enable trait. Can be optimized further. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:47 -05:00
Tim Rowley	7cf187d08a	swr: [rasterizer core] improve implementation for SoWriteOffset 1. SoWriteOffset is no longer treated as a stat 2. Added callback from core to update streamout write offset Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:40 -05:00
Tim Rowley	8d3b20135e	swr: [rasterizer common] make disabled asserts always print (but not break) Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-10 11:07:00 -05:00
Leo Liu	6575ebdc45	vl/rbsp: add a check for emulation prevention three byte This is the case when the "00 00 03" is very close to the beginning of nal unit header v2: move the check to rbsp init Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-10 09:52:44 -04:00
Ilia Mirkin	bc5df3b321	Re-apply "glsl: don't try to lower non-gl builtins as if they were gl_FragData" If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-10 15:43:36 +02:00
Marek Olšák	9c63fd9056	radeonsi: set CB_COLORn_INFO.ROUND_MODE just do what the register spec says Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	667ad9fa3e	radeonsi: set CB_COLORn_INFO.SIMPLE_FLOAT This can help enable some blend optimizations (see the register spec). Vulkan always sets this. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	36057ff12a	radeonsi: disallow MIN/MAX blend equations for dual source blending Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Marek Olšák	947e0614d0	radeonsi: only set dual source blending for MRT0 This is the proper fix for Overlord and Witcher 2 hangs. The hang condition is that 1 app must write to MRT0 and MRT1 from a pixel shader while MRT1 is disabled in CB_TARGET_MASK (does this generate unflushable pixel quads? I don't know), and another app (e.g. Glamor) must enable dual source blending in both MRT0 and MRT1. The hw gets confused, which leads to corruption and hangs. Cc: 12.0 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-10 15:43:36 +02:00
Miklós Máté	88c2fc6b2d	st/mesa: in ATI fs don't assume TEMP0=REG0 The temporaries are allocated dynamically. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-10 15:03:58 +02:00
Trevor Davenport	9a4d5db4d2	st/nine: Fix invalid attempt to use indirect draws. Since commit `6d7177f01b`, radeonsi would take a different path if info->indirect_params was not initialized properly. Nine was not initializing this field. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-10 15:02:20 +02:00
Mathias Fröhlich	0ce5ec8ece	util: Use win32 intrinsics for util_last_bit if present. v2: Split into two patches. v3: Fix off by one problem. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-08-10 09:30:07 +02:00
Marek Olšák	3f100b77f9	gallium/radeon: use unflushed fences for deferred flushes (v2) +23% Bioshock Infinite performance. v2: - use the new fence_finish interface - allow deferred fences with multiple contexts - clear the ctx pointer after a deferred flush Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	1cc95a1255	st/mesa: set the ctx parameter of fence_finish for deferred flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	54272e18a6	gallium: add a pipe_context parameter to fence_finish required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush the context Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	c6043e7d54	st/mesa: use PIPE_USAGE_STREAM for GL_CLIENT_STORAGE_BIT without READ_BIT (v2) v2: keep STAGING for GL_MAP_READ_BIT Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	33a9b4e8a1	gallium/radeon: add HUD queries for mapped VRAM/GTT mainly for monitoring visible VRAM congestion Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	645d395d9a	winsys/radeon: track the amount of mapped memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	1e04483c22	winsys/amdgpu: track the amount of mapped memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	8276776e64	winsys/amdgpu: don't try to unmap userptr buffers no app calls this AFAIK Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	ef836c0d04	gallium/radeon: increase the size of the renderer string Mine is longer than 64 bytes. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	739d526b07	gallium/radeon: implement ARB_clear_texture (v3) Some ideas copied from Jakob Sinclair's implementation, but the color clearing is completely different. v2: remove leftover code, disable conditional rendering disable render condition cleanly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Marek Olšák	7df15389af	gallium/radeon: handle render_condition_enable for clear_rt/ds Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Marek Olšák	a909210131	gallium: add render_condition_enable param to clear_render_target/depth_stencil Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Haixia Shi	a7c6993a33	egl: android: query native window default width and height (v2) On android platform, the width and height of a native window surface may be updated after initialization. It is therefore necessary to query android framework for the current width and height. v2: remove Android specific #ifdef's and just implement the fallback directly if the platform query_surface() callback is not provided. TEST=dEQP-EGL.functional.resize.surface_size#* on cyan-cheets Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> (v1) Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I673f7d2f1d90c3bf572b30f63da537f2cae1496e	2016-08-09 15:49:28 -07:00
Anuj Phogat	c4cd0e8ecd	anv/device: Enable sample shading on gen7+ Passes all 30 min_sample_shading tests in vulkan cts. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	f16295a198	anv/gen7_pipeline: Set multisample state using shared function Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	2ef5063ad7	anv/pipeline: Add sample locations for gen7-7.5 V1: Add multisample positions (Nanley) V2: Fix 8x sample positions to match OpenGL (Anuj) V3: Vulkan has standard sample locations. They need not be same as in OpenGL. (Anuj) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Anuj Phogat	dc49dd7f10	anv/pipeline: Move emit_ms_state() to genX_pipeline_util.h This will help sharing multisample state setting code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-09 14:45:25 -07:00
Mathias Fröhlich	aa920736fe	gallium: Add c99_compat.h to u_bitcast.h We need this for 'inline'. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:56 +02:00
Mathias Fröhlich	027cbf00f2	util: Move _mesa_fsl/util_last_bit into util/bitscan.h As requested with the initial creation of util/bitscan.h now move other bitscan related functions into util. v2: Split into two patches. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 21:20:46 +02:00
Nicolai Hähnle	e4cb3af524	radeonsi: enable multi-draw related pipe caps This enables GL_shader_draw_parameters and GL_ARB_indirect_parameters as well as a properly accelerated implementation of GL_ARB_multi_draw_indirect. Enabling the feature requires sufficiently up-to-date firmware. Such firmware was released a long time ago, although this does mean that the feature only works with the amdgpu kernel module, since the radeon module doesn't have a way to query the firmware version. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	6d7177f01b	radeonsi: program additional multi draw parameters Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	b6c71d37c7	radeonsi: program the DRAWID SGPR Note that for indirect draws, the new MULTI firmware packets are required. There's also no need to reset last_{start_instance,sh_base_reg}, since resetting last_base_vertex is sufficient. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	8dbf2a8570	radeonsi: add DRAWID parameter to vertex shaders Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	febb5dbf72	radeonsi: wire up TGSI_SEMANTIC_BASEINSTANCE Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	d34292a77f	radeonsi: remove an incorrect assertion Byte indices don't need any alignment, so remove this assertion (it got moved into a path where a piglit test hit it during the refactoring of commit `64ff23a58c`). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	2852dedaa0	radeonsi: flush TC L2 cache for indirect draw data This fixes a bug when indirect draw data is generated by transform feedback. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Nicolai Hähnle	76c4a3b567	radeonsi/sid: add additional bits for the DRAW_(INDEX)_INDIRECT_MULTI packets Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:03 +02:00
Brian Paul	60dc36a680	st/mesa: define ST_NEW_ flags as uint64_t values, not enums MSVC doesn't support 64-bit enum values, at least not with C code. The compiler was warning: c:\users\brian\projects\mesa\src\mesa\state_tracker\st_atom_list.h(43) : warning C4309: 'initializing' : truncation of constant value c:\users\brian\projects\mesa\src\mesa\state_tracker\st_atom_list.h(44) : warning C4309: 'initializing' : truncation of constant value ... And at runtime we crashed since the high 32-bits of the 'dirty' bitmask was always 0xffffffff and the 32+u_bit_scan() index went out of bounds of the atoms[] array. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-09 07:50:18 -06:00
Miklós Máté	d9519c6f06	mesa: simplify ff fs generator a bit Literally. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-09 07:46:37 -06:00
Marek Olšák	06b2fd04f6	ddebug: dump driver states and shaders for apitrace calls I think this was an oversight when the PIPE_DUMP flags were added. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-09 15:35:42 +02:00
Timothy Arceri	8c4d9afb7e	nir: make use of nir_cf_list_extract() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-09 13:21:30 +10:00
Matt Turner	b1d9c742e9	nir: Always print non-identity swizzles. Previously we would not print a swizzle on ssa_52 when only its .x component is used (as seen in the definition of ssa_53): vec3 ssa_52 = fadd ssa_51, ssa_51 vec1 ssa_53 = flog2 ssa_52 vec1 ssa_54 = flog2 ssa_52.y vec1 ssa_55 = flog2 ssa_52.z But this makes the interpretation of the RHS of the definition difficult to understand and dependent on the size of the LHS. Just print swizzles when they are not the identity swizzle, so the previous example is now printed as: vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz vec1 ssa_53 = flog2 ssa_52.x vec1 ssa_54 = flog2 ssa_52.y vec1 ssa_55 = flog2 ssa_52.z Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 17:52:35 -07:00
Lionel Landwerlin	8cde4ddbce	anv/pipeline/gen7: Set multisample modes Fixes the following failures : dEQP-VK.api.copy_and_blit.resolve_image.whole_4_bit dEQP-VK.api.copy_and_blit.resolve_image.whole_8_bit dEQP-VK.api.copy_and_blit.resolve_image.partial_4_bit dEQP-VK.api.copy_and_blit.resolve_image.partial_8_bit dEQP-VK.api.copy_and_blit.resolve_image.with_regions_4_bit dEQP-VK.api.copy_and_blit.resolve_image.with_regions_8_bit Tested on IVB/HSW Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 14:44:25 -07:00
Lionel Landwerlin	a3c472a2ec	anv/pipeline: rename info to rs_info in emit_rs_state Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-08 14:44:25 -07:00
Marek Olšák	1ebf3c4b67	Revert "glsl: don't try to lower non-gl builtins as if they were gl_FragData" This reverts commit `a37e46323c`. It broke the game Overlord such that it hung a GCN GPU. While I don't know how the hang happened, because of its randomness and because gfx corruption precedes it, many of the shaders contain this: out vec4 FragData[gl_MaxDrawBuffers];	2016-08-08 23:24:20 +02:00
Tomasz Figa	3723e9826f	egl/android: Add support for YV12 pixel format (v2) This patch adds support for YV12 pixel format to the Android platform backend. Only creating EGL images is supported, it is not added to the list of available visuals. v2: Use const array defined just for YV12 instead of trying to be overly generic. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I4aeb2d67a95c5cdd10b530c549b23146c8f0b983	2016-08-08 14:18:38 -07:00
Kenneth Graunke	3190c7ee97	st/mesa: Make Gallium's BlitFramebuffer follow the GL 4.4 sRGB rules. OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is enabled. This is technically incompatible in certain cases, but is more consistent across GL, ES, and WebGL, and more flexible. The NVIDIA 367.35 drivers appear to follow this behavior. For the awful spec analysis, please read Piglit's tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this is the right rule to implement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-08 14:04:18 -07:00
Kenneth Graunke	f6dc71483a	meta: Make Meta's BlitFramebuffer() follow the GL 4.4 sRGB rules. Just avoid whacking GL_FRAMEBUFFER_SRGB altogether, so we respect the application's setting. This appears to work. v2: Update one more comment (requested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:04:01 -07:00
Kenneth Graunke	ad32dcf630	i965: Make BLORP's BlitFramebuffer follow the GL 4.4 sRGB rules. OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is enabled. This is technically incompatible in certain cases, but is more consistent across GL, ES, and WebGL, and more flexible. The NVIDIA 367.35 drivers appear to follow this behavior. For the awful spec analysis, please read Piglit's tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this is the right rule to implement. Note that ctx->Color.sRGBEnabled is initialized to _mesa_is_gles(ctx), and ES doesn't have enable/disable flags for GL_FRAMEBUFFER_SRGB, so it's effectively on all the time. This means the ES behavior should be unchanged. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	352401f6a9	i965: Make BLORP do sRGB encode/decode on ES 2 as well. This should have no effect, as all drivers which support BLORP also support ES 3.0 - so ES 2.0 would be promoted and follow the ES 3 rules. ES 1.0 doesn't have BlitFramebuffer. This is purely to clarify the next patch a bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	0c7047ab9c	Revert "st/mesa: use sRGB formats for MSAA resolving if destination is sRGB" This reverts commit `4e549ddb50`, dropping the hack from Gallium that I just deleted from i965. See the previous commit for rationale. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	cc27c7fe38	i965: Drop the "do resolves in sRGB" hack. I've never quite understood the purpose of this hack - supposedly, doing resolves in the sRGB colorspace is slightly more accurate. Currently, BlitFramebuffer() ignores sRGB encoding and decoding on OpenGL, although it encodes and decodes in GLES 3.x. The updated OpenGL 4.4 rules also allow for encoding and decoding if GL_FRAMEBUFFER_SRGB is enabled, allowing the application to control what colorspace blits are done in. I don't think this hack makes any sense in such a world - the application can do what it wants, and we shouldn't second guess them. A related Piglit patch, "Make multisample accuracy test set GL_FRAMEBUFFER_SRGB when resolving." makes the Piglit MSAA accuracy test explicitly request SRGB encoding/decoding during resolves when running "srgb" subtests. Without that patch, this commit will regress those tests, but with it, they should continue to work just fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Kenneth Graunke	b1586526e8	i965: Bail on the BLT path if BlitFramebuffer requires sRGB conversion. Modern OpenGL BlitFramebuffer requires sRGB encode/decode when GL_FRAMEBUFFER_SRGB is enabled. The blitter can't handle this, so we need to bail. On Gen4-5, this means falling back to Meta, which should handle it. We allow sRGB <-> sRGB blits, as decode then encode ought to be a noop (other than potential precision loss, which nobody wants anyway). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 14:01:51 -07:00
Tomasz Figa	7dfb1a4074	egl/android: Make get_fourcc() accept HAL formats There are DRI_IMAGE_FOURCC macros, for which there are no corresponding DRI_IMAGE_FORMAT macros. To support such formats we need to make the lookup function take the native format directly. As a side effect, it simplifies all existing calls to this function, because they all called get_format() first to convert from native to DRI_IMAGE_FORMAT. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I4674000fb5ccfd02e38b8fa89bc567ac1d4fc16b	2016-08-08 11:40:41 -07:00
Tomasz Figa	e77b493390	egl/android: Refactor image creation to separate flink and prime paths (v2) This patch splits current dri2_create_image_android_native_buffer() into main entry point and two additional functions, one for creating an image from flink name and one for handling prime FDs using the generic DMA-buf path. This makes the code cleaner and also prepares for disabling flink path more easily in the future. v2: Split into separate patch. Add error messages. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: Ifdfb5927399d56992fe707160423c29278f49172	2016-08-08 11:40:37 -07:00
Tomasz Figa	217af75a40	egl/android: Respect buffer mask in droid_image_get_buffers (v2) Drivers can request different set of buffers depending on the buffer mask they pass to the get_buffers callback. This patch makes droid_image_get_buffers() respect this mask. v2: Return error only in case of real error condition and ignore requests of unavailable buffers. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: I6c3c4eca90f4c618579f6725dec323c004cb44ba	2016-08-08 11:40:31 -07:00
Tomasz Figa	c6c26bc589	egl/android: Remove unused variables in droid_get_buffers_with_format() Fix compilation warnings due to unused variables left after some earlier code changes. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Rob Herring <rob@kernel.org> Reviewed-by: Chad Versace <chad@kiwitree.net> Change-Id: Iec09eb2a62887f3a38dff156756ed8385f3f3447	2016-08-08 11:40:26 -07:00
Jason Ekstrand	52fcc40760	anv/pipeline/gen7: Set the depth format in 3DSTATE_SF Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:46 -07:00
Jason Ekstrand	21d5c1be6a	isl: Add a helper for getting a depth format from an isl_format Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:44 -07:00
Jason Ekstrand	ce980541d5	anv/pipeline: Unify 3DSTATE_RASTER and 3DSTATE_SF setup between gen7 and gen8 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:41 -07:00
Jason Ekstrand	960e8a1260	anv/pipeline/gen8: Set 3DSTATE_SF::StatisticsEnable We've been setting it in gen7 forever but never in gen8; best to make it consistent. This hasn't caused any problems yet because we don't advertise support for statistics queries yet. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:36 -07:00
Jason Ekstrand	12e653adec	anv/pipeline/gen8: Unconditionally set DXMultisampleRasterizationEnable The multisample rasterization mode is computed based on this field, 3DSTATE_RASTER::DXMultisampleRasterizationMode (only for forced multisampling), 3DSTATE_RASTER::APIMode, and the number of samples. There are two tables in the SKL PRM that describe how the final multisample mode is calculated: "Windower (WM) Stage >> Multisampling >> Multisample ModeState >> Table 1" and the formula for "SF_INT::Multisample Rasterization Mode". The "DX Multisample Rasterization Enable" bit changes whether multisample mode is set to OFF_PIXEL or ON_PATTERN in the samples > 1 case. In the samples == 1 case, the bit has no effect. Since Vulkan has no concept of disabling multisampling for samples > 1, we can just set the bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:33 -07:00
Jason Ekstrand	1df511b6f0	anv/pipeline/gen8: Use fewer designated initializers in emit_rs_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:31 -07:00
Jason Ekstrand	6136fb8687	genxml: Make 3DSTATE_SF more consistent between gen7 and gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:28 -07:00
Jason Ekstrand	2d76dcae71	anv/pipeline/gen8: Remove an old comment This is now handled in emit_3dstate_clip Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-08 11:13:04 -07:00
Kenneth Graunke	7314007925	mesa: Skip ES 3.0/3.1 transform feedback primitive counting error. This error condition is not implementable when using tessellation or geometry shaders. The text was also removed from the ES 3.2 spec. I believe the intended behavior is to remove the error condition when either OES_geometry_shader or OES_tessellation_shader are exposed. v2: Quote a better part of issue 13 (suggested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 10:01:30 -07:00
Kenneth Graunke	23b2bcd460	mesa: Share code between _mesa_validate_DrawArrays[_Instanced]. Mostly, I want to share the GLES 3 transform feedback handling, though most of the rest of the code is identical as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 10:01:30 -07:00
Kenneth Graunke	522b5d4566	glsl: Implicitly enable OES_shader_io_blocks if geom/tess are enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	0eaa84e8af	glsl: Expose gl_PointSize if OES/EXT_tessellation_point_size is enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	58709d36d7	glsl: Add extension plumbing for OES/EXT_tessellation_shader. This adds the #extension directive support, built-in #defines, lexer keyword support, and updates has_tessellation_shader(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	722fd10456	mesa: Move tessellation shader gets to GL_CORE, GLES31 section. This makes them available in the GLES 3.1 API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	c8438b62b7	mesa: Add {OES,EXT}_tessellation_shader to the extensions table. Also update _mesa_has_tessellation to know about the new extensions. For now, these are dummy_false, to avoid turning on the extension until everything's in place. Eventually, we'll move them over to the "ARB_tessellation_shader" bit so that any drivers supporting both the desktop extension and ES 3.1 get the feature. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Kenneth Graunke	73554c47e0	mapi: Add PatchParameteriOES and PatchParameteriEXT. The OES_tessellation_shader and EXT_tessellation_shader specifications have suffixed names. These are identical to the core function, so just alias them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-08 09:59:03 -07:00
Nicolai Hähnle	96bbb620a5	radeonsi: add has_draw_indirect_multi flag Prefer to use DRAW_(INDEX)_INDIRECT_MULTI when available in the firmware. Versions for SI and CI already added as provided by the firmware team, but keep in mind that they won't currently be used since the radeon kernel module has no interface to query the firmware version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:06 +02:00
Nicolai Hähnle	5c343cce0f	radeonsi: transpose indirect/index draw dispatch This allows better code sharing for indirect draw calls. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:04 +02:00
Nicolai Hähnle	64ff23a58c	radeonsi: move index buffer calculations in si_emit_draw_packets up Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:02 +02:00
Nicolai Hähnle	cf7d18b75c	radeonsi: unify emitting PKT3_SET_BASE for indirect draws Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:52:59 +02:00
Nicolai Hähnle	e0736c438c	winsys/amdgpu: query ME/PFP/CE firmware versions The radeon kernel module doesn't have the firmware query interface, so the corresponding values will remain 0. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:52:41 +02:00
Nicolai Hähnle	7f5a8dc27e	radeonsi: move spi_ps_input_addr override outside of the loop Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:32 +02:00
Nicolai Hähnle	287822ee33	radeonsi: drop unnecessary u_pstipple.h include Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:29 +02:00
Nicolai Hähnle	3e4c5693a1	radeonsi: do not pass the return type to buffer_load_const Overriding it is not allowed anyway, and actually led to a crash when polygon stippling was used with monolithic shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:51:26 +02:00
Kenneth Graunke	bd1bd03268	glsl: Combine GS and TES array resizing visitors. These are largely identical, except that the GS version has a few extra error conditions. We can just pass in the stage and skip these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:59 -07:00
Kenneth Graunke	398428f406	glsl: Fix location bias for patch variables. We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0. Since "patch" only applies to inputs and outputs, we can just handle this once outside the switch statement, rather than replicating the check twice and complicating the earlier conditions. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:42 -07:00
Kenneth Graunke	1556f16e46	glsl: Fix the program resource names of gl_TessLevelOuter/Inner[]. These are lowered to gl_TessLevel{Outer,Inner}MESA. We need them to appear in the program resource list with their original names and types. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:53:28 -07:00
Kenneth Graunke	4a49851da1	glsl: Delete bogus ir_set_program_inouts assert. This assertion is bogus. Varying structs, and arrays of structs, are allowed by GLSL, and we can see them here. While we currently don't have any partial-variable support for those, simply returning false and marking the entire thing as used is certainly legitimate. I believe this is often swept under the rug by varying packing, but that's disabled in certain tessellation situations. Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:51:21 -07:00
Kenneth Graunke	86915b495b	glsl: Simplify interface qualifier parsing. This better matches the grammar in section 4.3.9 of the GLSL 4.5 spec, and also removes some redundant code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:48:48 -07:00
Kenneth Graunke	d0642c52fc	glsl: Add a has_tessellation_shader() helper. Similar to has_geometry_shader(), has_compute_shader(), and so on. This will make it easier to add more conditions here later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-08-07 23:47:55 -07:00
Marek Olšák	3fb4a9b3b3	Revert "gallium/radeon: count contexts" This reverts commit `b403eb3385`. Not needed.	2016-08-06 17:29:23 +02:00
Marek Olšák	11b1d064a3	radeonsi: add GLSL lit tests They can only be run manually as described in HOW_TO_RUN. It should help catch suboptimal code generation. Some of the tests already fail. v2: rename the tests to *.glsl, fix lit.cfg to find FileCheck Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-06 16:11:43 +02:00
Marek Olšák	35942ee8a8	radeonsi: add a standalone compiler amdgcn_glslc This will be used by GLSL lit tests. For developers only. It shouldn't be distributable and it doesn't use the Mesa build system. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 16:11:39 +02:00
Marek Olšák	ad8af99c86	radeonsi: add environment variable SI_FORCE_FAMILY This will be used by: amdgcn_glslc -mcpu=[family] It can also be used for shader-db if you want stats for a different family. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 16:11:35 +02:00
Marek Olšák	d0646cc745	winsys/radeon: implement cs_get_next_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:31 +02:00
Marek Olšák	63b99590db	winsys/amdgpu: implement cs_get_next_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	04a6cb63aa	gallium/radeon: add cs_get_next_fence winsys callback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	b403eb3385	gallium/radeon: count contexts We don't wanna use unflushed fences when we have multiple contexts. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	16d568d911	gallium/radeon: count gfx IB flushes This will be used as a counter for whether fence_finish needs to flush the IB. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 14:29:30 +02:00
Marek Olšák	c5ff0d3e65	gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	076db67217	gallium/radeon: inline radeon_winsys::query_memory_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	9646ae7799	gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers The following patches will use this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	1c8f17599e	gallium/radeon/winsyses: print CS submission error number Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	0edc2e433e	radeonsi: flush if constant, shader, and streamout buffers use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c3efdeb8dd	radeonsi: flush if sampler views and images use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	d82cfab84c	radeonsi: deal with high vertex buffer memory usage correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	e62caf576e	radeonsi: take compute shader and dispatch indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c56ecb68e7	radeonsi: take scratch buffer and draw indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	ed2254d157	radeonsi: check IB memory usage of CP DMA operations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	f4b977bf3d	gallium/radeon: add r600_resource::vram_usage and gart_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Mathias Fröhlich	62d41162bb	mesa: Copy bitmask of VBOs in the VAO on gl{Push,Pop}Attrib. On gl{Push,Pop}Attrib(GL_CLIENT_VERTEX_ARRAY_BIT) take care that gl_vertex_array_object::VertexAttribBufferMask matches the bound buffer object in the gl_vertex_array_object::VertexBinding array. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2016-08-06 06:27:37 +02:00
Nanley Chery	c495c18b24	anv/gen7_pipeline: Set PixelShaderKillPixel for discards According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader contains a discard instruction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-05 09:53:52 -07:00
Jason Ekstrand	21f357b66e	util/r11g11b10f: Whitespace cleanups Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:06 -07:00
Jason Ekstrand	ffcf8e1049	util/format: Use explicitly sized types Both the rgb9e5 and r11g11b10 formats are defined based on how they are packed into a 32-bit integer. It makes sense that the functions that manipulate them take an explicitly sized type. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:04 -07:00
Jason Ekstrand	c7eb9a7565	util/rgb9e5: Get rid of the float754 union There are a number of reasons for this refactor. First, format_rgb9e5.h is not something that a user would expect to define such a generic union. Second, defining it requires checking for endianness which is ugly. Third, 90% of what we were doing with the union was float <-> uint32_t bitcasts and the remaining 10% can be done with a simple left-shift by 23. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:07:01 -07:00
Jason Ekstrand	cda8d95660	util/format_rgb9e5: Get rid of the rgb9e5 union The rgb9e5 format is a packed format defined in terms of slicing up a single 32-bit value. The bitfields are far more confusing than simple shifts and require that we check the endianness. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:59 -07:00
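The two rgb9e5 cleanups above (dropping the float754 union and the rgb9e5 bitfield union) can be illustrated with a minimal sketch. It assumes the standard shared-exponent layout (9-bit red, green and blue mantissas in bits 0-26, a 5-bit exponent in bits 27-31, exponent bias 15). All names prefixed with sketch_ or SKETCH_ are hypothetical and are not the helpers that live in format_rgb9e5.h.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative constants and names only; not the ones used in Mesa. */
#define SKETCH_RGB9E5_EXP_BIAS      15
#define SKETCH_RGB9E5_MANTISSA_BITS 9

/* float <-> uint32_t bitcasts via memcpy: no union, no bitfield layout,
 * and therefore no endianness check for the bitfield order. */
static inline float sketch_bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

static inline uint32_t sketch_float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}

/* Unpack one rgb9e5 texel using plain shifts and masks. */
static inline void sketch_rgb9e5_to_float3(uint32_t packed, float out[3])
{
   /* The shared exponent sits in the top 5 bits. */
   int exponent = (int)(packed >> 27)
                - SKETCH_RGB9E5_EXP_BIAS - SKETCH_RGB9E5_MANTISSA_BITS;

   /* Build 2^exponent directly: the IEEE 754 single-precision exponent
    * field starts at bit 23 and is biased by 127. */
   float scale = sketch_bits_to_float((uint32_t)(exponent + 127) << 23);

   out[0] = (float)( packed        & 0x1ff) * scale;
   out[1] = (float)((packed >>  9) & 0x1ff) * scale;
   out[2] = (float)((packed >> 18) & 0x1ff) * scale;
}
```

The unbiased exponent ranges from -24 to 7 here, so the value shifted into the exponent field always describes a normal float and the left-shift by 23 needs no special cases.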
Jason Ekstrand	f29fd7897a	util: Move format_r11g11b10f.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:57 -07:00
Jason Ekstrand	6c665cdfc5	util: Move format_rgb9e5.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:31 -07:00
Andres Gomez	591869e921	glsl: fix indentation, comments and line lengths in ast_function.cpp Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:11 +03:00
Andres Gomez	8f98a120f3	glsl: apply_implicit_conversion is static again Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:11 +03:00
Andres Gomez	1443c10d74	glsl: struct constructors/initializers only allow implicit conversions When an argument for a structure constructor or initializer doesn't match the expected type, only Section 4.1.10 “Implicit Conversions” are allowed to try to match that expected type. From page 32 (page 38 of the PDF) of the GLSL 1.20 spec: " The arguments to the constructor will be used to set the structure's fields, in order, using one argument per field. Each argument must be the same type as the field it sets, or be a type that can be converted to the field's type according to Section 4.1.10 “Implicit Conversions.”" From page 35 (page 41 of the PDF) of the GLSL 4.20 spec: " In all cases, the innermost initializer (i.e., not a list of initializers enclosed in curly braces) applied to an object must have the same type as the object being initialized or be a type that can be converted to the object's type according to section 4.1.10 "Implicit Conversions". In the latter case, an implicit conversion will be done on the initializer before the assignment is done." v2: Remove also the now redundant constant conversion, the constant_record_constructor helper and the replacement code (Timothy). Fixes GL44-CTS.shading_language_420pack.initializer_list_negative Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Andres Gomez	de60d549b9	glsl: Refactor implicit conversion into its own helper v2: Refactor also the conversion to constant and replacement code (Timothy). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Andres Gomez	af796d756e	glsl/types: disallow implicit conversions before GLSL 1.20 Implicit conversions were added in the GLSL 1.20 spec version. v2: Join the checks for GLSL 1.10 and ESSL (Timothy). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-08-05 14:27:03 +03:00
Kenneth Graunke	875341c69b	i965: Rework the unlit centroid workaround. Previously, for every input, we moved the dispatch mask to the flag register, then emitted two predicated PLN instructions, one with centroid barycentric coordinates (for normal pixels), and one with pixel barycentric coordinates (for unlit helper pixels). Instead, we can simply emit a set of predicated MOVs at the top of the program which copy the pixel barycentric coordinates over the centroid ones for unlit helper pixel channels. Then, we can just use normal PLNs. On Sandybridge: total instructions in shared programs: 7538470 -> 7534500 (-0.05%) instructions in affected programs: 101268 -> 97298 (-3.92%) helped: 705 HURT: 9 (all of which are SIMD16 programs) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-05 01:43:52 -07:00
Tim Rowley	b521083ffb	swr: [rasterizer core] static analysis fixes for conservative rast Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	68dc544879	swr: [rasterizer core] implement InnerConservative input coverage Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	4034f48833	swr: [rasterizer core] remove CanEarlyZ function Test is now in SetupPipeline. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	b365989875	swr: [rasterizer core] use 32x32 macrotile for openswr Significant performance increase (up to 2x) on high geometry workloads. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	5f4bc9e85b	swr: [rasterizer fetch] add support for 24bit format fetch Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	527d45c8fe	swr: [rasterizer fetch] additional fetch format support Add support for 0 pitch in fetch. Add support for USCALE/SSCALE for 32bit integer fetches. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	f438b7ba81	swr: [rasterizer jitter] fix potential jit exit crash Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	57b07498d2	swr: [rasterizer core] update sync handling Sync now uses a callback to ensure that it's called by the last thread moving past a DC. This will help with the new counter handling. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	191786d0f4	swr: [rasterizer core] rename variable Avoid nested declarations of the same name within a single function. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:37 -05:00
Tim Rowley	61cc012e9a	swr: [rasterizer jitter] adjust extern "C" block scope Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:31 -05:00
Tim Rowley	9f7d99fcfe	swr: [rasterizer core] conservative rast degenerate handling Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:25 -05:00
Tim Rowley	f01827a469	swr: [rasterizer core] allow hexadecimal for integer knobs Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 13:52:12 -05:00
Eric Anholt	49741e1cd2	mesa: Dynamically allocate the matrix stack. By allocating and initializing the matrices at context creation, the OS couldn't even overcommit the pages. This saves about 63k (out of 946k) of maximum memory size according to massif on simulated vc4 glsl-algebraic-add-add-1. It also means we could potentially relax the maximum stack sizes, but that should be a separate commit. v2: Drop redundant Top update, explain why the stack is small at init time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-04 08:52:11 -07:00
Eric Anholt	2a808219b3	state_tracker: Initialize the draw context only when needed. It's only used for rarely-used deprecated GL features (feedback/rasterpos), so we can skip the memory allocation and initialization for it most of the time. Saves about 659k (out of 1605k) of maximum memory size according to massif on simulated vc4 glsl-algebraic-add-add-1 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-04 08:48:27 -07:00
Eric Anholt	c976e164d2	vc4: Move scalarizing and some lowering to link time. This works out to be a wash in terms of memory usage: We use more memory to store the separate ALU instructions, but we optimize out a lot of code as well. The main result, though, is that we do more of our work at link time rather than draw time.	2016-08-04 08:48:27 -07:00
Eric Anholt	2350569a78	vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far. We don't want to bake the whole array into the FS key, because of the hashing overhead. But we can keep a set of the arrays seen, and use a pointer to the copy in as the array's proxy. Between this and the previous patch, gl-1.0-blend-func now passes on hardware, where previously it was filling the 256MB CMA area with shaders and OOMing. Drops 712 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	62ea2461ed	vc4: Don't recompile the CS when the FS changes. The compiled_fs_id is a proxy for the vc4->prog.fs->input_slots[], but only the VS dereferences it. Drops 754 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	d577dbc201	vc4: Move FS inputs setup out to a helper function. It's a pretty big block, and I was about to make it bigger.	2016-08-04 08:48:27 -07:00
Kenneth Graunke	144cbf8987	nir: Make nir_opt_remove_phis see through moves. I found a shader in Tales of Maj'Eyal that contains: if ssa_21 { block block_1: /* preds: block_0 */ ...instructions that prevent the select peephole... vec1 32 ssa_23 = imov ssa_4 vec1 32 ssa_24 = imov ssa_4.y vec1 32 ssa_25 = imov ssa_4.z /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ vec1 32 ssa_26 = imov ssa_4 vec1 32 ssa_27 = imov ssa_4.y vec1 32 ssa_28 = imov ssa_4.z /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ vec1 32 ssa_29 = phi block_1: ssa_23, block_2: ssa_26 vec1 32 ssa_30 = phi block_1: ssa_24, block_2: ssa_27 vec1 32 ssa_31 = phi block_1: ssa_25, block_2: ssa_28 Here, copy propagation will bail because phis cannot perform swizzles, and CSE won't do anything because there is no dominance relationship between the imovs. By making nir_opt_remove_phis handle identical moves, we can eliminate the phis and rewrite everything to use ssa_4 directly, so all the moves become dead and get eliminated. I don't think we need to check "exact" - just the alu sources. Presumably phi sources should match in their exactness. On Broadwell: total instructions in shared programs: 11639872 -> 11638535 (-0.01%) instructions in affected programs: 134222 -> 132885 (-1.00%) helped: 338 HURT: 0 v2: Fix return value to be NULL, not false (caught by Iago). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:42:12 -07:00
Kenneth Graunke	7603b4d3a1	nir: Make nir_alu_srcs_equal non-static. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:41:07 -07:00
Kenneth Graunke	6aa730000f	nir: Turn imov/fmov of undef into undef. On Broadwell: total instructions in shared programs: 11640214 -> 11639872 (-0.00%) instructions in affected programs: 17744 -> 17402 (-1.93%) helped: 78 HURT: 0 total spills in shared programs: 2924 -> 2922 (-0.07%) spills in affected programs: 104 -> 102 (-1.92%) helped: 1 HURT: 0 total fills in shared programs: 4394 -> 4389 (-0.11%) fills in affected programs: 237 -> 232 (-2.11%) helped: 1 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:40:59 -07:00
Kenneth Graunke	12a912586f	i965: Use a separate register for every access to an SSA undef. Previously, we allocated a new VGRF for every undefined definition. Instead, this patch makes us allocate a new VGRF for every use of an undefined definition. This makes sure that undefined values are fully independent of one another, and have live ranges limited to their single use. This allows register coalescing to combine the source and destination of MOVs from undefined sources, eliminating the MOV altogether. On Broadwell: total instructions in shared programs: 11641187 -> 11640214 (-0.01%) instructions in affected programs: 70199 -> 69226 (-1.39%) helped: 213 HURT: 1 v2: Add a comment (based on Iago's suggested one). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-04 00:40:10 -07:00
Michel Dänzer	67c5e843b9	vl/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client performs several video decoding sessions using the same window. v2: Based on Chris Wilson's review: * Use xcb_discard_reply() instead of free(xcb_request_check()) Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>	2016-08-04 15:45:43 +09:00
Michel Dänzer	5d191bafa2	loader/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client ends up creating and destroying DRI drawables for the same window. v2: Based on Chris Wilson's review: * Use xcb_present_select_input_checked so that protocol errors generated by old X servers can be handled gracefully * Use xcb_discard_reply() instead of free(xcb_request_check())	2016-08-04 15:45:43 +09:00
Ben Widawsky	1743c4184b	gbm: Correct bo_import documentation (trivial) Missed here: commit `a43d286ef7` Author: Kristian Høgsberg <krh@bitplanet.net> Date: Fri Mar 28 10:17:11 2014 -0700 gbm: Add import from fd Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-03 10:56:41 -07:00
Eric Anholt	bc1fc9c985	vc4: Avoid generating a custom shader per level in glGenerateMipmaps(). We were baking in the LOD of the source level to each shader. Instead, pass it in as a uniform -- this requires storing it to a temp register, but that's better than compiling a ton of separate shaders: total instructions in shared programs: 115032 -> 115036 (0.00%) instructions in affected programs: 96 -> 100 (4.17%) LOST: 572	2016-08-03 10:55:54 -07:00
Eric Anholt	e97e9e62a1	vc4: Tell valgrind about BO allocations from mmap time to destroy. This helps in debugging memory pressure. It would be nice if we could tell valgrind about it all the way from allocation time to destroy, but we need a pointer to hand to VALGRIND_MALLOCLIKE_BLOCK.	2016-08-03 10:28:20 -07:00
Jan Ziak	fd32868590	loader: fix memory leak in loader_dri3_open Found via "valgrind --leak-check=full glxgears". Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Acked-by: Boyan Ding <boyan.j.ding@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-08-03 10:25:09 -07:00
Eric Anholt	a0671d67de	vc4: Fix a leak of the src[] array of VPM reads in optimization. Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-03 10:25:09 -07:00
Eric Anholt	9f95690959	vc4: Fix leak of the bo_handles table.	2016-08-03 10:25:08 -07:00
Eric Anholt	02f8c444e8	vc4: Fix handling of UBO range offsets. The ranges are in units of bytes, not dwords. This wasn't caught by piglit tests because ttn tends to make one big uniform file, so we only had one UBO range with a src and dst offset of 0.	2016-08-03 10:25:08 -07:00
Eric Anholt	9128acfb57	nir: Allow opt_peephole_select to work on empty blocks. nir_opt_peephole_select has the job of removing IF statements with no side effects. However, if the IF statement's successor didn't have any instructions in it, we were skipping it, which occurred in mupen64 on vc4 with glsl_to_nir enabled: instructions in affected programs: 6134 -> 4120 (-32.83%) total uniforms in shared programs: 38268 -> 38219 (-0.13%) No changes on Haswell shader-db. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-03 10:25:08 -07:00
Eric Anholt	36b9eb82c1	vc4: Dump NIR at shader state creation time as well. I keep wanting to see this version of the NIR.	2016-08-03 10:25:08 -07:00
Marek Olšák	435d9595d3	r600g: use last_gfx_fence like radeonsi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	a6bfafa083	gallium/radeon: move last_gfx_fence from radeonsi to common code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c15a9dec29	radeonsi: skip unnecessary si_update_shaders calls Small decrease in draw call overhead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c2a0e99169	radeonsi: print the command line to VM fault reports (v2) v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	6573ad69ef	ddebug: print the command line to all logs (v2) for piglit with the pipelined hang detection mode v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	840353059a	ddebug: don't use fmemopen on non-Linux OS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97140 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c88b309fd5	radeonsi: don't set the last parameter component of llvm.AMDGPU.cube LLVM doesn't use it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	42c5f839ad	radeonsi: use llvm.amdgcn.cube* if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	1fb6e55eaf	radeonsi: use llvm.amdgcn.rsq.f64 if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	db2d31dab1	radeonsi: use v_mad_f32 for fma v_fma_f32 runs at FP64 rate (= slow). Alien Isolation and F1 2015 seem to use fma for all d3d multiply-add instructions, which is silly. This tries to restore performance for those games. The main difference between v_mad_f32 and v_fma_f32 is that v_mad doesn't support denormals, which we don't enable anyway, because they are slow too. Also, there is code size reduction: Totals from affected shaders: VGPRS: 109796 -> 109808 (0.01 %) Spilled SGPRs: 29995 -> 30022 (0.09 %) Spilled VGPRs: 12 -> 13 (8.33 %) <-- it's just one shader going from 12 to 13 Code Size: 6667596 -> 6476356 (-2.87 %) bytes Max Waves: 26931 -> 26899 (-0.12 %) I've not actually tested real performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Haixia Shi	4c4bfed670	i965: use mt->offset in intel_miptree_map_movntdqa() We need to include mt->offset in the calculation of src pointer because its value may be non-zero, for example in a cubemap texture. Signed-off-by: Haixia Shi <hshi@chromium.org> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad@kiwitree.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Change-Id: I461ad5b204626d5a1c45611fc6b63735dcf29f63	2016-08-03 08:28:52 -07:00
Timothy Arceri	6fb6201f71	nir: fix validation message Looks like a copy and paste error from `f752effa08` Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-08-03 09:31:57 +10:00
Chad Versace	2d788a9181	.mailmap: Update my address I left Intel, so make my personal address the canonical address.	2016-08-02 13:29:53 -07:00
Tim Rowley	11072de368	swr: build swr with -fno-strict-aliasing swr rasterizer contains numerous data transfers between vectors and ordinary C types. Fixing for strict aliasing will take time. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-02 14:30:33 -05:00
Andres Gomez	3356ac208b	ast: Updated AST_NUM_OPERATORS for coherence with ast_operators AST_NUM_OPERATORS stores the dimension of the ast_operators enumeration but was not updated after its last modification. This doesn't add any real modification for any code paths but it makes sense for coherence. v2 (Eric Engestrom): Just place the define at the end of the enumeration, not below. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-08-02 21:33:03 +03:00
Matt Turner	c3211ae093	i965: Disable the unlit centroid workaround on Gen7. Once upon a time (commit `8313f44409`) Paul added code for the unlit centroid workaround (WaCopyUnlitCentroidBarys). His commit message claims it fixed the EXT_framebuffer_multisample/interpolation {2,4} {centroid-deriv,centroid-deriv-disabled} piglit tests but does not say on which platform, though he cites the IVB PRM. "3DSTATE_WM [DevIVB, DevHSW]" says "[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2X2 pixels have unlit pixels." I later disabled it for Haswell (commit `f6db414f3c`) with no known ill effects. The Sandybridge page does not have this text, but the workarounds database (see WaCopyUnlitCentroidBarys) says the issue applies only to Sandybridge, and in fact in commit `1a2de7dce8` I note that disabling the workaround on Sandybridge causes the tests Paul originally mentioned to fail. So this is, and always has been, a huge confusing mess. Disabling the workaround indeed causes the tests Paul originally mentioned to fail on Sandybridge but not on Ivybridge/Baytrail. On Ivybridge: total instructions in shared programs: 6914901 -> 6909599 (-0.08%) instructions in affected programs: 106766 -> 101464 (-4.97%) helped: 884 total cycles in shared programs: 70874764 -> 70813774 (-0.09%) cycles in affected programs: 794144 -> 733154 (-7.68%) helped: 688 HURT: 186 LOST: 1 GAINED: 6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-02 10:37:13 -07:00
Marek Olšák	6db93cd167	gallium/util: fix align64 it cut off the upper 32 bits Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-01 23:28:14 +02:00
Matt Turner	88ad8c7ded	mesa: Drop -fno-strict-aliasing. Improves performance of OglBatch7 by 4.06851% +/- 1.17925% (n=169) on Haswell, and cuts ~18k of .text: text data bss dec hex filename 5824627 287816 29384 6141827 5db783 before/i965_dri.so 5806354 287816 29384 6123554 5d7022 after/i965_dri.so Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-08-01 12:09:17 -07:00
Matt Turner	12a14052e8	i915: Avoid aliasing violation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-08-01 12:09:17 -07:00
Matt Turner	be35c6ba92	draw: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	8e68f35d32	r600g: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	d2838f77ec	r300g: Avoid aliasing violation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	16ff8f9ae8	gallium/auxiliary: Add u_bitcast.h header. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	bbe012f02a	glsl_to_tgsi: Avoid aliasing violations. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-08-01 12:09:17 -07:00
Brian Paul	500a3dd11f	st/mesa: silence missing braces warning in st_program.c Silence a gcc warning: state_tracker/st_program.c: In function 'st_create_fp_variant': state_tracker/st_program.c:957:10: warning: missing braces around initializer [-Wmissing-braces] nir_lower_drawpixels_options options = {0}; ^ state_tracker/st_program.c:957:10: warning: (near initialization for 'options.texcoord_state_tokens') [-Wmissing-braces] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:20:19 -06:00
Brian Paul	13fa051356	auxiliary/os: add new os_get_command_line() function This can be used by the driver to get the command line which started the process. Will be used by the VMware driver for extra logging. For now, this is only implemented for Linux via /proc/self/cmdline and Windows via GetCommandLine(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:20:19 -06:00
Charmaine Lee	c2b4942afc	svga: avoid redundant SetVertexBuffer/SetIndexBuffer commands at rebind This patch eliminates the redundant SetVertexBuffers and SetIndexBuffer commands that are emitted for rebind purpose. With this patch, the set commands will be skipped, but we will still reference the associated resources to allow the kernel to bring in the resources. Tested with Lightsmark2008, Valley, MTT glretrace, piglit, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-08-01 12:20:19 -06:00
Rob Clark	53b2b8bf6f	u_vbuf: fix potentially bogus assert There are cases where we hit u_vbuf path due to alignment or pitch-alignment restrictions, but for an output-format that u_vbuf does not support translating (yet the driver does support natively). In which case we hit the memcpy() path and don't care that u_vbuf doesn't understand it. Fixes crash with debug build of mesa in: dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.user_ptr_stride17_components2_quads1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95000 Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 13:42:11 -04:00
Ben Widawsky	e7c8c85785	gbm: Removed unused function. AFAICT, it's never been used. It was briefly nudged in the right direction here: commit `10e5ffd496` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Sat Jan 25 17:19:10 2014 +0000 gbm: do not export _gbm_mesa_get_device Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2016-08-01 09:11:14 -07:00
Timothy Arceri	cec377eed3	i965: fix comparison warning Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-01 14:52:07 +10:00
Eric Anholt	26ff7e373f	vc4: Zero-initialize the hardware sampler view structure. Fixes failure to initialize the force_first_level flag, causing failures in piglit levelclamp.	2016-07-31 19:23:03 -07:00
Mathias Fröhlich	b730960e77	mesa: Remove set but not used gl_client_array::Stride. The field is only read for printing today and there it was probably a leftover. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	56c65cd315	mesa: Remove set but not used gl_client_array::Enabled. The way it is used today does not care about the Enabled flag anymore. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	43a6f435ca	vbo: Use the VAO array enabled flags in vbo_exec_array. Instead of gl_client_array::Enabled inside a VAO, directly use the gl_vertex_attrib_array::Enabled value which is the origin of the above. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	4cda690019	vbo: Walk the VAO in check_array_data. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:46 +02:00
Mathias Fröhlich	99b42184f9	vbo: Walk the VAO in print_draw_arrays. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	eec516d8e1	mesa: Walk the VAO in _mesa_print_arrays. Only a debugging function, but move away from gl_client_array and use the first order information from the VAO. Also make use of gl_vert_attrib_name. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	144737a498	vbo: Walk the VAO to check for mapped buffers. Similarly to _mesa_all_varyings_in_vbos, walk the VAO to check if we have an illegal mapped buffer object instead of walking all gl_client_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	3f5e5696fe	vbo: Walk the VAO to see if all varyings are in vbos. In vbo_draw_transform_feedback we currently look at exec->array.inputs to determine if all varying vertex attributes reside in vbos. But the vbo_bind_arrays call only happens past the vbo_all_varyings_in_vbos query. Thus we may work on a stale set of client arrays. Using the current VAO's content for this query feels much more logical to me. Additionally with this change mesa makes more use of the information already tracked in the VAO instead of looping across VERT_ATTRIB_MAX vertex arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	f8be969b1b	mesa: Implement _mesa_all_varyings_in_vbos. Implement the equivalent of vbo_all_varyings_in_vbos for vertex array objects. v2: Update comment. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Mathias Fröhlich	f7cb46a972	mesa: Unbind deleted vbo using _mesa_bind_vertex_buffer. When a vertex buffer object gets deleted, it is unbound at the VAO. To do this use _mesa_bind_vertex_buffer instead of plain unreferencing the buffer object. This keeps the VAO's internal state consistent. In this case it showed up with gl_vertex_array_object::VertexAttribBufferMask getting out of sync. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-31 10:05:45 +02:00
Timothy Arceri	f696b712d7	glsl: be more strict on block qualifiers V2: Add spec references and allow patch qualifier (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96528	2016-07-31 09:24:45 +10:00
Timothy Arceri	d3dc1b8b5e	glsl: add name param to validate_flags() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-31 09:24:45 +10:00
Timothy Arceri	2262fe4081	glsl: add component to ast_type_qualifier::validate_flags This was added with ARB_enhanced_layouts. V2: Add an extra format specifier for the new qualifier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-31 09:24:45 +10:00
Timothy Arceri	bbe839379a	docs: Add GL4.4 and ARB_enhanced_layouts to the release notes Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-31 08:19:21 +10:00
Kenneth Graunke	b5661c1d70	anv: Perform rasterizer discard in the SOL stage instead of the clipper. See commit `b0629e6894`, where we discovered that the SOL stage's "Rendering Disable" feature is a lot faster at throwing away all geometry than the clipper's "reject all" mode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-30 12:06:37 -07:00
Roland Scheidegger	99a47391e4	Revert "gallium/util: fix resource leak" This reverts commit `d1fe26a628`. Replacing a resource leak with a segfault isn't the solution.	2016-07-30 18:18:09 +02:00
Eric Engestrom	d1fe26a628	gallium/util: fix resource leak CovID: 401540 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-30 17:27:42 +02:00
francians@gmail.com	e713a9e613	freedreno/a4xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:42 -04:00
francians@gmail.com	43492c7f2c	freedreno/a3xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:31 -04:00
francians@gmail.com	089cc74b6a	freedreno/a2xx: fix comparison out of range warnings Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:25:16 -04:00
francians@gmail.com	3fa68fdc90	freedreno/ir3: init ir3_shader_key with memset() To silence missing initializers warning Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:24:59 -04:00
Eric Engestrom	a63bac9271	gallium/freedreno: move cast to avoid integer overflow Previously, the bitshift would be performed on a simple int (32 bits on most systems), overflow, and then be cast to 64 bits. CovID: 1362461 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Eric Engestrom	3563c4d161	freedreno/a2xx: remove duplicate assignment CovID: 1362445, 1362446 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	2d64a003c5	freedreno: defer flush_queue allocation Some apps, like warsow, create a bazillion contexts but don't render on most of them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	4175606474	freedreno: add some hw query traces Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e684c32d2f	freedreno: some locking Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	010e4b2d52	os: add pipe_mutex_assert_locked() Would be nice if we could also have lockdep, like in the linux kernel. But this is better than nothing. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9f0eb69527	freedreno: drop needs_rb_fbd We need to emit RB_FRAME_BUFFER_DIMENSION once per batch.. tracking this in fd_context is wrong when the gmem code executes asynchronously from the flush_queue worker. But in fact we don't really need to track it at all. We cannot assume previous value at the beginning of the batch (because of other processes potentially using the GPU), so just drop the tracking and emit it in _tile_init(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e6bfe1c773	freedreno: move needs_wfi into batch This is also used in gmem code, which executes from the "bottom half" (ie. from the flush_queue worker thread), so it cannot be in fd_context. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	0739bbceec	freedreno: a bit of micro-optimization Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	e1b1052700	freedreno: drop mem2gmem/gmem2mem query stages They weren't really used, and it gets somewhat more complicated to deal with if batches are flushed asynchronously (on another thread). So just drop them, and move _query_set_state(NULL) call into batch (so it is not happening on background thread). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	00bed8a794	freedreno: threaded batch flush With the state accessed from GMEM+submit factored out of fd_context and into fd_batch, now it is possible to punt this off to a helper thread. And more importantly, since there are cases where one context might force the batch-cache to flush another context's batches (ie. when there are too many in-flight batches), using a per-context helper thread keeps various different flushes for a given context serialized. TODO as with batch-cache, there are a few places where we'll need a mutex to protect critical sections, which is completely missing at the moment. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	c44163876a	freedreno: track batch/blit types Add a bit of extra book-keeping about blits and back-blits (from resource shadowing). If the app uploads all mipmap levels, as opposed to uploading the first level and then glGenerateMipmap(), we can discard the back-blit (as opposed to being naive and shadowing the resource for each mipmap level). Also, after a normal blit, we might as well flush the batch immediately, since there is not likely to be further rendering to the surface. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	7f8fd02dc7	freedreno: re-order support for hw queries Push query state down to batch, and use the resource tracking to figure out which batch(es) need to be flushed to get the query result. This means we actually need to allocate the prsc up front, before we know the size. So we have to add a special way to allocate an un-backed resource, and then later allocate the backing storage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	10baf05b2c	freedreno: use prsc for hw queries Switch to using a pipe_resource (rather than an fd_bo directly) for hw query result buffers. This is first step towards making queries work properly with reordered batches, since we'll need the additional dependency tracking to know which batches to flush. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	ba30096888	freedreno: support discarding previous rendering in special cases Basically, to "DCE" blits triggered by resource shadowing, in cases where the levels are immediately completely overwritten. For example, mid-frame texture upload to level zero triggers shadowing and back-blits to the remaining levels, which are immediately overwritten by glGenerateMipmap(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	7105774bab	freedreno: shadow textures if possible to avoid stall/flush To make batch re-ordering useful, we need to be able to create shadow resources to avoid a flush/stall in transfer_map(). For example, uploading new texture contents or updating a UBO mid-batch. In these cases, we want to clone the buffer, and update the new buffer, leaving the old buffer (whose reference is held by cmdstream) as a shadow. This is done by blitting the remaining other levels (and whatever part of current level that is not discarded) from the old/shadow buffer to the new one. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	dcde4cd114	freedreno: spiff up some debug traces Make it easier to track batches, to ensure things happen properly when they are reordered. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9f219c7047	freedreno: add batch-cache and batch reordering Note that I originally also had an entry-point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy-enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	f02a64dbdd	freedreno: move more batch related tracking to fd_batch To flush batches out of order, the gmem code needs to not depend on state from fd_context (since that may apply to a more recent batch). So this all moves into batch. The one exception is the gmem/pipe/tile state itself. But this is only used from gmem code (and batches are flushed serially). The alternative would be having to re-calculate GMEM layout on every batch, even if the dimensions of the render targets are the same. Note: This opens up the possibility of pushing gmem/submit into a helper thread. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	eeafaf2d37	freedreno: dynamically sized/growable cmd buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9e4561d3c4	freedreno: push resource tracking down into batch Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Rob Clark	9bbd239a40	freedreno: introduce fd_batch Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-30 09:23:42 -04:00
Marek Olšák	12aec78993	mesa: remove dd_function_table::UseProgram finally unused Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	b47839ad83	st/mesa: update sampler states when shaders are changed This bug seems to have always been there. Applications changing shaders but not textures between draw calls would have gotten undefined behavior. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	c7954b130a	st/mesa: don't dirty sample shading on _NEW_PROGRAM Already done as part of ST_NEW_FRAGMENT_PROGRAM in st_validate_state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	79dcd69afa	st/mesa: remove excessive shader state dirtying This just needs to be done by st_validate_state. v2: add "shaders_may_be_dirty" flags for not skipping st_validate_state on _NEW_PROGRAM to detect real shader changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	1f73e2bb94	st/mesa: unreference optional shaders when unbinding Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	0a46e6f410	st/mesa: skip updates of states that have no effect v2: - also don't check edge flags for GLES Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	c8fe3b9dca	st/mesa: completely rewrite state atoms The goal is to do this in st_validate_state: while (dirty) atoms[u_bit_scan(&dirty)]->update(st); That implies that atoms can't specify which flags they consume. There is exactly one ST_NEW_* flag for each atom. (58 flags in total) There are macros that combine multiple flags into one for easier use. All _NEW_* flags are translated into ST_NEW_* flags in st_invalidate_state. st/mesa doesn't keep the _NEW_* flags after that. torcs is 2% faster between the previous patch and the end of this series. v2: - add st_atom_list.h to Makefile.sources Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	53bc28920a	st/mesa: remove st_tracked_state::name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Marek Olšák	f2adba4a4c	st/mesa: remove atom debugging code This won't be needed after the rewrite. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-30 15:02:14 +02:00
Kenneth Graunke	ebdc82d065	i965: Fix move_interpolation_to_top() pass. The pass I introduced in commit `a2dc11a781` was entirely broken. A missing "break" made the load_interpolated_input case always fall through to "default" and hit a "continue", making it not actually move any load_interpolated_input intrinsics at all. It would only move the simple load_barycentric_* intrinsics, which don't emit any code anyway, making it basically useless. The initial version I sent of the pass worked, but I apparently failed to verify that the simplified version in v2 actually worked. With the obvious fix applied (so we actually tried to move load_interpolated_input intrinsics), I discovered a second bug: we weren't moving the offset SSA def to the top, breaking SSA validation. The new version of the pass actually moves load_interpolated_input intrinsics and all their dependencies, as intended. Papers over GPU hangs on Ivybridge and Baytrail caused by the recent NIR FS input rework by restoring the old behavior. (I'm not honestly sure why they hang with PLN not at the top.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-29 16:05:24 -07:00
Rob Clark	591eeb7d1c	freedreno: limit non-user constant buffers to a4xx Seems to mostly work on a3xx. Except when it doesn't and kills gpu quite badly. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-29 14:58:39 -04:00
Jan Ziak	427771d1c7	glsl: fix uninitialized instance variable Valgrind detected that variable ir_copy_propagation_visitor::killed_all is uninitialized. Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-29 14:57:51 -04:00
Jan Ziak	b107169eef	configure: add support for LLVM 4.0.0svn static libs Signed-off-by: Jan Ziak (http://atom-symbol.net) <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-07-29 16:24:03 +09:00
Rob Herring	a235765d27	virgl: add exported dmabuf to BO hash table Exported dmabufs can get imported by the same process, but the handle was not getting added to the hash table on export. Add the handle to the hash table on export. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-29 09:09:56 +10:00
Anuj Phogat	6d958c7c16	anv: Enable per sample shading on gen8+ Vulkan CTS test results on gen9: ./deqp-vk --deqp-case=dEQP-VK.pipeline.multisample.min_sample_shading* Test run totals: Passed: 60/90 (66.7%) Failed: 0/90 (0.0%) Not supported: 30/90 (33.3%) Warnings: 0/90 (0.0%) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-28 13:11:12 -07:00
Anuj Phogat	0f94cdc976	anv/pipeline: Fix setting per sample shading in pixel shader We should use the persample_dispatch variable in prog_data. Fixes all (~60) the DEQP sample shading tests. Many tests exited with VK_ERROR_OUT_OF_DEVICE_MEMORY without this patch. V2: Use the shader key bits set in brw_compile_fs (Jason) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-28 13:11:12 -07:00
Nicolas Boichat	9ee683f877	egl/dri2: Add reference count for dri2_egl_display android.opengl.cts.WrapperTest#testGetIntegerv1 CTS test calls eglTerminate, followed by eglReleaseThread. A similar case is observed in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=69622, where the test calls eglTerminate, then eglMakeCurrent(dpy, NULL, NULL, NULL). With the current code, dri2_dpy structure is freed on eglTerminate call, so the display is not initialized when eglReleaseThread calls MakeCurrent with NULL parameters, to unbind the context, which causes a segfault in drv->API.MakeCurrent (dri2_make_current), either in glFlush or in a later call. eglTerminate specifies that "If contexts or surfaces associated with display is current to any thread, they are not released until they are no longer current as a result of eglMakeCurrent." However, to properly free the current context/surface (i.e., call glFlush, unbindContext, driDestroyContext), we still need the display vtbl (and possibly an active dri dpy connection). Therefore, we add some reference counter to dri2_egl_display, to make sure the structure is kept allocated as long as it is required. One drawback of this is that eglInitialize may not completely reinitialize the display (if eglTerminate was called with a current context), however, this seems to meet the EGL spec quite well, and does not permanently leak any context/display even for incorrectly written apps. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-28 14:08:25 +01:00
Emil Velikov	8431c0e9d4	vc4: automake: remove vc4_drm.h from the sources lists The file was removed with earlier commit breaking 'make dist'. Drop it from Makefile.sources since it's no longer around. Fixes: `16985eb308` ("vc4: Switch to using the libdrm-provided vc4_drm.h.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-28 14:08:24 +01:00
Nicolai Hähnle	bade0cd0fb	ddebug: use pclose to close a popen()'d FILE Found by Coverity. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-28 10:47:51 +01:00
Nicolai Hähnle	21556d86fc	glsl: fix optimization of discard nested multiple levels The order of optimizations can lead to the conditional discard optimization being applied twice to the same discard statement. In this case, we must ensure that both conditions are applied. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96762 Cc: mesa-stable@lists.freedesktop.org Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-28 10:47:04 +01:00
Nicolai Hähnle	185b0c15ab	st_glsl_to_tgsi: only skip over slots of an input array that are present When an application declares varying arrays but does not actually do any indirect indexing, some array indices may end up unused in the consuming shader, so the number of input slots that correspond to the array ends up less than the array_size. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-28 10:46:02 +01:00
Dieter Nützel	041b330a32	clover: make GCC 4.8 happy Without this GCC 4.8.x throws below error: error: invalid initialization of non-const reference of type 'clover::llvm::compat::raw_ostream_to_emit_file {aka llvm::raw_svector_ostream&}' from an rvalue of type '<brace-enclosed initializer list>' v2: change commit title and add error message like Eric Engestrom requested Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97019 [ Francisco Jerez: Trivial formatting fix. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-27 20:41:05 -07:00
Timothy Arceri	a86aa87342	i965: remove unnecessary null check We would have hit a segfault already if this could be null. Fixes Coverity warning spotted by Matt. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-28 11:05:57 +10:00
Timothy Arceri	29d70cc964	glsl: free hash tables earlier These are only used by get_matching_input() which has been called at this point so free the hash tables. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-28 08:05:04 +10:00
Samuel Pitoiset	af08cfc626	nvc0: enable ARB_tessellation_shader on GM107+ This exposes OpenGL 4.1 on Maxwell (tested on GM107 and GM206). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:19:07 +02:00
Samuel Pitoiset	3ac373df6e	gm107/ir: add a legalize SSA pass for PFETCH PFETCH, actually ISBERD on GM107+ ISA only accepts a GPR for src0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:18:58 +02:00
Samuel Pitoiset	653af07119	nvc0: fix up TCP header on GM107+ The number of outputs patch (limited to 255) has moved in the TCP header, but blob seems to also set the old position. Also, the high 8-bits are now located in between the min/max parallel output read address at position 20. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-27 23:18:41 +02:00
Mathias Fröhlich	2060f19b4f	vbo: Fix handling of POS/GENERIC0 attributes. In case of split primitives we need to restore the original setting of the vtx.attrsz array to make immediate mode attribute array tracking work. v2: Use bool instead of boolean. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96950	2016-07-27 06:43:03 +02:00
Marek Olšák	c98c732158	radeon/llvm: Use alloca instructions for larger arrays [revert a revert] This reverts commit `f84e9d749f`. Bioshock Infinite no longer hangs.	2016-07-26 23:31:56 +02:00
Marek Olšák	8636a718b5	r600g: add support for B5G6R5 PBO uploads via texture buffers (v2) v2: set endian swap to 16 untested	2016-07-26 23:21:45 +02:00
Marek Olšák	1e5f00f9d5	radeonsi: pre-generate shader logs for ddebug This cuts down the overhead of si_dump_shader when ddebug is capturing shader logs, which is done for every draw call unconditionally (that's quite a lot of work for a draw call). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	18475aab6d	radeonsi: add empty lines after shader stats to separate individual shaders dumped consecutively. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	dd66f9d3e7	radeonsi: move the shader key dumping to si_shader_dump Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	b47727a83a	ddebug: implement pipelined hang detection mode For good performance while being able to generate decent hang reports. The report doesn't contain the parsed IB and the buffer list, but it isolates the draw call and dumps shaders while not having to flush the context. This is for GPU hangs that are harder to reproduce and require interactive playing for minutes or even hours. dd_pipe.h explains some implementation details. Initializing, copying (recording) and clearing states is most of the code. The performance should be at least 50% of the normal performance depending on the circumstances. (i.e. 50% is expected to be the worst case scenario, not the best case) The majority of time is spent in dump_debug_state(PIPE_DUMP_CURRENT_SHADERS) and that's after all the optimizations in later patches. There is no obvious way to optimize that further. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	0795a3d54f	ddebug: don't save pointers to call parameters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	e4079677a7	ddebug: move dd_call into dd_pipe.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	d50f9e9b04	ddebug: separate draw call dumping logic Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	95c3025a41	ddebug: move all states into a separate structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	f7720948cc	ddebug: write contents of dmesg into hang reports Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	1f85f17998	ddebug: implement create_batch_query Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	6b9924ccb6	ddebug: don't use abort() We don't want a core dump. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	26ef8158ac	ddebug: make dd_get_file_stream accept the screen only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	27fa933a71	ddebug: clean up ddebug_screen_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Marek Olšák	6bf81de339	gallium: rework flags for pipe_context::dump_debug_state The pipelined hang detection mode will not want to dump everything. (and it's also time consuming) It will only dump shaders after a draw call and then dump the status registers separately if a hang is detected. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-26 23:06:46 +02:00
Rob Herring	9ace2c1355	vc4: add hash table look-up for exported dmabufs It is necessary to reuse existing BOs when dmabufs are imported. There are 2 cases that need to be handled. dmabufs can be created/exported and imported by the same process and can be imported multiple times. Copying other drivers, add a hash table to track exported BOs so the BOs get reused. v2: Whitespace fixup (by anholt) Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-26 13:47:50 -07:00
Eric Anholt	ce8504d196	vc4: Disable early Z with computed depth. We don't tell the hardware whether we're computing depth, so we need to manage early Z state manually. Fixes piglit early-z.	2016-07-26 13:47:50 -07:00
Eric Anholt	4d0b2c7aaa	ttn: Update shader->info as we generate code. We could use the nir_shader_gather_info() pass to update it after the fact, but this is what glsl_to_nir and prog_to_nir do. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2016-07-26 13:47:50 -07:00
Vedran Miletić	7b9a0f4e38	mesa: standardize naming Mesa3D, MESA -> Mesa Signed-off-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-26 13:28:01 -07:00
Kenneth Graunke	95c48391ee	mesa: Make MESA_SHADER_CAPTURE_PATH skip shaders with Name == -1. Shaders with shProg->Name == ~0 (aka 4294967295) are internal meta shaders that we don't really want to capture. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-26 13:27:09 -07:00
Matt Turner	20553e4a2d	mesa: Use AC_HEADER_MAJOR to include correct header for major(). Gentoo has been smoke testing an upcoming change to glibc. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=580392	2016-07-26 12:12:41 -07:00
Matt Turner	815135166c	glsl: Remove references to tail_pred.	2016-07-26 12:12:27 -07:00
Matt Turner	5ed3299822	glx: Avoid aliasing violations. Compilers are perfectly capable of generating efficient code for calls like these to memcpy(). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	2a1d2874f1	mesa: Avoid aliasing violation in uniform_query.cpp. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	f5ac1d366e	mesa: Avoid aliasing violation in FXT1. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	a1e9b72102	swrast: Avoid aliasing violation. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	149309a424	glsl: Avoid aliasing violations. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Matt Turner	d1f6f65697	glsl: Separate overlapping sentinel nodes in exec_list. I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing. No difference in OglBatch7 (n=20). Co-authored-by: Davin McCall <davmac@davmac.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-26 12:12:27 -07:00
Jason Ekstrand	5d76690f17	i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-26 07:58:44 -07:00
Jason Ekstrand	833e389bc0	i965/miptree/isl: Stop multiplying depth by 6 for cubes Now that the logical_depth0 field is in number of 2D slices, we don't need to be multiplying by 6 when creating the surface. It wasn't hurting anything primarily because we get the actual length from the view which was already handling it correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-26 07:58:44 -07:00
Jason Ekstrand	d16dc8e963	i965/blorp/gen8: Stop multiplying depth by 6 for cubes intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need for us to multiply by 6 when we go to fill out a blorp surface state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-26 07:58:44 -07:00
Samuel Pitoiset	126bd15940	nvc0: use nvc0_m2mf_push_linear() to reduce code duplication Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-26 00:50:34 +02:00
Samuel Pitoiset	c5236f0ecc	nvc0: use nve4_p2mf_push_linear() to reduce code duplication Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-26 00:40:37 +02:00
Andreas Boll	0420666ac0	build: Remove unused AX_CHECK_COMPILE_FLAG macro Unused since `1a6ae84041` Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-25 15:14:12 +02:00
Nils Wallménius	a354c389f5	main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage When possible, do the memcpy on larger blocks. This reduces cycles spent in _mesa_propagate_uniforms_to_driver_storage from 1.51% to 0.62% according to perf during the Unigine Heaven benchmark. It did not affect the framerate of the benchmark. The system used for testing was an i5 6600K with a Radeon R9 380. Piglit hangs randomly on this system both with and without the patch so I could not make a comparison. v2: fixed whitespace Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-25 13:51:16 +02:00
Boyuan Zhang	dd208ea006	st/va: enable h264 VAAPI encode Enable H.264 VAAPI encoding through config. Currently only H.264 baseline is supported. Encode entrypoint is not accepted by driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:54 +02:00
Boyuan Zhang	71da1354d7	st/va: add function to handle misc param type frame rate Frame rate can be passed to driver either through VAEncSequenceParameterBufferType or VAEncMiscParameterTypeFrameRate. The previous code only implements the former, which is used by Gstreamer-Vaapi. Now adding implementation for VAEncMiscParameterTypeFrameRate. Also adding default frame rate as 30 just in case application never provides frame rate information to driver. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:53 +02:00
Boyuan Zhang	10dec2de2d	st/va: add environmental variable to disable interlace Add environmental variable to disable interlace mode. At VAAPI decoding stage, driver can not distinguish between pure decoding case and transcoding case. And since interlace encoding is not supported, we have to disable interlace for transcoding case. The temporary solution is to use environmental variable to disable interlace mode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:53 +02:00
Boyuan Zhang	b0ceb4cc48	st/va: add preset values for VAAPI encode Add some hardcoded values hardware needs mainly for rate control purpose. With previously hardcoded values for OMX, the rate control result is not correct. This change fixed the rate control result by setting correct values for Vaapi. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:52 +02:00
Boyuan Zhang	85d807f2e0	st/va: add functions for VAAPI encode Add necessary functions/changes for VAAPI encoding to buffer and picture. These changes will allow driver to handle all Vaapi encode related operations. This patch doesn't change the Vaapi decode behaviour. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-07-25 13:39:52 +02:00
Boyuan Zhang	10c1cc47a6	st/va: get rate control method from configattrib v2 Rate control method is passed from app to driver through config attrib list. That is why we need to store this rate control method to config. And later on, we will pass this value to context->desc.h264enc.rate_ctrl.rate_ctrl_method. v2 (chk): fix broken build and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:51 +02:00
Boyuan Zhang	34f4634843	st/va: add conversion for yv12 to nv12 in putimage v2 For putimage call, if image format is yv12 (or IYUV with U V field swap) and surface format is nv12, then we need to convert yv12 to nv12 and then copy the converted data from image to surface. We can't use the existing logic where surface is destroyed and re-created with yv12 format. v2 (chk): fix some compiler warnings and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:51 +02:00
Boyuan Zhang	23b4ab1738	vl/util: add copy func for yv12image to nv12surface v2 Add function to copy from yv12 image to nv12 surface for VAAPI putimage call. We need this function in VaPutImage call when copying from yv12 image to nv12 surface for encoding. Existing function can't be used because it only works for copying from yv12 surface to nv12 image in Vaapi. v2: cleanup variable types and commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:39:18 +02:00
Boyuan Zhang	5bcaa1b9e9	st/va: add encode entrypoint v2 VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We will save this encode entry point in config. config_id was used as profile previously. Now, config has both profile and entrypoint field, and config_id is used to get the config object. Later on, we pass this entrypoint to context->templat.entrypoint instead of always hardcoded to PIPE_VIDEO_ENTRYPOINT_BITSTREAM for decoding case previously. Encode entrypoint is not accepted by driver until we enable Vaapi encode in later patch. v2 (chk): fix commit message to match 80 chars, use switch instead of ifs, fix memory leaks in the error path, implement vlVaQueryConfigEntrypoints as well, drop VAEntrypointEncPicture (only used for JPEG). Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2016-07-25 13:30:42 +02:00
Samuel Pitoiset	e7b2ce5fd8	nvc0: upload sample locations on GM20x This fixes a bunch of multisample piglit tests on GM206, like bin/arb_texture_multisample-texelfetch 2 -auto -fbo Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-24 22:46:26 +02:00
Rob Clark	2f57e57881	freedreno/a4xx: time-elapsed query should be active for clears Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-24 09:33:05 -04:00
Samuel Pitoiset	3a2e67bf78	nvc0/ir: fix up an assertion in emitUADD() It's illegal to have neg modifiers on both sources for OP_ADD, and it's illegal to have OP_SUB with just src0 neg. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-24 00:42:47 +02:00
Samuel Pitoiset	a159a3d5cb	nvc0: fix wrong indentation in nvc0_validate_fb() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-23 23:59:10 +02:00
Ilia Mirkin	e483cb9a3a	glsl: reuse main extension table to appropriately restrict extensions Previously we were only restricting based on ES/non-ES-ness and whether the overall enable bit had been flipped on. However we have been adding more fine-grained restrictions, such as based on compat profiles, as well as specific ES versions. Most of the time this doesn't matter, but it can create awkward situations and duplication of logic. Here we separate the main extension table into a separate object file, linked to the glsl compiler, which makes use of it with a custom function which takes the ES-ness of the shader into account (thus allowing desktop shaders to properly use ES extensions that would otherwise have been disallowed.) We can also now use this logic to generate #define's for all supported extensions automatically, removing the duplicate (and often inaccurate) list in glcpp. The effect of this change should be nil in most cases. However in some situations, extensions like GL_ARB_gpu_shader5 which were formerly available in compat contexts on the GLSL side of things will now become inaccessible. This regresses two ES CTS tests: ES3-CTS.shaders.shader_integer_mix.define ES31-CTS.shader_integer_mix.define however that is due to them using #version 100 instead of 300 es. As the extension is only defined for ES3, I believe this is the correct behavior. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v2) v2 -> v3: integrate glcpp defines into the same mechanism	2016-07-23 13:48:04 -04:00
Rob Clark	9253dcde58	freedreno/a4xx: timestamp queries Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Rob Clark	b888d8e937	freedreno: hw timestamp support If the kernel supports it, use hw counter for timestamps. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Rob Clark	6a4b052820	freedreno: prep work for timestamp queries We need "NULL" state to be a valid bit in the bitmask, because timestamp queries are not restricted to draw/etc stages (ie. the only commands to submit may just be to read the timestamp). And just because there are no draws, isn't a reason to skip the flush and return zero. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 13:39:30 -04:00
Nicolai Hähnle	3d69357da9	radeonsi: ensure sample locations are set for line and polygon smoothing Since commit `d938b8c`, the sample locations are no longer set unconditionally, so we need to set the atom to dirty on all chips, not just Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-23 15:36:39 +02:00
Nicolai Hähnle	f755da0f2f	radeonsi: fix Polaris MSAA regression The regression was introduced by commit `d938b8c`. The problem here is that in order to use the small primitive filter, we need to explicitly set the sample locations to 0. But the DB doesn't properly process the change of sample locations without a flush, and so we can end up with incorrect Z values. Instead of doing a flush, just disable the small primitive filter when MSAA is force-disabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96908 Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-23 15:36:38 +02:00
francians@gmail.com	abb2a865a4	freedreno/ir3: Add missing braces in initializer Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 09:14:55 -04:00
francians@gmail.com	c99cdd2175	freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning (v2) v2: no need for break after an unreachable (Matt Turner) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-23 09:14:18 -04:00
Marek Olšák	700de07771	radeonsi: implement buffer_subdata without indirect calls There is less noise in CPU profile data now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	8e3e9d2839	gallium/util: don't modify usage in pipe_buffer_write All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-23 13:33:42 +02:00
Marek Olšák	1ffe77e7bb	gallium: split transfer_inline_write into buffer and texture callbacks to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-07-23 13:33:42 +02:00
Kenneth Graunke	0ba7288376	nir: Lower interp_var_at_* like a normal load_var for flat inputs. "flat centroid" and "flat sample" both just mean "flat", so we should ignore interpolateAtCentroid/Sample and just return the flat value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97032 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-22 20:31:20 -07:00
Kenneth Graunke	f80bea2d80	mesa: Don't call GenerateMipmap if Width or Height == 0. One of the WebGL 2.0 conformance tests is trying to call glGenerateMipmaps with a width and height of 0. With the meta implementation, this generates a "framebuffer attachment incomplete" status, and falls back to the CPU path, calling MapTextureImage. Except that there's no actual texture to map, and we assert fail. There's no work to do in this case. The test expects it to succeed, so just return early with no error and avoid hassling the driver. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96911 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-22 20:31:20 -07:00
Jason Ekstrand	b33bccb519	anv/pipeline: Set up point coord enables Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	9e05e51cff	spirv/nir: Add support for ImageQuerySamples Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	71202352c8	spirv/nir: Handle texture projectors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	36c31b8fa2	nir/spirv: Refactor coordinate handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	b820c8b78c	spirv/nir: Refactor type handling in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	561be50a1a	spirv/nir: Move opcode selection higher up in handle_texture Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	c8da91aa24	anv/image: Assert that the image format is actually supported Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	34a39e91ba	spirv/nir: Don't increment coord_components for array lod queries For lod query instructions, we really don't care whether or not the sampler is an array type because that doesn't factor into the LOD. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	67b7d876e4	i965: Get rid of the do_lower_unnormalized_offsets pass We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	9f32721f86	i965/nir: Enable NIR lowering of txf and rect offsets This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:54 -07:00
Jason Ekstrand	d9156efc52	nir/lower_tex: Add support for lowering coordinate offsets On i965, we can't support coordinate offsets for texelFetch or rectangle textures. Previously, we were doing this with a GLSL pass but we need to do it in NIR if we want those workarounds for SPIR-V. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:53 -07:00
Jason Ekstrand	843fc8f3e7	nir/lower_tex: Add some helpers for working with tex sources Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:48:53 -07:00
Jason Ekstrand	09135cd55a	nir: Add a helper for determining the type of a texture source Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	3c0077a6ec	anv/pipeline: Set binding_table.gather_texture_start This should get texture gather working on gen8+ and mostly working on gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	95e9d58bdb	spirv/nir: Properly handle gather components Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	7c7acf53b2	spirv/nir: Add support for shadow samplers that return vec4 While SPIR-V technically doesn't support "old style" shadow, the shadow-compare gather instruction does return a vec4 so we need to be able to set the old_style_shadow bit in NIR. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Jason Ekstrand	2ddefd03b7	spirv/nir: Fix some texture opcode asserts We can't get an lod with txf_ms and SPIR-V considers textureGrad to be an explicit-LOD texturing instruction. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-dev@lists.freedesktop.org>	2016-07-22 16:27:35 -07:00
Samuel Pitoiset	3f5cf8c488	nv50/ir: allow to swap sources for OP_SUB This allows the load-propagation pass to swap the sources in presence of immediate values. Maxwell (GM107): total instructions in shared programs :1928187 -> 1927634 (-0.03%) total gprs used in shared programs :330741 -> 330154 (-0.18%) total local used in shared programs :28032 -> 28032 (0.00%) local gpr inst bytes helped 0 271 425 425 hurt 0 0 194 194 Fermi (GF114): total instructions in shared programs :2334474 -> 2333829 (-0.03%) total gprs used in shared programs :380934 -> 380215 (-0.19%) total local used in shared programs :33304 -> 33264 (-0.12%) local gpr inst bytes helped 5 314 521 521 hurt 0 4 195 195 No regressions on GM107 and GF114 with full piglit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 22:51:37 +02:00
Marek Olšák	2e890b5350	gallium/radeon: make deferred flushes asynchronous Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-22 22:34:49 +02:00
Marek Olšák	d17b35e671	gallium: add PIPE_FLUSH_DEFERRED There are 2 uses: - Asynchronous flushing for multithreaded drivers. - Return a fence without flushing (mid-command-buffer fence). The driver can defer flushing until fence_finish is called. This is required to make Bioshock Infinite faster, which creates 1000 fences (flushes) per frame. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-07-22 22:34:49 +02:00
Marek Olšák	4cdc482283	gallium/os: use CLOCK_MONOTONIC for sleeps (v2) v2: handle EINTR, remove backslashes Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-22 22:34:49 +02:00
Eric Engestrom	4da9f7e7ce	mapi: fix typo in macro name Fixes: `5ec140c17b` ("mapi: Massage code to allow clang to compile.") Reported-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-22 10:14:00 -07:00
Kenneth Graunke	44ef2ce6ec	docs: Put swr back on the GL_ARB_texture_buffer_object_rgb32 list. Looks like this was lost when resolving merge conflicts in commit `d1fbd4cdb1`.	2016-07-22 09:57:54 -07:00
Andres Gomez	d068b38e46	glsl: subroutine types cannot be compared subroutine variables are to be used just in the way functions are called. Although the spec doesn't say it explicitly, this means that these variables are not to be used in any other way than those left for function calls. Therefore, a comparison between 2 subroutine variables should also cause a compilation error. From The OpenGL® Shading Language 4.40, page 117: " To use subroutines, a subroutine type is declared, one or more functions are associated with that subroutine type, and a subroutine variable of that type is declared. The function currently assigned to the variable function is then called by using function calling syntax replacing a function name with the name of the subroutine variable. Subroutine variables are uniforms, and are assigned to specific functions only through commands (UniformSubroutinesuiv) in the OpenGL API." From The OpenGL® Shading Language 4.40, page 118: " Subroutine uniform variables are called the same way functions are called. When a subroutine variable (or an element of a subroutine variable array) is associated with a particular function, all function calls through that variable will call that particular function." Fixes GL44-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-22 17:30:25 +03:00
Timothy Arceri	a2b3c146d2	i965: fix varying output setup Since `7f53fead5c` we treat every location as using all four components so we only need special handling for doubles when they cross multiple locations. This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location, i.e. 30. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-23 00:04:10 +10:00
Samuel Pitoiset	c2801f9272	nvc0/mme: fix offsets used for indirect draws This fixes a regression introduced in `1da704a94c` because the offset has moved from 0x180 to 0x1a0, and the macros have to be re-compiled. Fixes: `1da704a` ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 11:32:09 +02:00
Samuel Pitoiset	dbcff7fdbb	nvc0: fix offsets of MP perf counters input parameters This fixes a regression introduced in `1da704a94c` because the offset has moved from 0x600 to 0x620, and the kernels used for reading MP perf counters have to be re-assembled. This also fixes amd_performance_monitor_measure piglit. Fixes: `1da704a` ("nvc0: increase the tex handles area size in the driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-22 11:32:04 +02:00
Kenneth Graunke	cb70773129	mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats. The GL_EXT_texture_format_BGRA8888 extension specification defines a GL_BGRA_EXT unsized internal format (which is a little odd - usually BGRA is a pixel transfer format). The extension is written against the ES 1.0 specification, so it's a little hard to map, but I believe it's effectively adding it to the table used here, so we should allow it here as well. Note that GL_EXT_texture_format_BGRA8888 is always enabled (dummy_true), so we don't need to check if it's enabled here. This fixes mipmap generation in Skia and ChromeOS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> References: https://bugs.chromium.org/p/chromium/issues/detail?id=630371 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Stéphane Marchesin <marcheu@chromium.org> Cc: mesa-stable@lists.freedesktop.org	2016-07-21 21:31:57 -07:00
Kenneth Graunke	be1c53d2cf	i965: Fix "operation operation" in comment. From the redundant redundant department. Reported-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>	2016-07-21 21:31:57 -07:00
Kenneth Graunke	76e161056a	i965: Fix shared atomic intrinsics to pay attention to base. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-21 21:31:55 -07:00
Kenneth Graunke	cf6f2d3ce7	nir: Add a base const_index to shared atomic intrinsics. Commit `52e75dcb8c` made nir_lower_io start using nir_intrinsic_set_base instead of writing const_index[0] directly. However, those intrinsics apparently don't /have/ a base, so this caused assert failures. However, the old code was happily setting non-existent const_index fields, so it was pretty bogus too. Jason pointed out that load_shared and store_shared have a base, and that the i965 driver uses that field. So presumably atomics should have one as well, so that loads/stores/atomics all refer to variables with consistent addressing. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-21 21:31:41 -07:00
Timothy Arceri	91dde3ddca	glsl: re-enable varying packing in GL4.4+ We can still do packing we just need to get the packing type from the consumer rather than the producer. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97033	2016-07-22 10:21:08 +10:00
Kenneth Graunke	2db357e4c3	i965: Include VUE handles for GS with invocations > 1. We always resort to the pull model for instanced GS inputs. So, we'd better include the VUE handles, or else we can't actually pull anything. Ian reports that on his branch with OES_geometry_shader enabled, this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:: - instanced.draw_2_instances_geometry_2_invocations - instanced.draw_2_instances_geometry_8_invocations - instanced.draw_4_instances_geometry_2_invocations - instanced.draw_4_instances_geometry_8_invocations - instanced.draw_8_instances_geometry_2_invocations - instanced.draw_8_instances_geometry_8_invocations - instanced.geometry_2_invocations - instanced.geometry_32_invocations - instanced.geometry_8_invocations - instanced.geometry_max_invocations - instanced.geometry_output_different_2_invocations - instanced.geometry_output_different_32_invocations - instanced.geometry_output_different_8_invocations - instanced.geometry_output_different_max_invocations - instanced.invocation_output_vary_by_attribute - instanced.invocation_output_vary_by_texture - instanced.invocation_output_vary_by_uniform - query.primitives_generated_instanced Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 11:15:12 -07:00
Matt Turner	8c8c3f859e	mesa: Add -fno-math-errno -fno-trapping-math to CXXFLAGS. Not sure why I forgot to add them to CXXFLAGS in commit `f55c408067` or commit `875458b778`. Cuts about 1k of .text. text data bss dec hex filename 5806354 287816 29384 6123554 5d7022 i965_dri.so before 5805497 287744 29384 6122625 5d6c81 i965_dri.so after Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-21 10:45:28 -07:00
Matt Turner	5353855e9d	mesa: Drop -fno-builtin-memcmp. According to the referenced bug report, gcc-4.5 and newer do not inline memcmp(). I see no difference in performance of ipers with llvmpipe on a Sandybridge (which does not have "Enhanced REP MOVSB/STOSB") by removing this flag. I attempted to confirm the problem with gcc-4.4, but it fails to compile for quite a few different reasons. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:45:28 -07:00
Matt Turner	5ec140c17b	mapi: Massage code to allow clang to compile. According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code was violating the spec, resulting in it failing to compile. Cc: mesa-stable@lists.freedesktop.org Co-authored-by: Tomasz Paweł Gajc <tpgxyz@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89599 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-21 10:45:28 -07:00
Ian Romanick	6bc5491193	docs: Add extensions not part of any GL or GL ES version Based loosely on patches submitted ages ago by Thomas Helland. v2: Add lots of missing data provided by Ilia. Fix sort order of GL_ARB_sparse_texture extensions suggested by Ilia. v3: Note that Dave Airlie has started work on GL_ARB_bindless_texture. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:31:04 -07:00
Ian Romanick	d1fbd4cdb1	docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:30:20 -07:00
Ian Romanick	7dc99da81a	docs: Update GL3.txt for OpenGL ES on i965-ish hardware Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-21 10:26:55 -07:00
Timothy Arceri	4f89cf4941	i965: print error messages if gs fails to compile We do this for all other stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 15:05:05 +10:00
Timothy Arceri	b463b1d7cc	i965: enable GL4.4 for Gen8+ Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 12:06:11 +10:00
Timothy Arceri	4ba9bd138a	i965: enable ARB_enhanced_layouts for gen6+ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	f3805c5f09	i965/vec4: add packing support for tcs load outputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	255388a965	i965/vec4: add support for packing tes inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d07cfb31c4	i965/vec4: add support for packing tcs outputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	b25e49a3c7	i965/vec4: support packing tcs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d1192bef7e	i965/vec4: add component packing for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	d1b1fca0b7	i965/vec4: add support for packing vs/gs/tes outputs Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 12:06:11 +10:00
Timothy Arceri	b427abba0c	i965/vec4: add support for packing inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	138aad06b3	i965: add helper for creating packing writemask For example where n=3 first_component=1 this will give us 0xE (WRITEMASK_YZW). V2: Add assert to check first component is <= 4 (Suggested by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Timothy Arceri	4b57b53f85	i965: add helpers for creating component layout swizzle This will be used to swizzle components to the beginning or end of the vector based on the component layout qualifier and whether we are doing a load or store. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 12:06:11 +10:00
Eric Anholt	d2b4b16589	vc4: Return V3D version details in the GL renderer info. This is as close as we get to a name for the 3D blocks.	2016-07-20 16:15:15 -07:00
Eric Anholt	d81934cded	vc4: Check the V3D version reported by the kernel. We don't want to bring up an old userspace driver on a kernel for newer hardware. We'll also want to look at the other ident fields in the future.	2016-07-20 16:15:15 -07:00
Eric Anholt	83b8ca58e1	vc4: Detect and report kernel support for branching.	2016-07-20 16:15:15 -07:00
Eric Anholt	16985eb308	vc4: Switch to using the libdrm-provided vc4_drm.h. The required version is set to .69 for the getparam ioctl that will be used in the next commit.	2016-07-20 16:15:15 -07:00
Timothy Arceri	3d8c29ed32	docs: mark ARB_enhanced_layouts as DONE for i965 Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 09:10:53 +10:00
Timothy Arceri	d99a040bbf	i965: enable ARB_enhanced_layouts for gen8+ Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-21 09:10:53 +10:00
Timothy Arceri	cba6657d8b	nir: add doubles component packing support This makes sure we give the correct driver location for doubles when using component packing. Specifically it handles packing a dvec3 with a double which is the only packing scenario allowed which spans across two locations. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-21 09:10:53 +10:00
Timothy Arceri	ad5dd39984	i965: add component packing support for load_output intrinsics Here we use the component qualifier (which is the first component) as an offset when loading output varyings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Timothy Arceri	7f53fead5c	i965: enable component packing for vs and fs Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Timothy Arceri	09e46f99ad	i965: bring back type_size_vec4_times_4() We will use this for output varyings. To make component packing simpler we will just treat all varyings as vec4s. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-21 09:10:53 +10:00
Jason Ekstrand	9d503aea06	nir/inline: Constant-initialize local variables in the callee if needed Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-20 15:29:55 -07:00
Jason Ekstrand	dc9f2436c3	nir: Add a nir_deref_foreach_leaf helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-20 15:29:55 -07:00
Tom Stellard	106946153f	clover: Re-order includes in invocation.cpp to fix build The build was failing because the official CL headers have a few defines, like: # define cl_khr_gl_sharing 1 Which have the same name as some class members of clang's OpenCLOptions class. If we include the cl headers first, this breaks the build because the member names of this class are replaced by the literal 1. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-20 21:15:53 +00:00
Tom Stellard	a73bf11a63	clover: Add missing include v2 clang commit r275822 removed unnecessary includes from header files, so we now need to explicitly include clang/Lex/PreprocessorOptions.h v2: - Use <> instead of "" for the include path. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-20 21:15:53 +00:00
Kenneth Graunke	3dba8516d6	i965: Move VS load_input handling to nir_emit_vs_intrinsic(). TCS/TES/GS and now FS all handle these in stage-specific functions. CS don't have inputs, so VS was the only one left using this code. Move it to the VS-specific function for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:26 -07:00
Kenneth Graunke	1608209952	i965: Delete the FS_OPCODE_INTERPOLATE_AT_CENTROID virtual opcode. We no longer use this message. As far as I can tell, it's fairly useless - the equivalent information is provided in the payload. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:24 -07:00
Kenneth Graunke	1eef0b73aa	i965: Rewrite FS input handling to use the new NIR intrinsics. This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:16 -07:00
Kenneth Graunke	a2dc11a781	i965: Move load_interpolated_input/barycentric_* intrinsics to the top. Currently, i965 interpolates all FS inputs at the top of the program. This has advantages and disadvantages, but I'd like to keep that policy while reworking this code. We can consider changing it independently. The next patch will make the compiler generate PLN instructions "on the fly", when it encounters an input load intrinsic, rather than doing it for all inputs at the start of the program. To emulate this behavior, we introduce an ugly pass to move all NIR load_interpolated_input and payload-based (not interpolator message) load_barycentric_* intrinsics to the shader's start block. This helps avoid regressions in shader-db for cases such as: if (...) { ...load some input... } else { ...load that same input... } which CSE can't handle, because there's no dominance relationship between the two loads. Because the start block dominates all others, we can CSE all inputs and emit PLNs exactly once, as we did before. Ideally, global value numbering would eliminate these redundant loads, while not forcing them all the way to the start block. When that lands, we should consider dropping this hacky pass. Again, this pass currently does nothing, as i965 doesn't generate these intrinsics yet. But it will shortly, and I figured I'd separate this code as it's relatively self-contained. v2: Dramatically simplify pass - instead of creating new instructions, just remove/re-insert their list nodes (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:11 -07:00
Kenneth Graunke	048a56c1fc	i965: Add a pass to demote sample interpolation intrinsics. When working with a non-multisampled render target, asking for "sample" interpolation locations doesn't make sense. We demote them to centroid. In a couple of patches, brw_compute_barycentric_modes will begin looking at these intrinsics to determine the barycentric modes. fs_visitor also will use them to code-generate pixel interpolator messages or payload references. Handling the "but what if it's not MSAA?" logic ahead of time in a NIR pass simplifies things and prevents duplicated logic. This patch doesn't actually do anything useful yet as we don't generate these intrinsics. I decided to keep it separate as it's self-contained, in the hopes of shrinking the "convert everything" patch for reviewers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:08 -07:00
Kenneth Graunke	707ca00fce	nir: Add nir_load_interpolated_input lowering code. Now nir_lower_io can optionally produce load_interpolated_input and load_barycentric_* intrinsics for fragment shader inputs. flat inputs continue using regular load_input. v2: Use a nir_shader_compiler_options flag rather than ad-hoc boolean passing (in response to review feedback from Chris Forbes). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:01:00 -07:00
Kenneth Graunke	2496462479	nir: Add new intrinsics for fragment shader input interpolation. Backends can normally handle shader inputs solely by looking at load_input intrinsics, and ignore the nir_variables in nir->inputs. One exception is fragment shader inputs. load_input doesn't capture the necessary interpolation information - flat, smooth, noperspective mode, and centroid, sample, or pixel for the location. This means that backends have to interpolate based on the nir_variables, then associate those with the load_input intrinsics (say, by storing a map of which variables are at which locations). With GL_ARB_enhanced_layouts, we're going to have multiple varyings packed into a single vec4 location. The intrinsics make this easy: simply load N components from location <loc, component>. However, working with variables and correlating the two is very awkward; we'd much rather have intrinsics capture all the necessary information. Fragment shader input interpolation typically works by producing a set of barycentric coordinates, then using those to do a linear interpolation between the values at the triangle's corners. We represent this by introducing five new load_barycentric_* intrinsics: - load_barycentric_pixel (ordinary variable) - load_barycentric_centroid (centroid qualified variable) - load_barycentric_sample (sample qualified variable) - load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample()) - load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset()) Each of these take the interpolation mode (smooth or noperspective only) as a const_index, and produce a vec2. The last two also take a sample or offset source. We then introduce a new load_interpolated_input intrinsic, which is like a normal load_input intrinsic, but with an additional barycentric coordinate source. The intention is that flat inputs will still use regular load_input intrinsics. This makes them distinguishable from normal inputs that need fancy interpolation, while also providing all the necessary data. This nicely unifies regular inputs and interpolateAt functions. Qualifiers and variables become irrelevant; there are just load_barycentric intrinsics that determine the interpolation. v2: Document the interp_mode const_index value, define a new BARYCENTRIC() helper rather than using SYSTEM_VALUE() for some of them (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 11:00:45 -07:00
Kenneth Graunke	e614062e54	anv: Properly call gen75_emit_state_base_address on Haswell. This should fix MOCS values. Caught by Coverity. CID: 1364155 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	87660579f5	genxml: Rename "API Rendering Disable" to "Rendering Disable". Gen7/7.5 call it "Rendering Disable" while Gen8/9 prefix it with "API". Pick one for consistency, and so we can share code between generations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	bfd9942cdc	anv: Unify 3DSTATE_CLIP code across generations. The bulk of this is the same. There are just a couple fields that only exist on one generation or another, and we can easily handle those with an #ifdef. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	44502afd82	anv: Enable early culling on Gen7. We set the cull mode, but forgot the enable bit. Gen8 uses this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	0d77f08042	anv: Fix near plane clipping on Gen7/7.5. The Gen7/7.5 clip code used APIMODE_OGL, while the Gen8+ clip code used APIMODE_D3D. The meaning hasn't changed, so one of these must be wrong. It appears that the hardware documentation is completely wrong. It claims that the "API Mode" bit means: 0h APIMODE_OGL NEAR_VP boundary == 0.0 (NDC) 1h APIMODE_D3D NEAR_VP boundary == -1.0 (NDC) However, DirectX typically uses 0.0 for the near plane, while unextended OpenGL uses -1.0. i965's gen6_clip_state.c uses APIMODE_D3D for the GL_ZERO_TO_ONE case, so I believe the meanings are backwards from what the documentation says. Section 23.2 ("Primitive Clipping") of the Vulkan 1.0.21 specification contains the following equations: -w_c <= x_c <= w_c -w_c <= y_c <= w_c 0 <= z_c <= w_c This means that Vulkan follows D3D semantics. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	6b67270262	genxml: Add APIMODE_D3D missing enum values and improve consistency. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Kenneth Graunke	c31cf532af	genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values. Gen6-7.5 use CLIPMODE_REJECT_ALL, while Gen8+ just used REJECT_ALL. Being consistent will let me unify code, and I prefer having the prefix. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-20 10:59:44 -07:00
Tim Rowley	0f13a8f770	swr: [rasterizer core] introduce simd16intrin.h Refactoring to leave existing simd_* intrinsics in "simdintrin.h" unchanged, adding corresponding simd16_* intrinsics in "simd16intrin.h" on the side, with emulation, that we can use piecemeal, rather than the all-or-nothing approach to bring up avx512. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	5fe361e2c0	swr: [rasterizer core] fix for possible int32 overflow condition Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	a123d12e14	swr: [rasterizer core] rename _MAX enum values to _COUNT Makes these names semantically correct. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	e41d9dd576	swr: [rasterizer core] centroid correction Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	e0529a4668	swr: [rasterizer core] support range of values in TemplateArgUnroller Fixes Linux warnings. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	0363015964	swr: [rasterizer core] ensure adjacent topologies use the cut-aware PA Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	efdaf5fa3e	swr: [rasterizer] attribute swizzling and linkage Add support for enhanced attribute swizzling. Currently supports constant source overrides to handle PrimitiveID support. No support yet for input select swizzling or wrap shortest. Removes obsoleted linkageMask and associated code. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	a5846fb75a	swr: [rasterizer common] icc declspec definitions Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:15 -05:00
Tim Rowley	0d13f2e801	swr: [rasterizer jitter] rework vertex/instance ID storage in fetch Moved the setting into the existing component control code. Fixes bad interaction between attribute/component setting for vertex/instance ID and component packing. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Tim Rowley	1d09b3971a	swr: [rasterizer core] avx512 simd utility work Enabling KNOB_SIMD_WIDTH = 16 for AVX512 pre-work and low level simd utils Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Tim Rowley	98641f4e73	swr: [rasterizer core] viewport rounding for disabled scissor Adjust viewport rounding when scissor rect is disabled during macro tile scissor setup. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-20 10:22:14 -05:00
Jason Ekstrand	96dfed49e4	i965: Stop muging cube array lengths by 6 From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remove the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-07-20 08:19:26 -07:00
Jason Ekstrand	e19b7f7f1b	i965/miptree: Set logical_depth0 == 6 for cube maps This matches what we do for cube maps where logical_depth0 is in number of face-layers rather than number of cubes. This does mean that we will temporarily be setting the surface bounds too loose for cube map textures but we are already setting them too loose for cube arrays and we will be fixing that in the next commit anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-07-20 08:19:22 -07:00
Jason Ekstrand	d4d505d0b0	i965/miptree: Enforce that height == 1 for 1-D array textures The GL API and mesa internals do this differently than we do. In GL, there is no depth parameter for 1-D arrays and height is used. In the i965 miptree code we do the sane thing and make height == 1 and use depth for number of slices. This makes for a mismatch every time we create a 1-D array texture from GL. Instead of actually solving this problem, we just said "1-D is hard, let's make sure it works no matter which way we pass the parameters" and called it a day. This commit fixes the one GL -> i965 transition point where we weren't already handling 1-D array textures to do the right thing and then replaces the magic fixup code with an assert that you're doing the right thing. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0 11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-07-20 08:18:19 -07:00
Stefan Dirsch	27ef7bfd6c	Avoid overflow in 'last' variable of FindGLXFunction(...) This 'last' variable used in FindGLXFunction(...) may become negative, but has been defined as unsigned int resulting in an overflow, finally resulting in a segfault when accessing _glXDispatchTableStrings[...]. Fixed this by defining it as signed int. 'first' variable also needs to be defined as signed int. Otherwise condition for while loop fails due to C implicitly converting signed to unsigned values before comparison. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 16:05:17 +01:00
Tomasz Figa	9e1248d075	egl/android: Stop leaking DRI images Current implementation of the DRI image loader does not free the images created in get_back_bo() and so leaks memory. Moreover, it creates a new image every time the DRI driver queries for buffers, even if the backing native buffer has not changed, leaking memory again. This patch adds missing call to destroyImage() in droid_enqueue_buffer() and a check if image is already created to get_back_bo() to fix the above. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:48:54 +01:00
Tomasz Figa	565fa6b748	egl/android: Add some useful error messages It is much easier to debug issues when the application gives some meaningful error messages. This patch adds few to the EGL Android platform backend. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:48:03 +01:00
Tomasz Figa	94282b6dd0	egl/android: Check return value of dri2_get_dri_config() It might return NULL if specific config variant is unsupported. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:47:23 +01:00
Emil Velikov	4f48674d51	i965: store reference to the context within struct brw_fence (v2) As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Tomasz Figa <tfiga@chromium.org>	2016-07-20 15:45:20 +01:00
Nicolas Boichat	9bebef4034	egl/dri2: dri2_make_current: Set EGL error if bindContext fails Without this, if a configuration is, say, available only on GLES2/3, but not on GLES1, and is rejected by the dri module's bindContext call, eglMakeCurrent fails with error "EGL_SUCCESS". In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP dEQP-EGL.functional.surfaceless_context expect. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Tomasz Figa	ccda100a5a	egl/android: Remove unused variables There are some unused variables left after previous clean-ups triggering compiler warnings. Let's remove them. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Tomasz Figa	70a28afb29	gallium/dri: Add shared glapi to LIBADD on Android An earlier patch fixed the problem for classic drivers, however Gallium was still left broken. This patch applies the same workaround to Gallium, when compiled for Android. Following is a quote from the original patch: `0cbc90c57c` mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platform, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 15:10:33 +01:00
Emil Velikov	ae9a2baaa6	mesa: scons: remove left over src/glsl include The path no longer exists. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:33:43 +01:00
Emil Velikov	1c7c0d77ac	mesa: scons: list builddir before srcdir Analogous to previous commit. Note: scons always uses OOT builds, while the in-tree generated files could be created either manually or by the autoconf build. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:32:24 +01:00
Emil Velikov	eafa82e20e	mesa: automake: list builddir before srcdir In the case of building in out-of-tree fashion, while having generated in-tree sources, the latter [likely stale] files will be used. Flip the order to prevent that. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-20 13:30:50 +01:00
Józef Kucia	14608ef920	radeonsi: advertise 8 bits subpixel precision for viewport bounds Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-20 12:45:31 +02:00
Józef Kucia	98aa807188	r600: advertise 8 bits subpixel precision for viewport bounds Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-20 12:45:31 +02:00
Józef Kucia	3cd28fe3de	gallium: add a cap for VIEWPORT_SUBPIXEL_BITS (v2) This allows Gallium drivers to advertise the subpixel precision for floating point viewports bounds. v2: - Set ViewportSubpixelBits in st_init_limits. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 12:45:31 +02:00
Samuel Pitoiset	3c78d89692	nvc0: disable MS images on GM107+ MS images have to be handled explicitly and I don't plan to implement them for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:33 +02:00
Samuel Pitoiset	8489f20689	nv50/ir: print OP_SUREDB subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:30 +02:00
Samuel Pitoiset	1edc44bfd3	gm107/ir: add emission for SUREDx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:26 +02:00
Samuel Pitoiset	4aaacd6dd0	gm107/ir: add emission for SUSTx and SULDx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:21 +02:00
Samuel Pitoiset	e14cb05ce1	gm107/ra: fix constraints for surface operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:16 +02:00
Samuel Pitoiset	c68989b2c8	gm107/ir: lower surface operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:12 +02:00
Samuel Pitoiset	2ae4b5d622	nvc0: bind images for 3d/cp shaders on GM107+ On Maxwell, image binding is slightly different (and much better) compared to Fermi and Kepler because a texture view needs to be uploaded for each image and this is going to simplify the thing a lot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:11:03 +02:00
Samuel Pitoiset	1da704a94c	nvc0: increase the tex handles area size in the driver cb Currently, we can store 32 tex handles, each a 32-bit integer, and that fits perfectly with the underlying hardware except on GM107+, which requires uploading a texture view for each image. This patch increases the number of storable texture handles in the driver constant buffer from 32 to 40 because we expose 8 images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-20 11:10:56 +02:00
Kenneth Graunke	f0f466214e	nir: Fix uninitialized use of 'replacement'. For intrinsics we don't care about, just skip to the next loop iteration and process the next instruction. We don't want to execute the rest of the code. This was a bug in commit `cdfc05ea6e`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-19 17:34:59 -07:00
Kenneth Graunke	89873c9b08	i965: Use tex_mocs instead of rb_mocs for GL images. Fixes a 10-20% performance regression in OglCSDof caused by commit `5a8c89038a`, which made images (in the image load/store sense) use BDW_MOCS_PTE instead of BDW_MOCS_WB. This seems sketchy, as the default PTE value is supposed to be WB LLC eLLC, which is the same as our MOCS WB setting. It's only supposed to change when using a surface for display, which won't ever happen for images. Something may be wrong in the kernel... Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-19 17:34:59 -07:00
Marek Olšák	0ab47146c9	winsys/amdgpu: use pb_cache buckets for fewer pb_cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	dea6fdadca	winsys/radeon: use pb_cache buckets for fewer pb_cache misses This makes Bioshock Infinite with deferred flushing 2.2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	8d5944199d	gallium/pb_cache: reduce the number of pointer dereferences Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	3cdc0e133f	gallium/pb_cache: divide the cache into buckets for reducing cache misses Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	fec7f74129	gallium/pb_cache: check parameters that are more likely to fail first This makes Bioshock Infinite with deferred flushing 2% faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	2596ae2b6e	radeonsi: emit PS exports last This effectively removes s_waitcnt instructions after FP16 exports. Before: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v4, v5 ; 5E000B04 v_cvt_pkrtz_f16_f32_e32 v1, v6, v7 ; 5E020F06 exp 15, 1, 1, 0, 0, v0, v1, v0, v0 ; F800041F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v8, v9 ; 5E001308 v_cvt_pkrtz_f16_f32_e32 v1, v10, v11 ; 5E02170A exp 15, 2, 1, 0, 0, v0, v1, v0, v0 ; F800042F 00000100 s_waitcnt expcnt(0) ; BF8C0F0F v_cvt_pkrtz_f16_f32_e32 v0, v12, v13 ; 5E001B0C v_cvt_pkrtz_f16_f32_e32 v1, v14, v15 ; 5E021F0E exp 15, 3, 1, 1, 1, v0, v1, v0, v0 ; F8001C3F 00000100 s_endpgm ; BF810000 After: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300 v_cvt_pkrtz_f16_f32_e32 v1, v2, v3 ; 5E020702 v_cvt_pkrtz_f16_f32_e32 v2, v4, v5 ; 5E040B04 v_cvt_pkrtz_f16_f32_e32 v3, v6, v7 ; 5E060F06 exp 15, 0, 1, 0, 0, v0, v1, v0, v0 ; F800040F 00000100 v_cvt_pkrtz_f16_f32_e32 v4, v8, v9 ; 5E081308 v_cvt_pkrtz_f16_f32_e32 v5, v10, v11 ; 5E0A170A exp 15, 1, 1, 0, 0, v2, v3, v0, v0 ; F800041F 00000302 v_cvt_pkrtz_f16_f32_e32 v6, v12, v13 ; 5E0C1B0C v_cvt_pkrtz_f16_f32_e32 v7, v14, v15 ; 5E0E1F0E exp 15, 2, 1, 0, 0, v4, v5, v0, v0 ; F800042F 00000504 exp 15, 3, 1, 1, 1, v6, v7, v0, v0 ; F8001C3F 00000706 s_endpgm ; BF810000 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	b2b45cecef	radeonsi: set optimal settings in COMPUTE_RESOURCE_LIMITS ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	ad70c3954b	radeonsi: really wait for the second EOP event and not the first one Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Marek Olšák	1a1cc67edd	gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag always set Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-19 23:45:06 +02:00
Ian Romanick	0b626d7524	nir/algebraic: Optimize fabs(u2f(x)) I noticed this when I tried to do frexp(float(some_unsigned)) in the ir_unop_find_lsb lowering pass. The code generated for frexp() uses fabs, and this resulted in an extra instruction. Ultimately I ended up not using frexp. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	94296be276	st/mesa: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:30 -07:00
Ian Romanick	7cb49b1bd7	i965: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	5726e57f13	i965: Don't lower uaddCarry and usubBorrow in both GLSL IR and NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	d7a47a76e0	i965: Update assertion to account for Gen < 7 Previously SHADER_OPCODE_MULH could only exist on Gen7+, so the assertion assumed the Gen7+ accumulator rules. A future patch will allow this instruction on at least Gen6, so update the assertion. v2: Use get_lowered_simd_width instead of open coding it. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2016-07-19 12:19:29 -07:00
Ian Romanick	3e7cebc8da	i965: Use LZD to implement nir_op_find_lsb on Gen < 7 v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	c2019c6c26	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7 v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	de20086eed	i965: Use LZD to implement nir_op_ufind_msb This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	26c7f04d4a	i965: Always enable GL_ARB_shading_language_packing With the existing lowering passes, the functions from this extension become a bunch of bit twiddling operations that have always been supported. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	4b2b6d4d4d	i965: Move enable of EXT_shader_integer_mix This extension does not depend on the Gen. It only depends on the availability of GLSL 1.30. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	a2379e44aa	glsl: Add lowering pass for ir_bin_imul_high This isn't the lowering pass you want. Most GPUs that can support GLSL 1.30 have a multiply unit that can do something more interesting than 32x32->32. Many have 32x16->48. Any GPU that does, should do the lowering in the backend. This is just the thing that will always work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	1b5477668a	glsl: Add lowering pass for ir_unop_find_msb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	2a381a3c73	glsl: Add lowering pass for ir_unop_find_lsb Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:29 -07:00
Ian Romanick	ad9acb19c3	glsl: Add lowering pass for ir_unop_bitfield_reverse Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	3079dcb00c	glsl: Add lowering pass for ir_quadop_bitfield_insert Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	4d6d219b58	glsl: Add lowering pass for ir_triop_bitfield_extract Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	7340be8a01	glsl: Add lowering pass for ir_unop_bit_count Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	806add360f	MESA_shader_integer_functions: Allow new function overload matching rules Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	90537e1a0e	MESA_shader_integer_functions: Allow implicit int->uint conversions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	65b0346fdb	MESA_shader_integer_functions: Expose new built-in functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	15c4ae461d	MESA_shader_integer_functions: Boiler plate extension tracking Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:28 -07:00
Ian Romanick	91482ef226	MESA_shader_integer_functions: Add extension specification v2: Fix typo in #extension line noticed by Ken. v3: Update spec status. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-19 12:19:15 -07:00
Samuel Pitoiset	9c63224540	gm107/ir: make use of ADD32I for all immediates ADD only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-07-19 18:07:15 +02:00
Samuel Pitoiset	0904a2ba97	gm107/ir: add missing NEG modifier for IADD32I Like FADD32I, the NEG modifier of src0 is at position 56. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 18:07:10 +02:00
Andreas Boll	c482decd4d	ddebug: Fix trivial typo in stderr message Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-07-19 16:04:40 +02:00
Andreas Boll	d66cb7c84f	configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too The help string wasn't updated in `cbc37f7`. Fixes: `cbc37f7` ("anv: install the intel_icd.json to ${datarootdir} by default") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-19 16:04:01 +02:00
Eric Engestrom	8ba46fbd9e	vl: fix memory leak CovID: 1363008 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:41:00 +02:00
Boyuan Zhang	60c7450f16	vl: add entry point Add entrypoint to distinguish H.264 decode and encode. For example, in patch 5/11 when calling "VaCreateContext", "pps" and "sps" shouldn't be allocated for H.264 encoding. So we need to use the entry_point to determine whether this is H.264 decode or H.264 encode. We can use config to determine the entrypoint since config_id is passed to us for the VaCreateContext call. However, for the VaDestroyContext call, only context_id is passed to us. So we need to know the entrypoint in order to not free the pps/sps for the encoding case. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-19 12:36:46 +02:00
Ilia Mirkin	ed9dd3bcd9	nv50,nvc0: srgb rendering is only available for rgba/bgra Mark both L8_SRGB and L8A8_SRGB as non-renderable (the latter already didn't have the bind flags). This makes the state tracker pick a different format when rendering is required, or mark the fb as incomplete. This fixes: bin/getteximage-formats init-by-clear-and-render -auto -fbo bin/getteximage-formats init-by-rendering -auto -fbo which previously ran into srgb-encoding differences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-07-18 20:04:17 -04:00
Ilia Mirkin	8e7893eb53	nvc0: add support for BGRA8 images This is useful for pbo downloads, which are now accelerated with images. BGRA8 is a moderately common format to do that in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-18 20:04:17 -04:00
Jason Ekstrand	905d7dc4d1	i965: Skip update_texture_surface when the plane doesn't exist Thanks to rebase fail, recent surface state changes (commits `7e951cd56`, `8521ce1a7`, and `69c0dc5c53`) effectively reverted `727a9b2493` and `367cf3a2e3` which was unintentional. This should bring it back. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-07-18 16:44:29 -07:00
Timothy Arceri	cd5cbf0f6b	glsl: use linked shaders rather than compiled shaders At this point there is no reason not to be using the linked shaders, using the linked shaders should be faster and will make things simpler for upcoming shader cache work. The previous variable name suggests the linked shaders were intended to be used here anyway. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-19 09:42:00 +10:00
Lars Hamre	198074a41c	The extension is already exposed, this simply marks it as done. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-07-19 01:20:27 +02:00
Anuj Phogat	22935a3040	docs: Fix typo in extension name Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:53:24 -07:00
Anuj Phogat	7832e18879	docs: Add support for GL_KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-18 15:44:18 -07:00
Anuj Phogat	c7b787ef90	Revert "docs: Mark KHR_texture_compression_astc_sliced_3d done on i965" This reverts commit `82f8c23950`. KHR_texture_compression_astc_sliced_3d is not a requirement for GLES 3.2. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-18 15:43:58 -07:00
Anuj Phogat	82f8c23950	docs: Mark KHR_texture_compression_astc_sliced_3d done on i965 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	ac0eb36d8e	i965/gen9: Enable KHR_texture_compression_astc_sliced_3d Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Anuj Phogat	15dea5ca82	mesa: Add the infrastructure for KHR_texture_compression_astc_sliced_3d V2: Drop the changes to gl.xml. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-07-18 14:39:54 -07:00
Christian König	3e1ad846f9	radeon/uvd: add session context buffer for polaris 10/11 v2 This way we have unlimited UVD sessions. v2: only enable it when kernel supports it as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-18 17:13:17 +02:00
Leo Liu	134d6e4e4f	vl/dri3: fix a memory leak from front buffer Inspired by the fix for the memory leak in VDPAU interop: resource_from_handle sets the texture reference count, which needs to be decreased and released. There is a similar case for DRI3 with the VA-API GLX extension, where a temporary TFP (texture from pixmap) is targeted through dma-buf; the leak happens when the reference is not counted down. With the mpv vo=opengl case there is only one static TFP, so the leak happens once, but the totem player using gstreamer VA-API GLX creates a dynamic TFP for each frame, so it leaks quite a bit. This fixes the memory leak for mpv and totem. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-18 09:20:40 -04:00
Iago Toral Quiroga	0f2516d88f	i965/tes/scalar: fix 64-bit indirect input loads We totally ignored this before because there were no piglit tests for indirect loads in tessellation stages with doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-18 09:53:51 +02:00
Iago Toral Quiroga	1737e75bfb	i965/tcs/scalar: only update imm_offset for second message in 64bit input loads Our indirect URB read messages take both a direct and an indirect offset so when we emit the second message for a 64-bit input load we can just always increment the immediate offset, even for the indirect case. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-18 09:53:16 +02:00
Kenneth Graunke	18f67c8a69	i965: Move pulls_bary setting to emit_pixel_interpolator_send(). pulls_bary should be set when the shader uses a pixel interpolator message. So, setting it from the function that emits pixel interpolator messages makes a lot of sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:54 -07:00
Kenneth Graunke	7ef7738a61	i965: Write gl_FragCoord directly to the destination. This patch makes emit_general_interpolation take a destination register as an argument, and write directly to that. This is simpler than the old approach of ralloc'ing a register, writing to that temporary, and then making the caller emit per-component MOVs to copy it to the actual destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	a03812c321	i965: Drop has_pln checks in unlit centroid workaround. The unlit centroid workaround starts being necessary on Gen6, which is the first platform with multisampling. PLN exists on G45+, so all platforms which need this workaround have PLN. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	b94890c19f	i965: Drop VARYING_SLOT_FACE special case in barycentric setup. glsl_to_nir always produces a system value for gl_FrontFacing, rather than an input. So there should never be an input with this slot, making this code dead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-17 19:26:53 -07:00
Kenneth Graunke	ac1181ffbe	compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*. Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers". Generated by: $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \ -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \ -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Dave Airlie <airlied@redhat.com>	2016-07-17 19:26:48 -07:00
Dave Airlie	e7d96e7685	virgl: drop pointless leftover init of virgl_transfer_inline_write. Pointed out by Marek. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-17 06:20:53 +10:00
Ilia Mirkin	062c6b8e54	nv50: fix alphatest for non-blendable formats The hardware can only do alphatest when using a blendable format. This means that the various *16 norm formats didn't work with alphatest. It appears that Talos Principle uses such formats, as well as alpha tests, for some internal renders, which made them be incorrect. However this does not appear to affect the final renders, but in a different game it easily could. The approach we take is that when alphatests are enabled and a suitable format is used (which we anticipate is the vast minority of the time), we insert code into the shader to perform the comparison and discard. Once inserted, that code lives in the shader forever, and we re-upload it each time the function changes with a fixed-up compare. To avoid re-uploading too often, if we switch back to a blendable format, the test is (effectively) disabled and the hw alphatest functionality is used. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-16 11:45:30 -04:00
Rob Clark	cc46fc3c09	mesa/st: reduce size of state->st bitmask In `d035d50` this changed to 64b, which I'm pretty sure was unintentional. Revert it back to 32b so the entire state struct is a nice round 64b. (Not sure that it would actually be measurable, but I did notice that check_state() was hot in some benchmarks.) Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-16 10:00:04 -04:00
Rob Clark	44bbfedbd9	gallium/u_queue: add optional cleanup callback Adds a second optional cleanup callback, called after the fence is signaled. This is needed if, for example, the queue has the last reference to the object that embeds the util_queue_fence. In this case we cannot drop the ref in the main callback, since that would result in the fence being destroyed before it is signaled. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-16 10:00:04 -04:00
Nicolai Hähnle	6f73c7595f	radeonsi: remove the DRAW_PREAMBLE packet According to firmware guys, the new sequence that we added for Polaris should work on all CIK parts, and should actually be faster on some parts. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-16 13:02:37 +02:00
Brian Paul	b89d0df535	mesa: handle numSamples=0 in _mesa_test_proxy_teximage() Should fix the regressions reported in bug 96949. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96949 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-15 21:32:24 -07:00
Kenneth Graunke	aa6f60f844	nir: Use dest.ssa.num_components rather than intrin->num_components. I recently refactored this to share code between load and atomic lowering. loads used intrin->num_components, while atomics used intrin->dest.ssa.num_components. They should be equivalent, but Jason wanted me to use the latter. I missed applying his review. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-15 19:42:43 -07:00
Kenneth Graunke	da3d4a4c56	nir: Update outdated intrinsic const_index comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	52e75dcb8c	nir: Use nir_intrinsic_set_base in atomic lowering. This is more readable and also offers assertions that protect against setting const_index fields on the wrong kind of intrinsic. Suggested by Jason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	50b9bb9421	nir: Split nir_lower_io's input/output/atomic handling into helpers. The original function was becoming a bit hard to read, with the details of creating and filling out load/store/atomic intrinsics all in one function. This patch makes helpers for creating each type of intrinsic, and also combines them with the *_op() helpers, as they're closely coupled and not too large. v2: Minor style nits from Jason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:10 -07:00
Kenneth Graunke	e12e4af780	nir: Drop bogus nir_var_shader_in case in nir_lower_io's store_op(). This can't happen, the caller asserts that mode is shader_out or shared. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	cdfc05ea6e	nir: Share destination rewriting and replacement code in IO lowering. Both loads and atomics had identical code to rewrite destinations, and all cases had the same two lines to replace instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	349fe79c9b	nir: Share get_io_offset handling in nir_lower_io. The load/store/atomic cases all duplicated the get_io_offset code, with a few tiny differences: stores didn't bother checking for per-vertex inputs, because they can't be stored to, and atomics didn't check at all, since shared variables aren't per-vertex. However, it's harmless to check, and allows us to share more code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	7171a9a87d	nir: Make a 'var' temporary in nir_lower_io. Less typing and word wrapping issues than intrin->variables[0]->var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:17:09 -07:00
Kenneth Graunke	f05770121f	i965: Remove the emit_linterp() helper. Rather than computing the barycentric mode each time we emit a LINTERP, we can simply compute it once, as soon as we know we're doing non-flat interpolation. At that point, emit_linterp() doesn't do much, so fold it into the call sites and drop it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	203243f5ff	i965: Reduce the number of fs_reg(brw_reg) calls in LINTERP handling. A bit tidier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	eefbbb943e	i965: Make a barycentric_mode() helper function. This combines two copies of basically the same code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	783511e605	i965: Rename brw_wm_barycentric_interp_mode to brw_barycentric_mode. brw_wm_barycentric_interp_mode is wordy, brw_barycentric_mode is less typing and suffers from fewer line wrapping problems. The enum values themselves don't really benefit from "WM" in the name, either. Put "BARYCENTRIC" first instead of at the end and drop "WM". Generated by: for file in *.c *.cpp *.h; do sed -i \ -e 's/brw_wm_barycentric_interp_mode/brw_barycentric_mode/g' \ -e 's/BRW_WM_\([A-Z_]*\)_BARYCENTRIC/BRW_BARYCENTRIC_\1/g' \ -e 's/BRW_WM_BARYCENTRIC_INTERP_MODE_COUNT/BRW_BARYCENTRIC_MODE_COUNT/g' \ $file; done with a few whitespace changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Kenneth Graunke	2d6dd30a9b	i965: Handle default interpolation modes and locations in NIR. This consolidates a bunch of hacks in a single place - by setting the interpolation modes and locations on variables appropriately, we can simply trust them in the rest of the code. This avoids having to handle INTERP_QUALIFIER_NONE, gl_Color overrides, sample-shading overrides, and Gen4-5 centroid-overrides in a bunch of places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 17:16:54 -07:00
Jason Ekstrand	745f5778f3	i965/context: Remove some unnecessary vfuncs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	305044c5b1	i965: Get rid of gen6_surface_state.c The only useful thing left was gen6_init_vtable_surface_functions which we can easily put in brw_wm_surface_state.c. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	16fb285946	i965: Use ISL for emitting buffer surface states Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	ee229d1b9c	i965/state: Account for the element size in emit_buffer_surface_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	69c0dc5c53	i965/gen4-6: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	2d56959bf8	i965/gen6: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	efa7668545	i965/gen7: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	8521ce1a7e	i965/gen7: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	26282a01f5	i965/gen8: Use the generic ISL-based path for renderbuffer surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:43 -07:00
Jason Ekstrand	7e951cd562	i965/gen8: Use the generic ISL-based path for texture surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 16:01:41 -07:00
Jason Ekstrand	09b5a71517	i965/state: Add generic surface update functions based on ISL Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	1abb37baa0	i965/surface_state: Rename brw_update to gen4_update We're about to add generic versions which work across gens and those should have the brw name. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	5a8c89038a	i965/state: Use ISL for emitting image surfaces Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	7a21d1bfc3	i965/blorp: Use a generic ISL path for texture surfaces on gen8 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-15 15:59:33 -07:00
Jason Ekstrand	5cf665afa1	i965/state: Add a helper for emitting a surface state using isl Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:59:24 -07:00
Jason Ekstrand	73ae4ec294	i965/blorp: Use the generic ISL path for texture surfaces on gen6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	cc78061003	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	366a6a659d	i965/blorp: Use the generic ISL path for texture surfaces on gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:49 -07:00
Jason Ekstrand	3339ef42cf	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	16022352ea	i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen8-9 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	6553dc0d70	i965/blorp: Add a generic ISL-based surface state emit path Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	e974456d4f	i965/miptree: Add a helper for getting the aux isl_surf from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	1e45349e82	i965/miptree: Add a helper for getting the ISL clear color from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	f665a3da72	i965/miptree: Add a helper for getting an isl_surf from a miptree Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	e2dd3ce976	i965: Add an isl_device to the brw_context Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	4f282ff67e	isl/state: Add support for OffsetX/Y in surface state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	f8984b918a	isl: Add support for filling out surface states all the way back to gen4 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	815847e2b3	isl: Add an ISL_DEV_IS_G4X macro Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	27883f8cbc	genxml: Add macros and #includes for gens 4-6 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	ba798ac6b1	genxml: Make X/Y Offset field of SURFACE_STATE a uint The offset type has special implications that it's intended to be some form of aligned memory address. These assumptions allow it to handle the case where there is some alignment requirement on the offset and the bottom bits are used for other things. However, the offsets in the surface state field are really just unsigned integers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:48 -07:00
Jason Ekstrand	9a999ceab8	genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	0f6eb5dea0	isl/state: Divide the aux qpitch by 4 The field is in multiples of 4 like regular QPitch. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	2c6ca658e7	isl: Fix the bs assertion in isl_tiling_get_info Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 15:53:47 -07:00
Jason Ekstrand	593731ea3c	anv: Handle VK_WHOLE_SIZE properly for buffer views The old calculation, which used view->offset, incorporated buffer->offset into the size calculation where it doesn't belong. This meant that, if buffer->offset > buffer->size, you would always get a negative size. This fixes 170 dEQP-VK.renderpass.attachment.* Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	827405f072	anv: Add an align_down_npot_u32 helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	f124f4a394	anv: Enable independentBlend on gen7 We can totally do it, we were just only setting up one BLEND_STATE and, now that the code is unified with gen8, we should be handling it correctly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	a2e7b2e653	anv/pipeline: Unify blend state setup between gen7 and gen8 This fixes all 674 broken dEQP-VK.pipeline.blend Vulkan CTS tests on Haswell. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Jason Ekstrand	aaa202ebe7	genxml: Make gen6-7 blending look more like gen8 This renames BLEND_STATE to BLEND_STATE_ENTRY and adds a new struct BLEND_STATE which is just an array of 8 BLEND_STATE_ENTRYs. This will make it much easier to write gen-agnostic blend handling code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 15:48:21 -07:00
Eric Anholt	3bcd0f1912	vc4: Speed up glGenerateMipmaps by avoiding shadow baselevel. To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary miptree. However, if a single level is being selected, we can use the existing miptree and force all the sampling to be from that particular level. This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses base levels in the blit implementation in gallium. Improves "glmark2 -b terrain" from 2 fps to 3 (perhaps some more precision would be useful?), and cuts its CPU usage during the benchmarking from ~30% to ~10% (total CPU time from 8.8s to 7.6s).	2016-07-15 13:54:00 -07:00
Eric Anholt	88152d7dc0	vc4: Drop VC4_DIRTY_TEXSTATE in favor of the per-stage flags. The compiler uses the per-stage flags already, so it didn't need this. vc4_uniforms was using it, so just replace it with both of the stage flags for now.	2016-07-15 13:54:00 -07:00
Eric Anholt	5db82e0c89	vc4: Remove dead dirty_samplers field. We use a big VC4_DIRTY_FRAGTEX/VC4_DIRTY_VERTEX on the stage, instead.	2016-07-15 13:54:00 -07:00
Eric Anholt	219b75deb9	vc4: Turn on control flow support in the simulator environment. We can't merge the non-simulator support until we merge the kernel side and get a new libdrm release.	2016-07-15 13:54:00 -07:00
Brian Paul	9a23a177b9	mesa: handle numLevels, numSamples in _mesa_test_proxy_teximage() If numLevels > 0, we can compute the size of the whole mipmapped texture. That's the case for glTexStorage(GL_PROXY_TEXTURE_x). Also, multiply the texture size by numSamples for MSAA textures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	39183ea971	mesa: add proxy texture targets in _mesa_next_mipmap_level_size() So we can use it for computing size of proxy textures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	0ac9f25032	mesa: add numLevels, numSamples to Driver.TestProxyTexImage() So that the function can work properly with glTexStorage(), where we know how many mipmap levels there are. And so we can compute storage for MSAA textures. Also, remove the obsolete texture border parameter. A subsequent patch will update _mesa_test_proxy_teximage() to use these new parameters. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Brian Paul	e477d92c94	mesa: use _mesa_clear_texture_image() in clear_texture_fields() This avoids a failed assert(img->_BaseFormat != -1) in init_teximage_fields_ms() because the internalFormat argument is GL_NONE. This was hit when using glTexStorage() to do a proxy texture test. Fixes a failure with the updated Piglit tex3d-maxsize test. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-07-15 14:24:34 -06:00
Charmaine Lee	6b7923ee46	svga: avoid unbinding render targets that have already been unbound Fixed the remaining redundant SetRenderTargets command emission. Tested with lightsMark2008, Heaven, mtt piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-15 14:24:34 -06:00
Neha Bhende	4f633d110a	svga: dump code for GenMips. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-15 14:24:33 -06:00
Jon Turney	c7151401e0	Disable use of weak in threads_posix.h on Cygwin Weak doesn't work the same on PE/COFF as on ELF; they are only weak references. Specifically, since nothing else pulls in the object which contains pthread_mutexattr_init() (and coming from the C library, that is the only thing that object contains), it ends up as 0. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Jon Turney	7d8edbaee7	configure: Don't require pthread-stubs on Cygwin Commit `1f4869a2` unconditionally requires pthread-stubs. Unfortunately, the cleverness that pthread-stubs is doesn't work with PE/COFF, and historically Cygwin doesn't have a pthread-stubs.pc. Don't require pthread-stubs on Cygwin. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Yaakov Selkowitz	5d303867f5	Use correct names for dlopen()ed files on Cygwin Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Yaakov Selkowitz	3c18c16ecf	configure: Define _GNU_SOURCE for Cygwin as well Cygwin headers are now a bit more correct in handling feature test macros, so use _GNU_SOURCE when building for Cygwin, as well. (Notwithstanding `f381c27c`, we should probably have always been using _GNU_SOURCE, since asprintf() is used by mesa in places) Signed-off-by: Yaakov Selkowitz <yselkowi@redhat.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>	2016-07-15 19:46:54 +01:00
Nanley Chery	1fc739d28e	Revert "isl: Don't filter tiling flags if a specific tiling bit is set" This reverts commit `091f1da902`. Although a user may specify a specific tiling bit, ISL should still prevent incompatible tiling/surface combinations. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-15 10:35:40 -07:00
Nanley Chery	e179fee049	anv/blit2d: Copy with stencil sources when needed In the next patch, ISL will unconditionally perform verification of a surface's tiling and usage. Since it will require that w-tiled images be stencil buffers, create a stencil surface to copy from a w-tiled/stencil surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	1ef80b26d7	anv/image: Fix initialization of the ISL tiling If an internal user creates an image with Vulkan tiling VK_IMAGE_TILING_OPTIMAL and an ISL tiling that isn't set, ISL will fail to create the image as anv_image_create_info::isl_tiling_flags will be an invalid value. Correct this by making anv_image_create_info::isl_tiling_flags an opt-in, filtering bitmask, that allows the caller to specify which ISL tilings are acceptable, but not contradictory to the Vulkan tiling. Opt-out of filtering for vkCreateImage. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	00caba4152	isl: Fix isl_tiling_is_any_y() Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	a5748cb920	anv/device: Fix max buffer range limits Set limits that are consistent with ISL's assertions in isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType mapping in anv_isl_format_for_descriptor_type(). Fixes the following new crucible tests: * stress.limits.buffer-update.range.uniform * stress.limits.buffer-update.range.storage These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/ Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	028f6d8317	isl: Fix assert on raw buffer surface state size See inline PRM reference. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	96c664cd03	anv/cmd_buffer: Simplify range member assignment A ternary is clearer because the range member is assigned one of two values dependent on one condition. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	1a7344531f	anv/cmd_buffer: Remove unused variable This became unused due to commit `612e35b2c6`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Nanley Chery	fd16e64321	anv/descriptor_set: Fix binding partly undefined descriptor sets Section 13.2.3. of the Vulkan spec requires that implementations be able to bind sparsely-defined Descriptor Sets without any errors or exceptions. When binding a descriptor set that contains a dynamic buffer binding/descriptor, the driver attempts to dereference the descriptor's buffer_view field if it is non-NULL. It currently segfaults on undefined descriptors as this field is never zero-initialized. Zero undefined descriptors to avoid segfaulting. This solution was suggested by Jason Ekstrand. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-15 10:35:40 -07:00
Brian Paul	50a669de4e	svga: handle mismatched number of samplers, sampler views in svga_init_shader_key_common(). Since the CSO module only tracks sampler views for fragment shaders, the number of samplers and sampler views can be mismatched for other types of shaders. This situation triggered an assertion in Chrome with maps.google.com This patch adds defensive code to handle that situation. Fixes VMware bug 1694027 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-15 11:05:18 -06:00
Leo Liu	b9d10e79c8	st/omx/enc: check uninitialized list from task release The uninitialized list should be checked and returned. Thank Julien for the notification and suggested fix. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-15 09:17:36 -04:00
Samuel Pitoiset	ea6b236ab1	nv50/ir: add missing string for SV_WORK_DIM Fixes: `2aa1197` ("nouveau: Add support for SV_WORK_DIM") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Hans de Goede <hdegoede@redhat.com>	2016-07-14 22:28:39 +02:00
Marek Olšák	f84e9d749f	Revert "radeon/llvm: Use alloca instructions for larger arrays" This reverts commit `513fccdfb6`. Bioshock Infinite hangs with that.	2016-07-14 22:15:08 +02:00
Jan Vesely	489bb5473b	r600,compute: Reserve vtx 3 for kernel arguments Using vtx 0 does not work for dynamic offsets. v2: add explanatory comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-07-14 16:04:50 -04:00
Marek Olšák	33eddde4a7	radeon/uvd: fail to create a decoder if RUVD_MSG_CREATE submission fails This is the bare minimum for reporting the error to the user. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Marek Olšák	85388652f9	winsys/amdgpu: return an error on IB submission failures Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Marek Olšák	a7d84f7731	gallium/radeon: add a return value to cs_flush Required by our UVD code. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 22:00:54 +02:00
Jason Ekstrand	b919100d61	glsl/types: Use _mesa_hash_data for hashing function types This is way better than the stupid string approach especially since you could overflow the string. Again, I thought I had something better at one point but it obviously got lost. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-14 10:48:25 -07:00
Jason Ekstrand	11ac1c4dbb	glsl/types: Fix function type comparison function It was returning true if the function types have different lengths rather than false. This was new with the SPIR-V to NIR pass and I thought I'd fixed it a while ago but it may have gotten lost in rebasing somewhere. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-14 10:48:11 -07:00
francians@gmail.com	3db7f3458f	freedreno/a4xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
francians@gmail.com	948822018f	freedreno/a3xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
francians@gmail.com	cf2f345356	freedreno/a2xx: Fix sign compare warnings Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-14 09:55:02 -04:00
Boyuan Zhang	23c5e8bc58	radeon/vce: handle newly added parameters Replace the previous hardcoded value with newly defined parameters Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:21 +02:00
Boyuan Zhang	5490068fb1	st/omx: assign previous values to new structure Assign previously hardcoded values for OMX to newly defined structure. As a result, OMX behaviour will not change at all. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:14 +02:00
Boyuan Zhang	b86bf4b568	vl: add parameters for VAAPI encode Allow specifying more parameters in the encoding interface which were previously just hardcoded in the encoder Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:07 +02:00
Christian König	9ce52baf7f	st/mesa: fix reference counting bug in st_vdpau Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <christian.koenig@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com> Ack-by: Tom St Denis <tom.stdenis@amd.com>	2016-07-14 09:33:44 +02:00
Eric Anholt	9194473dd2	vc4: Emit resets of the uniform stream at the starts of blocks. If a block might be entered from multiple locations, then the uniform stream will (probably) be at different points, and we need to make sure that it's pointing where we expect it to be. The kernel also enforces that any block reading a uniform resets uniforms, to prevent reading outside of the uniform stream by using looping.	2016-07-13 23:54:15 -07:00
Eric Anholt	44df061aaa	vc4: Add support for scheduling of branch instructions. For now we don't fill the delay slots, and instead just drop in NOPs.	2016-07-13 23:54:15 -07:00
Eric Anholt	a59da513d3	vc4: Move the QPU instructions to schedule into each block. We'll want to schedule them individually, to handle delay slots.	2016-07-13 23:54:15 -07:00
Eric Anholt	37ecc61662	vc4: Disable vc4_opt_vpm in the presence of control flow. It's a really valuable pass currently, but it will be a mess to rewrite for control flow. For now, just disable it if we have multiple blocks present.	2016-07-13 23:54:15 -07:00
Eric Anholt	ee69cfd11d	vc4: Convert vc4_opt_dead_code to work in the presence of control flow. With control flow, we can't be sure that we'll see the uses of a variable before its def as we walk backwards. Given that NIR is eliminating our long chains of dead code, a simple solution for now seems fine. This slightly changes the order of some optimizations, and so an opt_vpm happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes in Minecraft: instructions in affected programs: 52 -> 54 (3.85%)	2016-07-13 23:54:15 -07:00
Eric Anholt	4e797bd98f	vc4: Update copy propagation for control flow. Previously, we could assume that a MOV from a temp was always an available copy, because all temps were SSA in NIR, and their non-SSA state in QIR was just due to the fact that they were from a bcsel or pack_unorm_4x8, so we could use the current value of the temp after that series of QIR instructions to define it. However, this is no longer the case with control flow. Instead, we track a new array of MOVs defined within the block that haven't had their source or dest killed yet, and use that primarily. We fall back to looking through the QIR defs array to handle across-block MOVs, but now require that copies from the SSA defs have an SSA src as well.	2016-07-13 23:54:15 -07:00
Samuel Iglesias Gonsálvez	94135e8736	i965/fs: emit DIM instruction to load 64-bit immediates in HSW v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:11:50 +02:00
Samuel Iglesias Gonsálvez	0534863c47	i965/eu: set DF imm value to the source of DIM According to HSW's PRM, vol02b, the DIM instruction has the following restriction: "Restriction : src0 must be immediate. src0 must specify the :f (F, Float) type encoding but is an immediate 64-bit DF (Double Float) value. dst must have type DF." This commit allows uploading the immediate 64-bit DF value to the source of a DIM instruction even when it has the float type encoding. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Samuel Iglesias Gonsálvez	6e28976d35	i965: enable the emission of the DIM instruction v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Jason Ekstrand	b9e99282a6	anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 20:31:27 -07:00
Timothy Arceri	a738732abf	i965: fix compiler warnings for 32bit build Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 12:03:59 +10:00
Tim Rowley	29f53d7937	Revert "gallium: Force blend color to 16-byte alignment" This reverts commit `d8d6091a84`. Heap allocations may be only 8-byte aligned on 32-bit systems, so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior, which indeed causes crashes when compiled with gcc -O3. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> Acked-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-07-13 13:55:33 -05:00
Jason Ekstrand	48ed8b6f26	isl/state: Add support for handling auxiliary surfaces Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	76e2dcc131	isl: Add an auxiliary surface usage enum Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	3ab3d97ac9	isl: Add support for color control surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	219024b9a7	isl: Add support for multisample compression surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	33dc8549fb	isl: Add support for HiZ surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	fc3650a0a9	isl: Kill off isl_format_layout::bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	1f0433f075	isl: Take bpb rather than bs in tiling_get_info Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	01855d7331	isl: Use bpb in a few places where it's more natural than bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	8c76b9bdce	isl: Use bpb for determining YUV image padding When we initially dropped bpb in favor of bs, we accidentally didn't change this one line properly. This brings it back to what it should be. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	cf9ff082b4	isl: Bring back isl_format_layout::bpb A while ago we got rid of the bits-per-block because we thought we didn't need it. We're about to introduce some very useful 1 and 2-bit formats so we really should be able to handle them again. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0bd3a7e931	isl: Change the physical size of a W-tile to 128x32 Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	4b62c19c32	isl: Rework the way we define tile sizes. This is based on a very long set of discussions between Chad and myself about how we should properly represent HiZ and CCS buffers. The end result of that discussion was that a tiling actually has two different sizes, a logical size in elements, and a physical size in bytes and rows. This commit reworks ISL's pitch and size calculations to work in terms of these two sizes. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	7270bd0607	isl: Rework the way we handle surface padding Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	a52f26d6e8	isl: Use ARRAY_PITCH_SPAN_FULL for depth/stencil surfaces on gen7 We helpfully inserted a PRM quotation about how we need to use ARRAY_PITCH_SPAN_FULL and then set it to COMPACT. Oops... Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0d48ac627a	isl: Stop multiplying height by block size The row pitch already specifies the size of a row of elements. Multiplying by the block height simply causes us to allocate as much as 12 times more memory than needed for compressed textures. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	58c1b1088b	isl: Get rid of tiling_get_extent It was unused Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	49476576dd	nir/spirv: Don't multiply the push constant block size by 4 I have no idea why we were multiplying by 4 before. The offsets we get from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to do any adjustment whatsoever. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 11:35:29 -07:00
Jason Ekstrand	1eed753ee8	anv/pipeline: Assert that the number of uniforms from NIR fits	2016-07-13 11:35:24 -07:00
Marek Olšák	0f7a6ea5e7	radeonsi: report accurate SGPR and VGPR spills Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d227dbe272	radeonsi: add a workaround for a compute VGPR-usage LLVM bug v2: use abort(), describe which LLVM version is affected Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f4d1de7f86	radeonsi: use LLVMGetTypeKind to tell if an input is an array of descriptors just a cleanup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	785073ed0b	radeonsi: replace !tbaa with !invariant.load no change in generated code thanks to dereferenceable(n) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	348b9a5b1c	radeonsi: set dereferenceable attribute on descriptor arrays This allows moving the loads arbitrarily in the Sinking pass. 26002 shaders in 14643 tests Totals: SGPRS: 2080160 -> 2080160 (0.00 %) VGPRS: 798875 -> 797826 (-0.13 %) Spilled SGPRs: 108485 -> 79165 (-27.03 %) Spilled VGPRs: 327 -> 327 (0.00 %) Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread Code Size: 36127192 -> 35559780 (-1.57 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 212464 -> 212672 (0.10 %) Wait states: 0 -> 0 (0.00 %) PERCENTAGES / App Shaders SGPRs VGPRs SpillSGPR SpillVGPR Scratch CodeSize MaxWaves Waits (unknown) 4 . . . . . . . . 0ad 6 . . . . . . . . alien_isolation 2938 . 0.04 % -8.53 % . . -0.71 % -0.06 % . anholt 10 . . . . . . . . batman_arkham_origins 589 . -0.58 % -79.54 % . . -6.72 % 0.57 % . bioshock-infinite 1769 . -0.65 % -89.32 % . . -4.73 % 0.48 % . borderlands2 3968 . -0.31 % -51.21 % . . -4.09 % 0.22 % . brutal-legend 338 . -0.03 % -2.95 % . . -0.06 % . . civilization_beyond.. 116 . . -14.17 % . . -0.88 % . . counter_strike_glob.. 1142 . . . . . . . . dirt-showdown 541 . -0.56 % -40.14 % . -3.45 % -1.82 % 0.35 % . dolphin 22 . . . . . 0.16 % . . dota2 1747 . . . . . 0.01 % . . europa_universalis_4 76 . -0.23 % -42.11 % . . -0.96 % . . f1-2015 774 . -0.09 % -28.89 % . . -2.60 % 0.09 % . furmark-0.7.0 4 . . . . . . . . gimark-0.7.0 10 . . . . . . . . glamor 16 . . . . . . . . humus-celshading 4 . . . . . . . . humus-domino 6 . . . . . . . . humus-dynamicbranching 24 . 0.71 % . . . 0.29 % -0.45 % . humus-hdr 10 . . . . . . . . humus-portals 2 . . . . . . . . humus-volumetricfog.. 6 . . . . . . . . left_4_dead_2 1762 . . . . . . . . metro_2033_redux 2670 . -0.10 % -7.15 % . . -0.03 % . . nexuiz 80 . . . . . . . . pixmark-julia-fp32 2 . . . . . . . . pixmark-julia-fp64 2 . . . . . . . . pixmark-piano-0.7.0 2 . . . . . . . . pixmark-volplosion-.. 2 . . . . . . . . plot3d-0.7.0 8 . . . . . . . . portal 474 . . . . . . . . sauerbraten 7 . . . . . 
. . . serious_sam_3_bfe 392 . . -13.20 % . . -1.81 % . . supertuxkart 4 . . . . . . . . talos_principle 324 . -0.21 % -18.39 % . . -2.73 % 0.14 % . team_fortress_2 808 . . . . . . . . tesseract 430 . 0.08 % -68.57 % . . -0.45 % . . tessmark-0.7.0 6 . . . . . . . . thea 172 . . . . . 0.03 % . . ue4_effects_cave 299 . -0.04 % -10.15 % . . -0.25 % 0.04 % . ue4_elemental 586 . -0.02 % -13.93 % . . -0.13 % 0.02 % . ue4_lightroom_inter.. 74 . -0.17 % -70.00 % . . -1.27 % . . ue4_realistic_rende.. 92 . . -32.58 % . . -0.35 % . . unigine_heaven 322 . 0.12 % -54.17 % . . -1.42 % -0.12 % . unigine_sanctuary 264 . . . . . . . . unigine_tropics 210 . . . . . . . . unigine_valley 278 . -0.15 % -40.74 % . . -2.00 % 0.09 % . unity 72 . . . . . 0.03 % . . warsow 176 . . . . . . . . warzone2100 4 . . . . . 0.13 % . . witcher2 1040 . -0.03 % -86.28 % . . -0.28 % 0.01 % . xcom_enemy_within 1236 . -0.24 % -63.54 % . . -0.93 % 0.18 % . yofrankie 82 . -0.61 % -100.00 % . . -0.83 % 0.41 % . ----------------------------------------------------------------------------------------------------------- Total 26002 . -0.13 % -27.03 % . -0.24 % -1.57 % 0.10 % . Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	6596ecf8c5	gallivm: add helper lp_add_attr_dereferenceable Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	bccf9de4df	radeonsi: clean up shader value metadata code No change in behavior. BTW, tbaa_md_kind == 1, which was the magic number in the code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d7d7e6adbe	radeonsi: remove LLVMNoUnwindAttribute uses always set by gallivm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	c4807505c0	radeonsi: fix a typo in SI_PARAM_LINEAR_* handling introduced in `476e9cee1d` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f2f573e777	gallium/radeon: normalize the code style no change in behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	ed3912d0da	radeonsi: just save buffer sizes instead of buffers while recording IBs whole buffer objects are not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Jon Turney	fc8139b146	Add c99_alloca.h include to fix compilation on Cygwin Fix compilation on Cygwin, since `50b22354`, by adding c99_alloca.h include, which should know how to portably make the alloca() prototype available. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 16:11:36 +01:00
Topi Pohjolainen	7d29fee4a8	i965/blorp: Cleanup leftovers from push constant disabling Setup for pixel shader push constants is the same as for other stages. Note that on gen8+ the if-else branches were identical and the generation check for packet size redundant. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:10:03 +03:00
Topi Pohjolainen	26778da571	i965/blorp/gen7+: Bring back push constant setup This is a partial revert of commit `cc2d0e64`. It appears that even though blorp disables a stage, the corresponding 3DSTATE_CONSTANT_XS packet still needs to be programmed. Hardware seems to try to fetch the constants even for disabled stages. Therefore care needs to be taken that the constant buffer is set up properly. Blorp will continue to trash it into a non-existing one, as before. It is possible that this could be omitted on SKL, where the constant buffer is considered when the corresponding binding table settings are changed. Bspec: "The 3DSTATE_CONSTANT_* command is not committed to the shader unit until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_* command is parsed." However, as the CONSTANT_XS packet itself does not seem to stall on its own, it is safer to emit the packets for SKL also. A possible alternative to blorp trashing would have been to set up defaults at the beginning of each batch buffer. However, hardware doesn't seem to tolerate these packets being programmed multiple times per primitive. Bspec for IVB: "It is invalid to execute this command more than once between 3D_PRIMITIVE commands." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96878 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:09:35 +03:00
Nicolai Hähnle	65d48fcf8c	radeonsi: silence Coverity warning Coverity's analysis is too weak to understand that r600_init_flushed_depth(_, _, NULL) only returns true when flushed_depth_texture was assigned a non-NULL value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 09:52:39 +02:00
Samuel Iglesias Gonsálvez	a2bd7334ed	i965/fs: do d2x lowering before simd splitting So that we can have gen7 split large writes produced by this lowering pass. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	376d7ee587	i965/fs: do pack lowering before simd splitting So that we can have gen7 split large writes produced by the pack lowering. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Samuel Iglesias Gonsálvez	9979a3f2ac	i965/fs: do not require force_writemask_all with exec_size 4 So far we only used instructions with this size in situations where we did not operate per-channel and we wanted to ignore the execution mask, but gen7 fp64 will need to emit code with a width of 4 that needs normal execution masking. v2: - Modify the assert instead of deleting it (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	aa4796ae81	i965/fs/gen7: split instructions that run into exec masking bugs In fp64 we can produce code like this: mov(16) vgrf2<2>:UD, vgrf3<2>:UD which our SIMD lowering pass would typically split into instructions with a width of 8, writing to two consecutive registers each. Unfortunately, gen7 hardware has a bug affecting execution masking and as a result, the second GRF register write won't work properly. Curro verified this: "The problem is that pre-Gen8 EUs are hardwired to use the QtrCtrl+1 (where QtrCtrl is the 8-bit quarter of the execution mask signals specified in the instruction control fields) for the second compressed half of any single-precision instruction (for double-precision instructions it's hardwired to use NibCtrl+1, at least on HSW), which means that the EU will apply the wrong execution controls for the second sequential GRF write if the number of channels per GRF is not exactly eight in single-precision mode (or four in double-float mode)." In practice, this means that we cannot write more than one consecutive GRF in a single instruction if the number of channels per GRF is not exactly eight in single-precision mode (or four in double-float mode). This patch makes our SIMD lowering pass split this kind of instruction so that the split versions only write to a single register. In the example above this means that we split the write in 4 instructions, each one writing 4 UD elements (width = 4) to a single register. v2 (Curro): - Make explicit that the thing about hardwiring NibCtrl+1 for the second compressed half is known to happen in Haswell and the issue with IVB might not be exactly the same. - Assign max_width instead of returning early so that we can handle multiple restrictions affecting the same instruction. - Avoid division by 0 if the instruction does not write any registers. - Ignore instructions that have WE_all set. - Use the instruction execution type size instead of the dst type size. 
v3 (Curro): - Move the implementation down so it is not placed in the middle of another workaround. - Declare channels_per_grf as const. - Don't break the loop early if we find a BAD_FILE source. - Fix the number of channels that the hardware shifts for the second half of a compressed instruction to be 8 in single precision and 4 in double precision. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	87a13f598b	i965/fs: use the new helper function to create double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 07:09:41 +02:00
Iago Toral Quiroga	9e196e907e	i965/fs: add a helper function to create double immediates Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2: - Define setup_imm_df() as an independent function (Curro) - Create a specific builder to get rid of some instruction field assignments (Curro). v3: - Get devinfo from builder (Kenneth) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 07:09:41 +02:00
Eric Anholt	93794145dd	vc4: Validate QPU uniform pointer updates.	2016-07-12 17:42:42 -07:00
Eric Anholt	420845acb2	vc4: Add support for NIR loops and break/continue.	2016-07-12 17:42:42 -07:00
Eric Anholt	0adf2ec0ee	vc4: Add support for emitting NIR IF nodes.	2016-07-12 17:42:42 -07:00
Eric Anholt	f505f66cd5	vc4: Add support for storing to NIR registers in a non-SSA fashion. Previously, there were occasionally NIR registers in our programs, but they were always actually used SSA-only. Now that we're trying to support control flow, we need to actually conditionally move to registers based on whether channels are active or not.	2016-07-12 17:42:41 -07:00
Eric Anholt	ab1d40b84a	vc4: Add a flag in the screen to track control flow support. For now it's still always false, but I need it in place for kernel backwards compat support as I extend the backend for control flow.	2016-07-12 17:42:40 -07:00
Eric Anholt	05bcd9dd96	vc4: Define a QIR branch instruction This uses the branch condition code in inst->cond to jump to either successor[0] (condition matches) or successor[1] (condition doesn't match).	2016-07-12 17:42:40 -07:00
Eric Anholt	54800bb71c	vc4: Add kernel support for branching in shader validation. We're already checking that branch instructions are within the contents of the shader and the proper PROG_END sequence is present. The other thing we need in the presence of branching is to verify that the shader doesn't overflow past the end of the uniforms stream. To do that, we require that any basic block reading uniforms start with the following instructions: load_imm temp, <offset within uniform stream>; add unif_addr, temp, unif. The instructions are generated by userspace, and the kernel verifies that the load_imm is of the expected offset, and that the add adds it to a uniform. We track which uniform in the stream that is, and at draw call time fix up the uniform stream to have the address of the start of the shader's uniforms for that draw call. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-07-12 17:42:39 -07:00
Eric Anholt	e2d7760df5	vc4: Add a bitmap of branch targets in kernel validation. This isn't used yet, it's just a first step toward loop validation. During the main parsing of instructions, we need to know when we hit a new basic block so that we can reset validated state.	2016-07-12 17:42:38 -07:00
Eric Anholt	24095c8b3b	vc4: Track the current instruction into the validation_state. This reduces how much we need to pass around as arguments, which was becoming more of a problem with looping validation.	2016-07-12 17:42:38 -07:00
Eric Anholt	c73aa0a09b	vc4: Add QPU support for generating BRANCH instructions.	2016-07-12 17:42:38 -07:00
Eric Anholt	6d34345001	vc4: Print live variable start/ends during QIR dumping. This only happens when live variables are set up, which is not in the normal dump, but is set up when we've failed to register allocate.	2016-07-12 17:42:37 -07:00
Eric Anholt	89918c1e74	vc4: Implement live intervals using a CFG. Right now our CFG is always a trivial single basic block, but that will change when we enable loops.	2016-07-12 17:41:59 -07:00
Eric Anholt	f2eb8e3052	vc4: Make vc4_qir_schedule handle each block in the program. Basically we just treat each block independently. The only inter-block scheduling I can think of that would be interesting would be to move texture result collection to after a short loop/if block that doesn't do texturing. However, the kernel disallows that as part of its security validation.	2016-07-12 15:47:26 -07:00
Eric Anholt	46ec025ba9	vc4: Convert uniforms lowering to work with multiple blocks. We still decide which uniform to lower based on how many instructions-that-need-lowering use that uniform, but now we emit a new temporary uniform load in each of the basic blocks containing an instruction being lowered. This commit is best reviewed with diff -b.	2016-07-12 15:47:26 -07:00
Eric Anholt	0c923e6c33	vc4: Convert vc4_opt_peephole_sf to work with control flow. We need to apply the peephole pass to each of the blocks in the program. We don't do dataflow analysis for SF across blocks, but we also don't generate code that would need us to do so.	2016-07-12 15:47:26 -07:00
Eric Anholt	6c1f834a23	vc4: Create a basic block structure and move the instructions into it. The optimization passes and scheduling aren't actually ready for multiple blocks with control flow yet (as seen by the "cur_block" references in them instead of iterating over blocks), but this creates the structures necessary for converting them.	2016-07-12 15:47:26 -07:00
Eric Anholt	d3cdbf6fd8	vc4: Add a "qir_for_each_inst_inorder" macro and use it in many places. We have the prior list_foreach() all over the code, but I need to move where instructions live as part of adding support for control flow. Start by just converting to a helper iterator macro. (The simpler "qir_for_each_inst()" will be used for the for-each-inst-in-a-block iterator macro later)	2016-07-12 15:47:25 -07:00
Eric Anholt	6858f05924	vc4: Also enable phi elimination. This avoids a bunch of code gen regressions when enabling loops in vc4. Prior to that, the GLSL that would have generated these optimizable phi nodes was being lowered to csels between either (undef, a) or (a, a), and those were being dealt with by nir_opt_undef and nir_opt_algebraic.	2016-07-12 15:47:25 -07:00
Eric Engestrom	e8959ba7af	vc4: fix memory leak The allocation has succeeded by that point, so it needs to be freed. CovID: 1358929 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-12 15:47:12 -07:00
Eric Anholt	c65a00eaff	vc4: Close our screen's fd on screen close. We're passed in a freshly dup()ed fd on screen create, so we should close it on exit. Debugged by Hugh Cole-Baker.	2016-07-12 15:46:09 -07:00
Eric Anholt	c93f6938d5	nir: Add optimization for (a || True == True) This was appearing in vc4 VS/CS in mupen64, due to vertex attrib lowering producing some constants that were getting compared. total instructions in shared programs: 112276 -> 112198 (-0.07%) instructions in affected programs: 2239 -> 2161 (-3.48%) total estimated cycles in shared programs: 283102 -> 283038 (-0.02%) estimated cycles in affected programs: 2365 -> 2301 (-2.71%) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-12 15:46:09 -07:00
Tim Rowley	be126c8a2a	swr: [rasterizer core] correct MSAA behavior for conservative rasterization Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:55 -05:00
Tim Rowley	c6ca126591	swr: [rasterizer core] conservative rast backend changes Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:49 -05:00
Tim Rowley	b6dbb95dc9	swr: [rasterizer] buckets cleanup Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:44 -05:00
Tim Rowley	eb6b2b340e	swr: [rasterizer core] make all api functions call GetContext Small api cleanup. Make all api functions call GetContext instead of locally casting handle. Makes debugging easier by providing a single point to track context changes. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:10:36 -05:00
Tim Rowley	f810907669	swr: [rasterizer] add support for llvm-3.9 v2: use signed compare, remove unneeded vmask Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 11:09:49 -05:00
Tim Rowley	ae4f2c849a	swr: [rasterizer jitter] fix llvm-3.7 compile d3d97f8 broke llvm-3.7, which has a mismatched API for setDataLayout/getDataLayout. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-07-12 10:42:57 -05:00
Brian Paul	d46489ddea	docs: remove duplicated line in 12.0.1 release notes file Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-12 09:42:42 -06:00
Leo Liu	55f0b97b40	st/omx/dec: convert decoder video buffer to progressive with encode tunneling The idea of encode tunneling is to use the video buffer directly for the encoder, but the encoder currently doesn’t support interlaced surfaces, so the OMX decoder previously set a progressive surface for that purpose. Now that we poll the driver for interlacing information for the decoder, we get interlaced as the preferred format, as with the other APIs (VDPAU, VA-API), which breaks transcoding with tunneling. The solution is, when a tunnel is detected, to re-allocate progressive target buffers and then convert the interlaced decoder results into them. This has been tested: the transcode results match bit for bit those obtained before with progressive-to-progressive surfaces. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	82f875f4d8	vl/compositor: set layer of y or uv to render Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	14761da9f9	vl/compositor: add weave to yuv shader This shader converts interlaced YUV to progressive YUV. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Leo Liu	2e18c2c6f8	vl/compositor: move weave shader out from rgb weaving We'll use the weave shader in a later patch. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-07-12 09:27:53 -04:00
Marek Olšák	ead7736821	glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast This bug is uncovered by glsl/lower_if_to_cond_assign. I don't know if it can be reproduced in any other way. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-12 11:58:53 +02:00
Francisco Jerez	e300696304	clover/api: Implement clLinkProgram per-device binary presence validation rule. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Serge Martin	f29ed2da24	clover: Add clLinkProgram (CL 1.2). [ Francisco Jerez: Use validate_build_common for error checking, simplify control flow slightly and handle additional exception types. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	c478db6c0a	clover: Trivial cleanups for api/program.cpp. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	9c7cda2792	clover/core: Remove compiler.hpp. header_map was the only definition left in compiler.hpp, move it into program.hpp which is its only user in clover/core. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	c2e37fe1f9	clover/llvm: Get rid of compile_program_llvm(). Superseded by compile_program() and link_program(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	010918f5aa	clover: Provide separate program methods for compilation and linking. [ Serge Martin: Fix inverted opts and log build ctor args. Keep the log related to the build. Fix indentation ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:35 -07:00
Francisco Jerez	1942490bae	clover: Unify program::build_* into a single method returning a struct. This gets rid of the program::build_* query methods and replaces them with the program::build() method that returns a single data structure containing all parameters for the last build done on the given target device (including build logs, options and the binary itself). [ Serge Martin: Fix inverted opts and log build ctor args ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Serge Martin	7f6a4a4342	clover: Change program::build opts argument to std::string. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	2a73ae662c	clover: Define error subclass to signal build option parse failure. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	4ef1c0918d	clover: Move back to using build_error to signal compilation failure. This partially reverts `7e0180d57d`. Having two different exception subclasses for compilation and linking makes it more difficult to share or move code between the two codepaths, because the exact same function under the same error condition would need to throw one exception or the other depending on what top-level API is being implemented with it. There is little benefit anyway because clCompileProgram() and clLinkProgram() can tell whether they are linking or compiling a program. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Serge Martin	70fe6267a3	clover: Override ret_object. Return an API object from an intrusive reference to a Clover object, incrementing the reference count of the object. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	85309e8b55	clover/tgsi: Add stub link_program() function. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	ba613636e8	clover/tgsi: Move compiler entry point declaration into tgsi directory and namespace. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	fb3eeb1314	clover/llvm: Implement the -create-library linker option. [ Serge Martin: disable internalize pass when building a library. Otherwise some functions may be inlined and removed ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	9de3f4a59f	clover/llvm: Implement linkage of multiple clover modules. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	132b6ccd4f	clover/llvm: Split compilation and linking. Split the work previously done by compile_program_llvm() into compile_program() (which simply runs the front-end and serializes the resulting LLVM IR) and link_program() (which takes care of everything else down to binary codegen). [ Serge Martin: allow LLVM IR dump after compilation ] Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	1a7d11aa3d	clover/llvm: Implement library bitcode codegen. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	86100e13ab	clover/llvm: Trivial assorted cleanups for invocation.cpp. Drop a few include and using directives which are no longer necessary. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	520cc26859	clover/llvm: Split native codegen into separate file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:34 -07:00
Francisco Jerez	8195637363	clover/llvm: Split bitcode codegen into separate file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	71ac9820d6	clover/llvm: Split shared codegen support code into separate file. This is the common part of the code used to generate a clover::module from LLVM bitcode, shared between the native and LLVM paths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	26fa9bfd0d	clover/llvm: Define function for bitcode print-out. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	f0721020ad	clover/llvm: Split native codegen and assembly print-out into separate functions. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	1d042adc0a	clover/llvm: Clean up bitcode codegen. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	952d1e6fd6	clover/llvm: Use metadata introspection utils for kernel enumeration. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	d37d5842c1	clover/llvm: Use metadata introspection utils for kernel argument set-up. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:33 -07:00
Francisco Jerez	3ed31bbf05	clover/llvm: Add simplified utility functions for metadata introspection. v2: Fix for latest LLVM from SVN. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> (v1) Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:34:30 -07:00
Francisco Jerez	7da2c1ff0f	clover/llvm: Clean up codestyle of get_kernel_args(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:59 -07:00
Francisco Jerez	0601fe7438	clover/llvm: Fold compile_native() call into build_module_native(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:56 -07:00
Francisco Jerez	f98422eafd	clover/llvm: Factor out duplicated construction of clover::module. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:53 -07:00
Francisco Jerez	3ce6ab068c	clover/llvm: Clean up compile_native(). This switches compile_native() to the C++ API (which the rest of this file makes use of anyway so there is little benefit from using the C API), which should get rid of some boilerplate and fix a leak of the TargetMachine object in the error path. v2: Additional fixes for LLVM 3.6. v3: Update for the latest LLVM SVN changes. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:50 -07:00
Francisco Jerez	7bcefa5903	clover/llvm: Clean up ELF parsing. This function was doing three separate things: - Initializing and releasing the ELF parsing state (the latter can be better done using RAII). - Searching for the symbol table in the ELF file. - Extraction of kernel symbol offsets from the symbol table. Split each one into a separate function for clarity and clean up the result slightly. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:48 -07:00
Francisco Jerez	574477e599	clover/llvm: Move a bunch of utility functions into separate file. Some of these will be useful from a different compilation unit in the same subtree so put them in a publicly accessible header file. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:43 -07:00
Francisco Jerez	92247cef3f	clover/llvm: Tidy debug handling. Most significant change is debugging flags are now a scoped enum and all debugging helpers live in the debug namespace. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:40 -07:00
Francisco Jerez	4614397ac2	clover/llvm: Use helper function to abort compilation with error message. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:37 -07:00
Francisco Jerez	423eecb76a	clover/llvm: Simplify diagnostic_handler(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:29 -07:00
Francisco Jerez	5884dfbc2a	clover/llvm: Trivial codestyle clean-up for optimize(). Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:22:21 -07:00
Francisco Jerez	bdc27f13d5	clover/llvm: Clean up compilation into LLVM IR. Some assorted and mostly trivial clean-ups for the source to bitcode compilation path. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:50 -07:00
Francisco Jerez	714b167f57	clover/llvm: Factor out LLVM context init. So it can be shared between the compilation and linking codepaths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:30 -07:00
Francisco Jerez	fa94055d53	clover/llvm: Declare compiler instance at top level and pass down as argument. This allows simplifying the interface of compile_llvm() because it no longer needs to read out and return the optimization level and address space map from the compiler instance. Instead declare the compiler instance at the top level so that both properties are available directly. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:13 -07:00
Francisco Jerez	a27d4ec3b9	clover/llvm: Refactor compiler instance initialization. This will be shared between the compiler and linker codepaths. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:21:08 -07:00
Francisco Jerez	c2a167ad73	clover/llvm: Factor out compiler option tokenization. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:47 -07:00
Francisco Jerez	c513cfa747	clover/llvm: Factor out target string parsing. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:41 -07:00
Francisco Jerez	251054220e	clover/llvm: Collect #ifdef mess into a separate file. This gets rid of most ifdef's from the invocation.cpp code -- Only a couple of them are left which will be removed differently in the following commits. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:12 -07:00
Francisco Jerez	11afde89b8	clover/llvm: Drop dead code. This ifdef'ed out code was meant to handle compilation into TGSI, but it doesn't seem likely that it will ever be useful even if the TGSI back-end is resurrected because the TGSI bitcode can just be plumbed through in ELF format and dealt with as a regular "native" back-end. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:20:05 -07:00
Francisco Jerez	600ac51448	clover/llvm: Drop support for LLVM < 3.6. Reviewed-by: Serge Martin <edb+mesa@sigluy.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:19:49 -07:00
Serge Martin	8624888d6f	clover: Bump required LLVM version to 3.6. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-07-11 20:19:14 -07:00
Ilia Mirkin	da7223ebdc	mesa: set _NEW_BUFFERS when updating texture bound to current buffers When a glTexImage call updates the parameters of a currently bound framebuffer, we might miss out on revalidating whether it is complete. Make sure to set _NEW_BUFFERS which will trigger the revalidation in that case. Also while we're at it, fix the fb parameter passed in to the eventual RenderTexture call. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94148 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>	2016-07-11 21:18:05 -04:00
Ilia Mirkin	8b7607d28a	meta/texsubimage: tex_image is always non-null, avoid confusing code Probably a copy-paste from mesa_meta_pbo_GetTexSubImage where tex_image may apparently be null. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-11 21:18:05 -04:00
Ilia Mirkin	00d4315d37	st/mesa: return appropriate mesa format for ETC texture formats Even when the backend driver does not support ETC formats, we handle the decoding into an uncompressed backing texture. However as far as core mesa is concerned, it's an ETC texture and we should return the relevant ETC mesa format. This condition can get hit when using glTexStorage to create the texture object. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-11 21:17:30 -04:00
Ilia Mirkin	8ee3cdde04	mesa: etc2 online compression is unsupported, don't attempt it Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-11 21:17:01 -04:00
Ben Skeggs	0d911a720d	nvc0: initial support for GP100 GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2016-07-12 10:56:35 +10:00
Samuel Pitoiset	9bc083284f	nvc0: use a define for the driver constant buffer size This might avoid mistakes if the size is bumped in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-11 22:30:41 +02:00
Samuel Pitoiset	31a615677b	nvc0: fix the driver cb size when draw parameters are used The size of the driver constant buffer for each stage should be 2048 and not 512 because it has been increased recently for buffers/images. While we are at it, do the same change for indirect draws. This fixes all ARB_shader_draw_parameters tests on GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-11 22:11:27 +02:00
Samuel Pitoiset	19d0450b27	nvc0/ir: fix images indirect access on Fermi This fixes the following piglits: arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-07-11 21:01:21 +02:00
Marek Olšák	33c8723980	st/mesa: remove st_dump_program_for_shader_db replaced by MESA_SHADER_CAPTURE_PATH in core Mesa Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-11 19:06:05 +02:00
Marek Olšák	d7b6f90684	gallivm: set LLVMNoUnwindAttribute on all intrinsics RadeonSI stats: Mostly 0% difference, but Valley shows a small improvement: Application Files SGPRs VGPRs SpillSGPR SpillVGPR Code Size LDS Max Waves Waits unigine_valley 278 0.00 % -0.29 % 0.00 % 0.00 % 0.01 % 0.00 % 0.17 % 0.00 % Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-07-11 19:06:05 +02:00
Francesco Ansanelli	3c44629142	i965: fix ignored qualifiers warning Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-11 05:50:22 -07:00
Nicolai Hähnle	374aa2bb27	gallium/u_queue: assert that users must wait on fences before destroying them Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:04:44 +02:00
Nicolai Hähnle	a0a616720a	gallium/u_queue: guard fence->signalled checks with fence->mutex I have seen a hang during application shutdown that could be explained by the following race condition which this patch fixes: 1. Worker thread enters util_queue_fence_signal, sets fence->signalled = true. 2. Main thread calls util_queue_job_wait, which returns immediately. 3. Main thread deletes the job and fence structures, leaving garbage behind. 4. Worker thread calls pipe_condvar_broadcast, which gets stuck forever because it is accessing garbage. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-11 11:03:59 +02:00
Chad Versace	5c17fb2cd6	anv/dump: Fix post-blit memory barrier Swap srcAccessMask and dstAccessMask. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-09 20:58:33 -07:00
Chad Versace	bc33c9b455	anv/dump: Fix vkCmdPipelineBarrier flags 'true' is not valid for VkDependencyFlags. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	ac7eeebce4	anv/dump: Add support for dumping framebuffers Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	fad0b7b0b3	anv/dump: Add a barrier for the source image Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	6ad183bf89	anv/dump: Refactor the guts into helpers Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	adbed7ae7a	anv/dump: Use anv_minify instead of hand-rolling it Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Jason Ekstrand	a26cda5ca5	anv/dump: Take an aspect in dump_image_to_ppm Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-09 20:58:33 -07:00
Nicolai Hähnle	b479c47a9c	radeonsi: fix bad assertion in si_emit_sample_mask The blitter sets mask == 1, which is fine since it doesn't use smoothing. Fixes a regression introduced in commit `5bcfbf91`. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-07-09 19:46:54 +02:00
Matt Turner	6624174c0a	glx: Fix for commit `2c86668694`. Ian suggested these changes in his review and I made them, but I pushed the old version of the patch.	2016-07-08 16:46:17 -07:00
Emil Velikov	83a782cd5e	docs: add news item and link release notes for 12.0.0/12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-09 00:09:51 +01:00
Emil Velikov	386ceb4c61	docs: add sha256 checksums for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `edfc17a19a`)	2016-07-09 00:03:21 +01:00
Emil Velikov	c7c0adc7e6	docs: add release notes for 12.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `04277f058d`)	2016-07-09 00:03:16 +01:00
Emil Velikov	286a71b01f	docs: add sha256 checksums for 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3a146a789c`)	2016-07-09 00:03:10 +01:00
Emil Velikov	4644908a9f	docs: Update 12.0.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `8b06176f31`)	2016-07-09 00:03:04 +01:00
Matt Turner	2c86668694	glx: Undo memory allocation checking damage. This partially reverts commit `d41f5396f3`. That untested commit broke the tex-skipped-unit piglit test and the arbvparray Mesa demo when run with indirect GLX. state->array_state is used during initialization, so its assignment cannot be moved to the end of the function. The backtrace looked like: Program received signal SIGSEGV, Segmentation fault. 0x00007ffff77c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at indirect_vertex_array.c:1952 1952 return state->array_state->active_texture_unit; (gdb) bt 0 0x00007ffff77c7a5c in __glXGetActiveTextureUnit (state=0x6270e0) at indirect_vertex_array.c:1952 1 0x00007ffff77cbf62 in get_client_data (gc=0x626f50, cap=34018, data=0x7fffffffd7a0) at single2.c:159 2 0x00007ffff77cce51 in __indirect_glGetIntegerv (val=34018, i=0x7fffffffd830) at single2.c:498 3 0x00007ffff77c4340 in __glXInitVertexArrayState (gc=0x626f50) at indirect_vertex_array.c:193 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-07-08 14:03:19 -07:00
Colin McDonald	b36644bae6	glx: Fix indirect multi-texture GL_DOUBLE coordinate arrays. There is no draw arrays protocol support for multi-texture coordinate arrays, so it is implemented by sending batches of immediate mode commands from emit_element_none in indirect_vertex_array.c. This sends the target texture unit (which has been previously setup in the array_state header field), followed by the texture coordinates. But for GL_DOUBLE coordinates the texture unit must be sent after the texture coordinates. This is documented in the glx protocol description, and can also be seen in the indirect.c immediate mode commands generated from gl_API.xml. Sending the target texture unit in the wrong place can crash the remote X server. To fix this required some more extensive changes to indirect_vertex_array.c and indirect_vertex_array_priv.h, in order to remove the texture unit value out of the array_state "header" field, and send it separately. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:03:16 -07:00
Colin McDonald	5ced100bf5	glx: Correct opcode typos in __indirect_glTexCoordPointer. At the same time, replace opcode numbers with names in __indirect_glVertexAttribPointer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:03:09 -07:00
Colin McDonald	d57c85c1bf	glx: Call __glXInitVertexArrayState() with a usable gc. For each indirect context the indirect vertex array state must be initialised by __glXInitVertexArrayState in indirect_vertex_array.c. As noted in the routine header it requires that the glx context has been set up prior to the call, in order to test the server version and extensions. Currently __glXInitVertexArrayState is called from indirect_bind_context in indirect_glx.c, as follows: state = gc->client_state_private; if (state->array_state == NULL) { glGetString(GL_EXTENSIONS); glGetString(GL_VERSION); __glXInitVertexArrayState(gc); } But, the gc context is not yet usable at this stage, so the server queries fail, and __glXInitVertexArrayState is called without the server version and extension information it needs. This breaks multi-texturing as __glXInitVertexArrayState doesn't get GL_MAX_TEXTURE_UNITS. It probably also breaks setup of other arrays: fog, secondary colour, vertex attributes. To fix this I have moved the call to __glXInitVertexArrayState to the end of MakeContextCurrent in glxcurrent.c, where the glx context is usable. Fixes a regression caused by commit `4fbdde889c`. Fixes ARB_vertex_program usage in the arbvparray Mesa demo when run with indirect GLX and also the tex-skipped-unit piglit test when run with indirect GLX. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61907	2016-07-08 14:02:56 -07:00
Christian König	64ac4aef27	radeon/uvd: simplify sending context buffer message Just send it whenever it is allocated. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:32 +02:00
Christian König	6b474e06a2	radeon/uvd: fix context buffer destruction in the error path Destroying an unallocated buffer is harmless. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:32 +02:00
Christian König	36df04dac4	radeon/uvd: move polaris fw check into radeon_video.c v2 It's actually not very clever to claim to support H.264 and then fail to create a decoder. v2: prefix FW macro with UVD_. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:31 +02:00
Christian König	5290bf43c8	radeon/video: fix coding style in radeon_video.c v2 v2: fix other tabs as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-07-08 21:03:31 +02:00
Brian Paul	74163475b0	svga: simplify/fix 1D/2D array resource copies Fixes one of the piglit arb_copy_image-targets tests for 1D arrays. Previously, we were applying the 1D array z/face adjustment twice. Also simplify the copy_region_vgpu10() function. It never has to copy multiple array layers/slices. The Mesa code for glCopyImageSubData does the loop over slices/faces. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	0e23f370c9	mesa: print number of samples in renderbuffer_storage error msg Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-08 12:53:21 -06:00
Brian Paul	fb26317604	svga: remove unused variable Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	689293ad52	svga: add dumping for more device commands Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:21 -06:00
Brian Paul	599c333d07	svga: silence a couple unused variable warnings Signed-off-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:20 -06:00
Charmaine Lee	c3c7ff014b	svga: rebind using render target surfaces in hw draw state Currently when we rebind framebuffer resources at the beginning of the command buffer, we use the color buffer surfaces saved in the context hw clear state. But those surfaces can differ from the render target surfaces actually emitted: if any of the color buffer surfaces is also used as a shader resource, we create a backed surface for the colliding render target surface. So to rebind the framebuffer resources correctly, use the render target surfaces saved in the context hw draw state. Tested with Heaven, Lightsmark2008, MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-08 12:53:20 -06:00
Charmaine Lee	da98cee067	svga: invalidate gb surface before it is reused With this patch, a guest-backed surface will be invalidated using the SVGA_3D_CMD_INVALIDATE_GB_SURFACE command before the surface is reused. This fixes the updating dirty image error from the device when a surface is reused. v2: Instead of invalidating the surface when it is reused, send the invalidate command before the surface is put into the recycle pool. v3: (1) surface invalidate is a noop operation in Linux winsys, since surface invalidation is not needed for DMA path. (2) Instead of invalidating the surface content in svga_screen_surface_destroy() when a surface is to be destroyed, it is done in svga_screen_cache_flush() when the surface is no longer referenced in a command buffer and is ready to be moved to the unused list. At this point, the surface will be moved to the invalidate list. When the surface invalidation is submitted, the surface will be moved to the unused list. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-07-08 12:53:20 -06:00
Brian Paul	ca531aeeb1	svga: fix use of provoking vertex control If the SVGA3D_DEVCAP_DX_PROVOKING_VERTEX query returns false, never define rasterizer state objects with provokingVertexLast set. Despite what the device reports, it may interpret the provokingVertexLast flag anyway. This fixes an issue when using capability clamping. Tested with piglit provoking-vertex and glsl-fs-flat-color tests. VMware bug 1550143. Reviewed-by: <charmainel@vmware.com>	2016-07-08 12:53:20 -06:00
Nayan Deshmukh	af18a04755	vl: add half pixel to v_tex before adding offsets Since pixel center lies at 0.5, add half_pixel to vtex before adding offsets to it. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-08 20:51:12 +02:00
Samuel Pitoiset	a0bf1768c7	nvc0/ir: remove unused resource info loading helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:23 +02:00
Samuel Pitoiset	ed3a284382	nvc0/ir: refactor the surfaces info loading logic Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:21 +02:00
Samuel Pitoiset	9cdbe80745	nvc0/ir: move the shift left op inside loadTexHandle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 19:12:06 +02:00
Nicolai Hähnle	04d93ea619	radeonsi: disable multi-threading when shader dumps are enabled Otherwise, shader dumps can become interleaved and unusable. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:36 +02:00
Nicolai Hähnle	7ffc832ab8	radeonsi: use multi-threaded compilation in debug contexts We only have to stay single-threaded when debug output must be synchronous. This yields better parallelism in shader-db runs for me. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:32 +02:00
Nicolai Hähnle	084ca0d8e5	st/mesa: set debug callback async flag Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:59:29 +02:00
Nicolai Hähnle	2909e292fc	gallium: add async flag to pipe_debug_callback v2: fix typo db -> cb Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:58:52 +02:00
Nicolai Hähnle	5bcfbf91e5	radeonsi: catch a potential state tracker error with non-MSAA FBs At least st/mesa ensures this, so I'd rather not handle deviations in radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:53:05 +02:00
Nicolai Hähnle	d938b8c0bf	radeonsi: explicitly choose center locations for 1xAA on Polaris Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't. As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:52:50 +02:00
Nicolai Hähnle	7d2ce5258f	r600g: call cayman_emit_msaa_sample_locs only when needed In the case of nr_samples <= 1, that function is (currently) a no-op anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:52:45 +02:00
Kenneth Graunke	b3c5df3ca4	mesa: Mark R*32F formats as filterable when an extension is present. GL_OES_texture_float_linear marks R32F, RG32F, RGB32F, and RGBA32F as texture filterable. Fixes glGenerateMipmap GL errors when visiting a WebGL demo in Chromium: http://www.iamnop.com/particles Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-08 01:26:23 -07:00
Eric Engestrom	b7be23b6e1	i965/blorp: fix indentation level Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-08 11:07:36 +03:00
Francisco Jerez	37b901003b	i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush. This hardware race condition has caused problems several times already (see "i965: Fix cache pollution race during L3 partitioning set-up.", "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and "i965: intel_texture_barrier reimplemented"). The problem is that whenever we attempt to both flush and invalidate multiple caches with a single pipe control command the flush and invalidation happen in reverse order, so the contents flushed from the R/W caches aren't guaranteed to become visible from the invalidated caches after the PIPE_CONTROL command completes execution if some concurrent rendering workload happened to pollute any of the invalidated R/O caches in the short window of time between the invalidation and flush. This makes sure that brw_emit_pipe_control_flush() has the effect expected by most callers of making the contents flushed from any R/W caches visible from the invalidated R/O caches. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-07 14:16:39 -07:00
Francisco Jerez	0bd3a121c6	i965: Make room in the batch epilogue for three more pipe controls. Review carefully, it sucks to have to keep track of the number of command packet dwords emitted in the batch epilogue manually. The MI_REPORT_PERF_COUNT_BATCH_DWORDS calculation was obviously wrong. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-07 14:16:39 -07:00
Francisco Jerez	a10879f48c	i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush. There were two places in the driver doing a pipe control VF cache flush, and one of them was missing this workaround. Move it down into brw_emit_pipe_control_flush to make sure we don't miss it again. Cc: "12.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-07 14:16:39 -07:00
Francisco Jerez	04f74d6629	i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush. Shouldn't cause any functional changes at this point, but we have forgotten to apply this workaround several times in the past, so make sure it doesn't happen again. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-07-07 14:16:38 -07:00
Frank Binns	8fd5779da4	egl: restrict swap_available dri2_egl_display field to X11 This field is only ever set and read by the X11 platform. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 13:28:50 -07:00
Guillaume Charifi	9fea9d6f8e	egl: Fix the bad surface attributes combination checking for pbuffers. (v3) Fixes a regression induced by commit `a0674ce5c4`: When EGL_TEXTURE_FORMAT and EGL_TEXTURE_TARGET were both specified (and both != EGL_NO_TEXTURE), an error was instantly triggered, before the other one had even a chance to be checked, which is obviously not the intended behaviour. v2: Full commit hash, remove useless variables. v3: [chadv] Add Fixes footers. Fixes: piglit "spec/egl 1.4/eglcreatepbuffersurface and then glclear" Fixes: piglit "spec/egl 1.4/largest possible eglcreatepbuffersurface and then glclear" Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:28:55 -07:00
Eric Engestrom	7adb9b0948	egl/display: remove unnecessary code and make it easier to read Remove the two first level `if` as they will always be true, and flatten the two remaining `if`. No functional change. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:13:13 -07:00
Gurchetan Singh	2e6d35809b	mesa: Make single-buffered GLES representation internally consistent There are a few places in the code where clearing and reading are done on incorrect buffers for GLES contexts. See comments for details. This fixes 75 GLES3 dEQP tests on the surfaceless platform with no regressions. v2: Corrected unclear comment v3: Make the change in context.c instead of get.c v4: Removed whitespace Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-07 11:02:35 -07:00
Emil Velikov	f35f8464ec	bugzilla_mesa.sh: Drop "Bug " from sed command After a recent Bugzilla update the word is no longer in the title. Thus the script ended up producing bogus HTML. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-07 15:58:46 +01:00
Akihiko Odaki	42968424fb	mesa: don't install GLX files if GLX is not built Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Akihiko Odaki <akihiko.odaki.4i@stu.hosei.ac.jp> [Emil Velikov: Drop guards around dri_interface.h, add stable tag] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-07 15:58:11 +01:00
Timothy Arceri	7a9d6abcae	nir: add glsl_dvec_type() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-06 23:20:23 -07:00
Mathias Fröhlich	13affe0d3f	osmesa: Export OSMesaCreateContextAttribs. Since the function is exported like any other public api function and put in the header as if you could link against it, export it also from shared objects. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-07 06:19:13 +02:00
Timothy Arceri	7ed5bca21d	i965: consolidate generation check Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-07 12:29:21 +10:00
Timothy Arceri	e0dc3109d5	i965: don't copy VS attribute work arounds for HSW+ These workarounds are not required for HSW and above so stop copying them at VS key generation which is called at draw time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 12:29:12 +10:00
Timothy Arceri	27e28197e8	i965: add double packing support to tess stages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	8b80e9c31d	i965: add double packing support to gs inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	20e935e6f6	nir: add glsl_double_type() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	9d9b0b54cd	i965: add indirect packing support to gs load inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	2477e6cfad	i965: add indirect packing support for tcs and tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	2bda4b062f	i965: add component packing support for tcs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	cfff71a47a	i965: add component packing support for tes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	a102ef2d4f	i965: add component packing support for gs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	448adfbc67	nir: use the same driver location for packed varyings Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Timothy Arceri	0eea6b3297	nir: add new intrinsic field for storing component offset This offset is used for packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-07 10:26:43 +10:00
Eric Engestrom	771f6db76f	i965/docs: update Intel Linux Graphics URLs Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-06 13:18:23 -07:00
Chad Versace	8910de39c7	anv: gitignore anv_timestamp.h	2016-07-06 13:13:18 -07:00
Tom Stellard	513fccdfb6	radeon/llvm: Use alloca instructions for larger arrays We were storing arrays in vectors, which was leading to some really bad spill code for large arrays. alloca instructions are a better fit for arrays and LLVM optimizations are more geared toward dealing with allocas instead of vectors. For arrays that have 16 or fewer 32-bit elements, we will continue to use vectors, because this will force LLVM to store them in registers and use indirect registers, which is usually faster for small arrays. In the future we should use allocas for all arrays and teach LLVM how to store allocas in registers. This fixes the piglit test: spec/glsl-1.50/execution/geometry/max-input-component Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Tom Stellard	02873a7b0c	radeon/llvm: Add helpers for loading and storing data from arrays. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Tom Stellard	2dc48984b2	radeon/llvm: Remove uses_temp_indirect_addressing() function bld->indirect_files is never set, so this function always returns false. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 19:47:38 +00:00
Emil Velikov	9618e2a24c	anv: vulkan: remove the anv_device.$(OBJEXT) rule Atm the actual rule will expand to foo.o which is used for static libraries only. Thus the automake manual recommendation [to use OBJEXT] won't help us, since we're working with a shared library. Thus let's 'demote' the file and add it back to BUILT_SOURCES. This will manage all the complexity for us, at the (existing expense) of working only with the all, check and install targets. The crazy (why the issue was hard to spot): If the dependencies (.deps/*.Plo) are already created one can alter the anv_device.$(OBJEXT) line and/or nuke it altogether. That won't lead to any warnings/issues, even though the Makefile is regenerated. Moral of the story: Always rm -rf top_builddir or don't resolve the dependencies manually and use BUILT_SOURCES. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96825 Fixes: d7a604c3f7a ("anv: use cache uuid based on the build timestamp.") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-07-06 10:19:19 -07:00
Rob Clark	64d35f817a	vbo: fix attr reset In `bc4e0c4` (vbo: Use a bitmask to track the active arrays in vbo_exec*.) we stopped looping over all the attributes and resetting all slots. Which exposed an issue in vbo_exec_bind_arrays() for handling GENERIC0 vs. POS. Split out a helper which can reset a particular slot, so that vbo_exec_bind_arrays() can re-use it to reset POS. This fixes an issue with 0ad (and possibly others). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-07-06 10:17:30 -04:00
Rob Clark	23dd9eaa94	list: fix list_replace() for empty lists Before, it would happily copy list_head next/prev (ie. pointer to the from list_head), leaving things in a confused state and causing much mayhem. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-06 10:17:30 -04:00
Rob Clark	09fe35b450	gallium: un-inline pipe_surface_desc Want to re-use this struct, so un-inline it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:17:30 -04:00
Rob Clark	def044376a	gallium/util: make util_copy_framebuffer_state(src=NULL) work Be more consistent with the other u_inlines util_copy_xyz_state() helpers and support NULL src. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:17:30 -04:00
Nicolai Hähnle	660cd3de4a	winsys/amdgpu: avoid flushed depth when possible If a depth/stencil texture has no mipmaps, we can always get a layout that is compatible with DB and TC. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	7000dfd5c3	gallium/radeon: add depth/stencil_adjusted output to surface computation This fixes a rare bug with stencil texturing -- seen on Polaris and Tonga, though it's basically a function of the memory configuration so could affect other parts as well. Fixes piglit "unaligned-blit * stencil downsample" and various "fbo-depth-array stencil" tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	68fe270e71	gallium/radeon: allocate only the required plane for flushed depth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:52 +02:00
Nicolai Hähnle	1a0a8efcce	radeonsi: decompress to flushed depth texture when required v2: s/dirty_level_mask/stencil_dirty_level_mask/ in stencil case Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	4b7961da77	radeonsi: extract DB->CB copy logic into its own function Also clean up some of the looping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	18cc825fb9	radeonsi: sample from flushed depth texture when required Note that this has no effect yet. A case where can_sample_z/s can be false in radeonsi will be added in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	f2eb34f82f	gallium/radeon: replace is_flushing_texture with db_compatible This is a left-over of when I considered generalizing the separate stencil support. I do prefer the new name since it emphasizes what flushing vs. non-flushing means from a functional point-of-view, namely special handling of the texture format. v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:48 +02:00
Nicolai Hähnle	dd65126153	gallium/radeon: add can_sample_z/s flags for textures v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:43 +02:00
Nicolai Hähnle	065eeb79f7	radeonsi: correctly mark levels of 3D textures as fully decompressed Account for the fact that max_layer is minified for higher levels. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:49 +02:00
Nicolai Hähnle	19f8d2a843	gallium/radeon/winsyses: remove unused stencil_offset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:49 +02:00
Nicolai Hähnle	3a1da559c5	gallium/radeon: remove redundant null-pointer check v2: keep using r600_texture_reference Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Nicolai Hähnle	5b87eef031	gallium/radeon: print StencilLayout only once It is the same for all levels. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Nicolai Hähnle	bae066c3f0	gallium/radeon: flush stdout after printing texture information Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:48 +02:00
Ilia Mirkin	a37e46323c	glsl: don't try to lower non-gl builtins as if they were gl_FragData If a shader has an output array, it will get treated as though it were gl_FragData and rewritten into gl_out_FragData instances. We only want this to happen on the actual gl_FragData and not everything else. This is a small part of the problem pointed out by the below bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-05 21:22:01 -04:00
Ian Romanick	795d8dff89	glsl: Document and enforce restriction on type values Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-07-05 17:55:29 -07:00
Ian Romanick	3119871bd9	glsl: Pack integer and double varyings as flat even if interpolation mode is none v2: Also update varying_matches::compute_packing_class(). Suggested by Timothy Arceri. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 16:58:27 -07:00
Ian Romanick	73a6a4ce49	mesa: Strip arrayness from interface block names in some IO validation Outputs from the vertex shader need to be able to match per-vertex-arrayed inputs of later stages. Accomplish this by stripping one level of arrayness from the names and types of outputs going to a per-vertex-arrayed stage. v2: Add missing checks for TESS_EVAL->GEOMETRY. Noticed by Timothy Arceri. v3: Use a slightly simpler stage check suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 16:58:27 -07:00
Charmaine Lee	32651c67d1	svga: avoid emitting redundant DXSetRenderTargets command Tested with Lightsmark2008, MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-05 16:58:29 -06:00
Leo Liu	aa7d42a5f9	radeon/vce: update encRefPic addr and array mode to tiled Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-05 09:15:50 -04:00
Leo Liu	e560a11b87	radeon/vce: increase cpb height alignment Height should be aligned with 2 macroblocks, thus making it safer for tiled mode Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-05 09:15:47 -04:00
Iago Toral Quiroga	fa0654fc3c	i965: Remove trailing whitespace Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-05 14:06:37 +02:00
Iago Toral Quiroga	d92ac67126	i965: Make inline function static Without this the i965 driver fails to load. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-07-05 14:05:58 +02:00
Emil Velikov	cbc37f72e3	anv: install the intel_icd.json to ${datarootdir} by default As mentioned by the spec (and used by Archlinux and Debian) default to ${datarootdir} as opposed to ${sysconfdir} for the default location. Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-05 12:17:34 +01:00
Emil Velikov	744d0d8f3b	swr: automake: don't ship LLVM version specific generated sources Otherwise things will fail to build, if the builder is using another version of LLVM. v2: annotate all the dependencies of builder_gen.h v3: clean the generated files as needed v4: comment cleanups (Tim) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> (v2) Reported-by: Chuck Atkins <chuck.atkins@kitware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-05 12:17:05 +01:00
Emil Velikov	22e9357028	automake: don't mandate git_sha1.h/MESA_GIT_SHA1 It has proven subtle to get it right both from the build side POV (see commit list below) and builders due to their varying workflows. Furthermore it does not fully fulfil the reason why it was enforced - to detect uniqueness between different builds, in order to distinguish and invalidate Vulkan/GL caches. With that having a much better solution (previous commit) we can drop this solution. This effectively reverts the following commits: `359d9dfec3` ("mesa: automake: add directory prefix for git_sha1.h") `2c424e00c3` ("mesa: automake: ensure that git_sha1.h.tmp has the right attributes") `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") `8229fe68b5` ("automake: get in-tree `make distclean' working again.") Cc: Timo Aaltonen <tjaalton@debian.org> Cc: Haixia Shi <hshi@chromium.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-07-05 12:16:20 +01:00
Emil Velikov	e5c1229a9a	anv: automake: indent with tabs and not spaces Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-07-05 12:16:06 +01:00
Emil Velikov	addb099ce8	anv: use cache uuid based on the build timestamp. Do not rely on the git sha1: - its current truncated form makes it less unique - it does not account for local (Vulkan or otherwise) changes Use a timestamp produced at the time of build. It's perfectly unique, unless someone explicitly tinkers with their system clock. Even then chances of producing the exact same one are very small, if not zero. v2: Remove .tmp rule. It's not needed since we want the header to be regenerated each time we call make (Eric). v3: - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel) - Replace the generated header with a define, to prevent needless builds on consecutive `make' and/or `make install' calls. (Dave) v4: - Keep the timestamp generation at make time. (Jason) v5: - Ensure that file is regenerated on incremental builds. Cc: Michel Dänzer <michel@daenzer.net> Cc: Dave Airlie <airlied@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-05 12:15:23 +01:00
Emil Velikov	f98530b739	clover: conditionally use MESA_GIT_SHA1 Considering how hard/annoying it was for many peoples' workflow to properly generate the macro, it will be demoted to conditionally available with follow-up commits. v2: Kill off gracious blank line (Vedran). Cc: mesa-stable@lists.freedesktop.org Cc: Vedran Miletić <vedran@miletic.net> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-05 12:14:34 +01:00
Timothy Arceri	9c9e3e7ee1	mesa: stop copying SamplerUnits twice The call to _mesa_update_shader_textures_used() already takes care of copying for us. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	25a32c2cbf	mesa: make attribute binding message more useful Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	8f1ca0ee3f	i965: make more effective use of SamplersUsed Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	51f912786f	glsl: stop allocating memory for UBOs during linking This just stops counting and assigning a storage location for these uniforms, the count is only used to create the uniform storage. These uniform types don't use this storage. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	549b9b12fc	glsl: mark link_uniform_blocks_are_compatible() as static Missed this when doing `6d1a59d15b`. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-07-05 20:18:05 +10:00
Timothy Arceri	30812e90d1	mesa: fix build error Fix build error caused by `6a524c76f5`.	2016-07-05 18:42:06 +10:00
Gregory Hainaut	6a524c76f5	mesa: faster validation of sampler unit mapping for SSO Code was inspired by _mesa_update_shader_textures_used. However, unlike _mesa_update_shader_textures_used, which only checks a single stage, it checks all stages. It avoids looping over all uniforms; only active samplers are checked. For my use case: high FS frequency switches with few samplers. Perf event (relative to nouveau_dri.so) goes from 5.01% to 1.68% for the _mesa_sampler_uniforms_pipeline_are_valid function. Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-07-05 16:44:31 +10:00
Dave Airlie	cb728df967	Revert "st/glsl_to_tgsi: don't increase immediate index by 1." This reverts commit `27d456cc87`. DOH, what seems right and what is right with fp64 are always two different things. This regressed: spec@arb_gpu_shader_fp64@shader_storage@layout-std140-fp64-mixed-shader on radeonsi Reported-by: Michel Dänzer <michel@daenzer.net> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-05 10:25:29 +10:00
Samuel Pitoiset	c1fb3290a6	nvc0/ir: rename NVE4_SU_INFO_XXX to NVC0_SU_INFO_XXX While we are at it, fix a typo inside the comment which describes what those constants are for. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-05 01:44:15 +02:00
Samuel Pitoiset	f3b9fff3c3	nvc0/ir: reset the base offset for indirect images accesses In presence of an indirect image access, the base offset should be zeroed because the stride will be computed twice. This is a pretty rare situation but it can happen when tex.r > 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-07-05 01:44:12 +02:00
Samuel Pitoiset	cb828b7b18	gm107/ir: fix sign bit emission for FADD32I When emitting OP_SUB, the sign bit for FADD and FADD32I is not at the same position. It's at position 45 for FADD but 51 for FADD32I. This fixes the following piglit test: tests/spec/arb_fragment_program/fdo30337b.shader_test Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-07-05 01:44:08 +02:00
Eric Anholt	ac772b24a1	vc4: Regularize instruction emit macros ALU0 didn't have the _dest variant, and ALU2 didn't unset the def the way ALU1 did. This should make the ALU[012] macros much clearer, by moving most of their contents to vc4_qir.c	2016-07-04 16:33:22 -07:00
Eric Anholt	8a52f03f5d	vc4: Enable dead CF elimination. Now that we're about to start generating control flow in our NIR, we want this in place. It optimizes things frequently in the CS, when the GL VS has control flow that doesn't affect the vertex position.	2016-07-04 16:33:22 -07:00
Eric Anholt	8f2af4763a	vc4: Optimize out redundant SF updates. Tiny change on shader-db currently, but it will be important when we start emitting a lot of SFs from the same variable as part of control flow support. total instructions in shared programs: 89463 -> 89430 (-0.04%) instructions in affected programs: 1522 -> 1489 (-2.17%) total estimated cycles in shared programs: 250060 -> 250015 (-0.02%) estimated cycles in affected programs: 8568 -> 8523 (-0.53%)	2016-07-04 16:33:22 -07:00
Eric Anholt	200b4e4bd5	vc4: Move SF removal to a separate peephole pass. The DCE pass is going to change significantly to handle control flow, while we don't really need to change it for the SF handling. We also need to add some more SF peephole optimization for SF updates generated by control flow support. No change on shader-db.	2016-07-04 16:33:22 -07:00
Eric Anholt	aa76ba6f2f	vc4: DCE instructions with a NULL destination. I'm going to add an optimization for redundant SF update removal, which will just remove the SF and leave us (in many cases) with an instruction with a NULL destination and no side effects. Rather than teaching that pass whether the whole instruction can be removed, leave that responsibility to this pass.	2016-07-04 16:33:22 -07:00
Eric Anholt	2a8973fb78	vc4: Mark texturing setup instructions as having side effects. We need to not DCE them even though they don't have a destination in QIR. We also shouldn't relocate them in vc4_opt_vpm. Neither of these things happen, but I'm about to make DCE consider instructions with a NULL destination.	2016-07-04 16:33:22 -07:00
Eric Anholt	44df374a9c	vc4: Fix a pasteo in scheduling condition flag usage. Noticed by code inspection. This hasn't been too big of a deal, because our cond usages all start out as adder ops, either MOVs or the FTOI for Z writes. MOVs can get converted to mul ops during scheduling, but apparently we hadn't hit this.	2016-07-04 16:33:22 -07:00
Eric Anholt	eaa53f80d9	vc4: Drop the dead QIR_PACK() macro. This isn't used since we switched to using the dst.pack field instead of custom instructions.	2016-07-04 16:33:18 -07:00
Marek Olšák	5c92c21369	radeonsi: do compilation from si_create_shader_selector asynchronously Main shader parts and geometry shaders are compiled asynchronously by util_queue. si_create_shader_selector doesn't wait and returns. si_draw_vbo(si_shader_select) waits for completion. This has the best effect when shaders are compiled at app-loading time. It doesn't help much for shaders compiled on demand, even though VS+PS compilation should take as much time as the bigger one of the two. If an app creates more shaders, at most 4 threads will be used to compile them. Debug output disables this for shader stats to be printed in the correct order. (We could go even further and build variants asynchronously too, then emit draw calls without waiting and emit incomplete shader states, then force IB chaining to give the compiler more time, then sync the compilation at the IB flush and patch the IB with correct shader states. This is great for compilation before draw calls, but there are some difficulties such as scratch and tess states requiring the compiler output, and an on-disk shader cache will likely be a much better and simpler solution.) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	84824935cf	radeonsi: don't lock shader cache mutex during compilation to allow multiple shaders to be compiled simultaneously. Also, shader-db can again use all 4 cores. v2: Remove the pipe_mutex_unlock call in the error path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-07-05 00:47:13 +02:00
Marek Olšák	850cd953b1	radeonsi: separate the compilation chunk of si_create_shader_selector The function interface is ready to be used by util_queue. Also, si_shader_select_with_key can no longer accept si_context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	6781a2a994	radeonsi: move LLVMTargetMachineRef creation to a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	8a4ace4a47	gallium/radeon: add and use radeon_info::max_alloc_size (v2) v2: - squashed the patches - use INT_MAX - clamp max_const_buffer_size - check the DRM version in radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-07-05 00:47:13 +02:00
Marek Olšák	027ad71b57	radeonsi: print LLVM IRs to ddebug logs Getting LLVM IRs of hanging shaders has never been easier. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	28a03be06b	radeonsi: enable string markers and record apitrace call numbers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	642cf400aa	ddebug: add an option to dump info about a specific apitrace call Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	1daec2b795	ddebug: implement pipe_context::generate_mipmap Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	50b2235478	ddebug: record and dump apitrace call numbers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	861ecf1ca9	ddebug: implement emit_string_marker and remove some obsolete comments Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	a446c40e0a	gallium/radeon: remove unused code - radeon_llvm_util.* Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	eaccc4e8c8	radeonsi: keep using v_rcp_f32 for division in future LLVM (v2) This will be needed after some LLVM changes that haven't landed yet. v2: - use LLVMIsConstant to fix an LLVM assertion failure. LLVMSetMetadata doesn't work with constants. - don't set float metadata as string Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	1c00086746	radeonsi: remove an obsolete comment It's not true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	4d1f32376d	radeonsi: don't interpolate colors if flatshading is enabled use v_interp_mov for those Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	4accb02d7a	radeonsi: enable the barycentric optimization in all cases Handle the bc_optimize SGPR bit if both CENTER and CENTROID are enabled. This should increase the PS launch rate for big primitives with MSAA. Based on discussion with SPI guys. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	476e9cee1d	radeonsi: compute only one set of interpolation (i,j) when MSAA is disabled This should increase the PS launch rate for shaders using at least 2 pairs of perspective (i,j) and same for linear. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	a675c6a000	radeonsi: split ps.prolog.force_persample_interp into persp and linear bits This reduces the number of v_mov's in the prolog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Marek Olšák	61010cfac0	radeonsi: don't dump the shader key for non-monolithic shaders early It's always zero. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:12 +02:00
Jan Vesely	015e2e0fce	r600g: Add double precision FMA ops Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96782 Fixes: `54c4d525da` ("r600g: Enable FMA on chips that support it") Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: James Harvey <lothmordor@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-07-05 00:47:12 +02:00
Francesco Ansanelli	9827fc3f03	r600: fix duplicate 'const' declaration Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-04 21:26:31 +02:00
Topi Pohjolainen	2a60654f56	i965/urb: Allow blorp to record current settings This makes it possible to skip urb re-configuration if the subsequent renders agree with the settings. Also allows blorp to allocate the maximum amount of vs entries available. Core upload logic already knows how to calculate this. Helps one synthetic benchmark. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	39fdee6b2d	i965/blorp/gen7+: Do not trigger push constant space reconfig Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	cc2d0e64c0	i965/blorp/gen7+: Stop trashing push constant allocation Packet 3DSTATE_CONSTANT_PS is still emitted explicitly as ps stage itself is enabled and hardware may try to prefetch constants from the buffer. From the BSpec: 3D Pipeline - Windower - 3DSTATE_PUSH_CONSTANT_ALLOC_PS "Specifies the size of the PS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full." This is not possible on gen6. From the BSpec about 3DSTATE_CONSTANT_PS: "This packet must be followed by WM_STATE." Binding table emissions for stages other than PS can be now dropped, they were only needed for the 3DSTATE_CONSTANT_XS to be effective: From the BSpec: "The 3DSTATE_CONSTANT_* command is not committed to the shader unit until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_* command is parsed." Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	175e095744	i965/blorp: Remove support for push constants Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	46e1132b80	i965/blorp: Use flat inputs instead of uniforms v2 (Jason): Use LOAD_INPUT() macro Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	07db95c24d	i965/blorp: Fix the size requirement for vertex elements v2: Rebased as this is needed before flat inputs are enabled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	741a245ae4	i965/blorp: Load transformation coordinates as vec4 In preparation for loading as flat vertex input. v2: Use LOAD_INPUT() macro Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	01f2f364d4	i965/blorp: Rename LOAD_UNIFORM to LOAD_INPUT Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Topi Pohjolainen	641868103c	i965/blorp: Organize pixel kill and blend/scaled inputs into vec4s In addition, as these are never used in parallel, add a few assertions. v2 (Jason): Skip some complexity by putting them into a union but pad rectangle grid into a vec4 instead. Also keep the LOAD_UNIFORM macro. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 20:43:11 +03:00
Lionel Landwerlin	dbbc4fb4cc	anv/wsi: create swapchain images using specified image usage The image usage specified by the caller of vkCreateSwapchainKHR should be passed onto the internal image creation. Otherwise the driver might later crash when the user tries to use the image as a combined sampler even though the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT. Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be expected even if the swapchain is created without any flag. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791 Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-04 10:15:48 -07:00
Indrajit Das	51227b41c6	radeon/uvd: fix overflow error while calculating bit stream buffer size Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-04 11:38:05 +02:00
Topi Pohjolainen	9e3774a460	i965/blorp: Prepare for more than two vertex attributes Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:05:02 +03:00
Topi Pohjolainen	e762354309	i965/blorp: Tell vertex fetcher about flat inputs Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:04:38 +03:00
Topi Pohjolainen	89e6b4ef5d	i965/blorp: Add support for flat input buffer Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:04:00 +03:00
Topi Pohjolainen	9b2fa17e97	i965/blorp: Store input read mask Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 09:03:41 +03:00
Topi Pohjolainen	73f78ab44b	i965/blorp: Rename push constants to inputs Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:37:51 +03:00
Topi Pohjolainen	f2c472fcb3	i965/blorp: Use core vertex buffer state setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:37:44 +03:00
Topi Pohjolainen	4f7e68799f	i965/blorp: Split vertex data and element setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Topi Pohjolainen	575c8cbb54	i965: Unify vertex buffer setup On gen >= 8 one doesn't provide ending address but number of bytes available. This is relative to the given offset. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Topi Pohjolainen	bdab945edd	i965/draw: Expose vertex buffer state setup Also change the interface to use start and end offsets. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-07-04 08:33:41 +03:00
Rob Clark	7295428e41	freedreno: fix crash on smaller gpus and higher resolutions Devices with smaller GMEM size need more tiles. On db410c at 2048x1152, glmark2 shadow needed ~330 tiles for fullscreen. Let's bump it up to 512. (Maybe with MRT you could end up needing more, but at that point things are probably going to be painfully slow.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-03 11:16:28 -04:00
Rob Clark	01ccb0d91e	i965: don't drop const initializers in vector splitting Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-02 09:00:19 -04:00
Rob Clark	f78a6b1ce3	glsl: add driconf to zero-init uninitialized vars Some games are sloppy, perhaps because it is defined behavior for DX or perhaps because nv blob driver defaults things to zero. So add driconf param to force uninitialized variables to default to zero. This issue was observed with rust, from steam store. But has surfaced elsewhere in the past. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-02 09:00:19 -04:00
Rob Clark	202710d110	freedreno/ir3: support glsl linking for cmdline compiler For .vert/.frag, now multiple can be specified on the cmdline for purposes of linking, and the last one specified is the one that is fed into the ir3 backend (and dumped along the way if --verbose is specified) Without this, varyings in frag shaders would appear as undefined. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-07-02 09:00:19 -04:00
Rob Clark	07cfe4e6aa	glsl/standalone: initialize MaxUserAssignableUniformLocations Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-07-02 09:00:19 -04:00
Rob Clark	1759eb1d19	freedreno: update valid_buffer_range for SO buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	da39ac9c51	freedreno/ir3: support non-user_buffer consts Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	2081c1ecc0	freedreno/a2xx: move setup/restore cmds into binning pass Rather than doing a separate submit at context create, move these cmds to before first tile, as is done on a3xx/a4xx. Otherwise state can be overwritten by other contexts. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	2c3b54c278	freedreno: pass index buffer as a pipe_resource This will be useful in a following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Rob Clark	88cc11e971	freedreno: switch emit_const_bo() to take prsc's We can push the unwrap of pipe_resource down. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-07-02 08:58:50 -04:00
Hans de Goede	d7dfd4cb51	nv30: Fix "array subscript is below array bounds" compiler warning gcc6 does not like the trick where we point to one entry before the array start and then start a while with a pre-increment. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-02 12:21:28 +02:00
Hans de Goede	110ef733dc	nouveau: Fix a couple of "foo may be used uninitialized" compiler warnings These are all new false positives with gcc6. In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer to a variable into a function initialises that variable. In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0 enabled dst channels, this never happens, but gcc cannot know this. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-02 12:21:28 +02:00
Hans de Goede	1f3c8f3664	nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	2aa1197eee	nouveau: Add support for SV_WORK_DIM Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	3345f70f63	nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	ef8e50a841	clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driver In order to implement get_work_dim() the driver may need to know the clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Hans de Goede	d386cef246	tgsi: Add WORK_DIM System Value Add a new WORK_DIM SV type, this will return the grid dimensions (1-4) for compute (opencl) kernels. This is necessary to implement the opencl get_work_dim() function. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-07-02 12:21:28 +02:00
Alejandro Piñeiro	da7efadf04	mesa/main: fix error checking logic on CopyImageSubData For the case (both src or dst) where we had a texobject, but the texobject target was not the same that the method target, this spec paragraph was applied: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_VALUE error is generated if either name does not * correspond to a valid renderbuffer or texture object according * to the corresponding target parameter." */ But for that case, the correct spec paragraph should be: /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core * Profile spec says: * * "An INVALID_ENUM error is generated if either target is * not RENDERBUFFER or a valid non-proxy texture target; * is TEXTURE_BUFFER or one of the cubemap face selectors * described in table 8.18; or if the target does not * match the type of the object." */ specifically the last sentence: "or if the target does not match the type of the object". This patch fixes the error returned (s/INVALID/ENUM) for that case, and moves up the INVALID_VALUE spec paragraph, as that case (invalid texture object) was handled before. Fixes: GL44-CTS.copy_image.target_miss_match Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-02 11:54:40 +02:00
Dave Airlie	27d456cc87	st/glsl_to_tgsi: don't increase immediate index by 1. Immediates are stored into a separate table, and are consolidated, so if we get an immediate we don't need to offset it as the index it has is correct. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-07-02 17:01:25 +10:00
Ilia Mirkin	6f4d35212b	st/mesa: get max supported number of image samples from driver Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-01 23:01:03 -04:00
Ilia Mirkin	b2b5075e04	nvc0: fix up image support for allowing multiple samples Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 23:01:02 -04:00
Nicolai Hähnle	07cc838b10	st/mesa: check the texture image level in st_texture_match_image Otherwise, 1x1 images of arbitrarily high level are accepted. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 17:55:19 +02:00
Nicolai Hähnle	0ba053b34c	st/mesa: an incomplete texture may have a zero-size first image Fixes a regression introduced by commit `42624ea83` which triggered an assertion in dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0 While stImage must have a non-zero size as verified by the caller, we also look at the size of the base image in an attempt to make a better guess at the level0 size (this is important when the base image size is odd). However, the base image may have a zero size even when it exists. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 17:54:40 +02:00
Nayan Deshmukh	de772bc060	st/vdpau: use bicubic filter for scaling(v6.1) use bicubic filtering as high quality scaling L1. v2: fix a typo and add a newline to code v3: -render the unscaled image on a temporary surface (Christian) -apply noise reduction and sharpness filter on unscaled surface -render the final scaled surface using bicubic interpolation v4: support high quality scaling v5: set dst_area and dst_clip in bicubic filter v6: set buffer layer before setting dst_area v6.1: add PIPE_BIND_LINEAR when creating resource Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-01 12:54:58 +02:00
Nayan Deshmukh	872dd9ad15	vl: add a bicubic interpolation filter(v5) This is a shader based bicubic interpolator which uses cubic Hermite spline algorithm. v2: set dst_area and dst_clip during scaling (Christian) v3: clear the render target before rendering v4: initialize offsets while initializing shaders use a constant buffer to send dst_size to frag shader small changes to reduce calculation in shader v5: send half pixel offset instead of sending dst_size Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-01 12:54:33 +02:00
Vinson Lee	3fea592c4e	mesa/st: Use 'struct nir_shader' instead of 'nir_shader'. Fix this build error with GCC 4.4. CC state_tracker/st_nir_lower_builtin.lo In file included from state_tracker/st_nir_lower_builtin.c:61: state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’ ../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ was here Suggested-by: Rob Clark <robdclark@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-07-01 00:19:24 -07:00
Alejandro Piñeiro	a97ee60926	docs: update MESA_DEBUG envvar documentation. silent, flush, incomplete_tex and incomplete_fbo flags were not documented (see src/mesa/main.debug.c for more info). FP is not checked anymore. v2 (Brian Paul): * MESA_DEBUG accepts a comma-separated list of parameters. * Clarify how MESA_DEBUG behaves with mesa debug and release builds. * Updated wording. v3: Better wording for one paragraph (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-07-01 08:15:15 +02:00
Alejandro Piñeiro	5e553a6bb3	i965: intel_texture_barrier reimplemented Fixes: GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass On Haswell, Broadwell and Skylake (note that in order to execute that test, it is needed to override GL and GLSL versions). On gen6 this test was already working without this change. It keeps working after it. This commit replaces the call to brw_emit_mi_flush for gen6+ with two calls to brw_emit_pipe_control_flush: * The first one with RENDER_TARGET_FLUSH and CS_STALL set to initiate a render cache flush after any concurrent rendering completes and cause the CS to stop parsing commands until the render cache becomes coherent with memory. * The second one has TEXTURE_CACHE_INVALIDATE set (and no CS stall) to clean up any stale data from the sampler caches before rendering continues. Didn't touch gen4-5, basically because I don't have a way to test them. More info on commits: `0aa4f99f56` `72473658c5` Thanks to Curro for helping to track this down, as the root cause was a hw race condition. v2: use two calls to pipe_control_flush instead of a combination of gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro) v3: no need to const cache invalidation (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-01 08:09:27 +02:00
Ilia Mirkin	51ca57df01	nv30: go back to not using viewport validate function for swtnl The output of draw requires a null viewport transform, which the regular code is ill-equipped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 01:04:10 -04:00
Ilia Mirkin	71609c9954	nv30: fix viewport clipping settings to be based on viewport, not rt This fixes a ton of "clip" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-07-01 00:02:23 -04:00
Brian Paul	c823ff8dfb	gallium/util: check for window cliprects in util_can_blit_via_copy_region() We can't blit with resource_copy_region() if there are window clip rects. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 18:19:09 -06:00
Chuck Atkins	d8d6091a84	gallium: Force blend color to 16-byte alignment This aligns the 4-element color float array to 16 byte boundaries. This should allow compiler vectorizers to generate better optimizations. Also fixes broken vectorization generated by Intel compiler. v2: Fixed indentation and added a lengthy comment explaining the reason for the alignment. Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 17:04:41 -05:00
Chuck Atkins	c1bf6692be	swr: Refactor checks for compiler feature flags Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines such as __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t their availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com> Tested-by: Tim Rowley <timothy.o.rowley@Intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-30 16:55:01 -05:00
Brian Paul	eb79b2b331	st/wgl: make own_mutex() non-static Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-30 15:29:07 -06:00
Andres Gomez	e0f4504adf	glsl: atomic counters are different than their uniforms The linker deals with atomic counters in terms of uniforms but the data structures are named after the atomic counters. Renamed the data structures used in the linker for disambiguation. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 23:55:32 +03:00
Andres Gomez	0f00c6dd77	glsl: count atomic counters correctly Currently the linker uses the uniform count for the total number of atomic counters. However uniforms don't include the innermost array dimension in their count, but atomic counters are expected to include them. Although the spec doesn't directly state this, it's clear how offsets will be assigned for arrays. From OpenGL 4.2 (Core Profile), page 98: " * Arrays of type atomic_uint are stored in memory by element order, with array element member zero at the lowest offset. The difference in offsets between each pair of elements in the array in basic machine units is referred to as the array stride, and is constant across the entire array. The stride can be queried by calling GetIntegerv with a pname of ATOMIC_COUNTER_ARRAY_STRIDE after a program is linked." From that it is clear how arrays of atomic counters will interact with GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE. For other kinds of uniforms it's also clear that each entry in an array counts against the relevant limits. Hence, although inferred, this is the expected behavior. Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 23:55:32 +03:00
Brian Paul	c84444ea85	svga: use SVGA3D_vgpu10_BufferCopy() for buffer copies So that we do copies host-side rather than in the guest with map/memcpy. Tested with piglit arb_copy_buffer-subdata-sync test and new arb_copy_buffer-intra-buffer-copy test. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-06-30 14:32:11 -06:00
Brian Paul	29a38f37ee	svga: add SVGA3D_vgpu10_BufferCopy() Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:10 -06:00
Brian Paul	88a344253c	svga: flush buffers when mapping for reading With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have to make sure any pending map-write operations are completed before reading if the buffer is dirty. Otherwise the ReadbackSubResource operation could get stale data from the host buffer. This allows the piglit arb_copy_buffer-subdata-sync test to pass when we start using the SVGA3D_vgpu10_BufferCopy command. v2: check the sbuf->dirty flag in the outer conditional, per Charmaine. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:10 -06:00
Neha Bhende	fa2cdd973d	svga: enable ARB_copy_image extension in the driver Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:09 -06:00
Brian Paul	4a54514958	svga: try blitting with copy region in more cases We previously could do blits with util_resource_copy_region() when doing 'loose' format checking. Also do blits with util_resource_copy_region() when the blit src/dst formats (not the underlying resources) exactly match. Needed for GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:08 -06:00
Brian Paul	92b44efef4	svga: use copy_region_vgpu10() for region copies when possible v2: remove extra svga_define_texture_level() call, per Charmaine. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:08 -06:00
Neha Bhende	1d0be402c7	svga: use vgpu10 CopyRegion command when possible Do texture->texture copies host-side with this command when possible. Use the previous software fallback otherwise. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	3a3c3d124a	svga: set render target flag for snorm surfaces We don't normally support rendering to SNORM surfaces, but with GL_ARB_copy_image we can copy to them if we treat them as typeless and use a UNORM surface view. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	46e7355a13	svga: add new svga_format_is_uncompressed_snorm() helper Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	68388043f3	svga: adjust sampler view format for RGBX We previously handled the case of a RGBX sampler view of a RGBA surface. Add the reverse case too. For GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Brian Paul	1049002eae	svga: adjust render target view format for RGBX For GL_ARB_copy_image we may be asked to create an RGBA view of a RGBX surface. Use an RGBX view format for that case. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:07 -06:00
Neha Bhende	429ace2fbc	svga: don't advertise support for R32G32B32_UINT/SINT surface formats We want to be able to copy between different 32-bit, 3-channel surface formats for GL_ARB_copy_image but since we don't support R32G32B32_FLOAT for textures (it's not blendable and wouldn't work for render to texture) we can't support 32-bit, 3-channel integer formats. The state tracker will choose 4-channel formats instead. Fixes the piglit arb_copy_image-format test for several cases. Note: This change may need to be revisited if/when the texture_view extension is enabled in driver. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	eb0ced74f6	svga: use untyped surface formats in most cases This allows us to do copies between different, but compatible, surface formats such as RGBA8_UNORM, RGBA8_SINT, RGBA8_UINT, etc. for GL_ARB_copy_image. Acked-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	5f1335878e	gallium/util: add tight_format_check param to util_can_blit_via_copy_region() The VMware driver will use this for implementing GL_ARB_copy_image. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	a029d9f074	gallium/util: simplify a few things in util_can_blit_via_copy_region() Since only the src box can have negative dims for flipping, just comparing the src/dst box sizes is enough to detect flips. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Brian Paul	5d31ea4b8f	gallium/util: new util_try_blit_via_copy_region() function Pulled out of the util_try_blit_via_copy_region() function. Subsequent changes build on this. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 14:32:06 -06:00
Neha Bhende	7988513ac3	svga: Fix failures caused in fedora 24 SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands are not present in the fedora 24 kernel module. Because of this, applications like supertuxkart do not run. v2: Add a few comments and code modifications suggested by Brian P. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-30 12:45:09 -06:00
Brian Paul	52f297d144	st/wgl: remove unneeded inline qualifiers No effect on size of the .o files (optimized build). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
Brian Paul	395ee18bac	st/wgl: add a stw_device::initialized field Set when the stw_dev object's initialization is completed. We test for this in the window callback function to avoid potential crashes on start-up in multi-threaded applications. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
Brian Paul	128feef40e	st/wgl: refactor framebuffer locking code Split the old stw_framebuffer_reference() function into two new functions: stw_framebuffer_reference_locked() which increments the refcount and stw_framebuffer_release_locked() which decrements the refcount and destroys the buffer when the count hits zero. Original patch by Jose. Modified by Brian (clean-ups, lock assertion checks, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:50 -06:00
José Fonseca	25cccb5bec	st/wgl: rename curctx to old_ctx in stw_make_current() Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-30 12:43:49 -06:00
Brian Paul	24004a2435	st/wgl: release the pbuffer DC at the end of wglBindTexImageARB() Otherwise we were leaking DC GDI objects and if wglBindTexImageARB() was called enough we'd eventually hit the GDI limit of 10,000 objects. Things started failing at that point. v2: also release DC if we return early, per Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-30 12:43:49 -06:00
Matt Turner	058c70bae1	mesa: Close fp on error path. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-30 11:08:39 -07:00
Matt Turner	e3d9125b77	i965: Simplify foreach_inst_in_block_safe() macro. We know what the end looks like without examining .tail: it's NULL. It's always NULL.	2016-06-30 11:08:39 -07:00
Andres Gomez	c4e47ab971	Revert "i965: get PrimitiveMode from the program rather than the shader struct" This reverts commit `644e015f0b`. PrimitiveMode from the program doesn't always hold a valid value that is neither of GL_TRIANGLES, GL_QUADS nor GL_ISOLINES when reaching this code. This caused regressions in the following CTS tests: GL44-CTS.stencil_texturing.functional GL44-CTS.shading_language_420pack.binding_images GL44-CTS.shading_language_420pack.binding_samplers GL44-CTS.shading_language_420pack.binding_uniform_single_block GL44-CTS.shading_language_420pack.implicit_conversions GL44-CTS.shading_language_420pack.initializer_list GL44-CTS.shading_language_420pack.length_of_vector_and_matrix GL44-CTS.shading_language_420pack.line_continuation Hence, we rather take it from the linked shader. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-06-30 16:20:22 +03:00
Timothy Arceri	1591e668e1	glsl/mesa: move duplicate shader fields into new struct gl_shader_info Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	fd2b3da5c8	glsl/main: remove unused params and make function static Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	32c410d2df	glsl: simplify link_uniform_blocks() There is only ever one shader so simplify the input params. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	1fb8c6df88	glsl/mesa: split gl_shader in two There are two distinctly different uses of this struct. The first is to store GL shader objects. The second is to store information about a shader stage that's been linked. The two uses actually share few fields and there is clearly confusion about their use. For example, the linked shaders map one to one with a program, so they can simply be destroyed along with the program. However, previously we were reference counting the linked shaders. We were also creating linked shaders with a name even though it is always 0, and we called the driver version of the _mesa_new_shader() function unnecessarily for GL shader objects. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	378f07ccb5	mesa: don't print name in _mesa_append_uniforms_to_file() This is only used to print linked shaders which always have a name of 0 so this was pointless. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	e8c8aa0320	mesa: remove unreachable code from _mesa_write_shader_to_file() _mesa_write_shader_to_file() is only used to print gl shader objects so Program should never be set as it only gets set for linked shaders. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	9b41c743cc	glsl: pass symbols to find_matching_signature() rather than shader This will allow us to later split gl_shader into two structs. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	47f8381730	glsl: pass symbols rather than shader to _mesa_get_main_function_signature() This will allow us to split gl_shader into two different structs, one for shader objects and one for linked shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	9e9d01cbe8	mesa: don't use driver's NewShader function when creating shader objects The driver's function only needs to be used when creating a struct for linked shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Timothy Arceri	962933b6d4	glsl: make cross_validate_globals() more generic Rather than passing in gl_shader we now pass in the IR. This will allow us to later split gl_shader into two structs. One for use as a linked per stage shader struct and one for use as a GL shader object. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-30 16:51:25 +10:00
Ian Romanick	5921f372c8	mapi: Export all GLES 3.1 functions in libGLESv2.so Khronos recommends that the GLES 3.1 library also be called libGLESv2. It also requires that functions be statically linkable from that library. NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension since at least Mesa 10.5, so applications targeting Linux should use eglGetProcAddress to avoid problems running binaries on systems with older, non-GLES 3.1 libGLESv2 libraries. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com> Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com> Acked-by: Chad Versace <chad.versace@intel.com>	2016-06-29 14:28:59 -07:00
Chad Versace	d3a147ba40	i965: Use drmIoctl for DRM_I915_GETPARAM (v2) Stop using drmCommandWriteRead for such a simple ioctl. v2: Handle errno correctly. [ickle] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-06-29 13:44:23 -07:00
sonjiang	b928ff6f62	radeon/uvd: fix a h265 context size bug Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:25 -04:00
sonjiang	5c80354a23	radeon/uvd: separate uvd context buffer from DPB Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:20 -04:00
sonjiang	28f85eab49	radeon uvd add uvd fw version for amdgpu Signed-off-by: sonjiang <sonny.jiang@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-29 15:30:14 -04:00
Samuel Pitoiset	fa10d1d674	nv50/ir: print EMIT subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:38 +02:00
Samuel Pitoiset	a6d3b2e176	nv50/ir: print RSQ/RCP subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:36 +02:00
Samuel Pitoiset	908ba19554	nv50/ir: print PIXLD subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:33 +02:00
Samuel Pitoiset	c0d92078bb	nv50/ir: print SHFL subops in debug mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-29 20:37:18 +02:00
Rodrigo Vivi	85ea8deb26	i965: Removing PCI IDs that are no longer listed as Kabylake. This is unusual. Usually IDs listed on early stages of platform definition are kept there as reserved for later use. However these IDs here are not listed anymore in any of steppings and devices IDs tables for Kabylake on configurations overview section of BSpec. So it is better removing them before they become used in any other future platform. Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2016-06-29 11:14:19 -07:00
Rodrigo Vivi	bdff2e5547	i965: Add more Kabylake PCI IDs. The spec has been updated adding new PCI IDs. Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2016-06-29 11:14:19 -07:00
Marek Olšák	63f8d648f0	gallium/radeon: remove zombie textures kept alive by DCC stat gathering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	44906101c4	gallium/radeon: don't re-create queries for DCC stat gathering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	82b39f3521	gallium/radeon: assume X11 DRI3 can use at most 5 back buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9ae41227c2	gallium/radeon: separate DCC starts as disabled (ps_draw_ratio = 0) DRI3: - Only slow clears can enable it for the first frame. - A good PS/draw ratio can enable it for other frames. DRI2: - Only slow clears can enable it for a frame. - Page-flipped color buffers are unref'd at the end of each frame, so it can't be enabled in any other way. - Relying on slow clears is sufficient for our synthetic benchmarks. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9fd4eff43c	gallium/radeon: R600_DEBUG=nodccfb disables separate DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	36cf5a57c2	gallium/radeon: add and use r600_texture_reference Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	6da92df538	gallium/radeon: add a HUD query for PS draw ratio stats from separate DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	49e3c74cdd	gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2) DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame. At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5) Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right. The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way. The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame. The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney). v2: rename stuff, add assertions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	9124457bff	gallium/radeon: add state setup for a separate DCC buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	fa7c927625	radeonsi: always calculate DCC info even if it's not used immediately for a later use Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	ebb9c7d7c4	radeonsi: unreference framebuffer state with set_framebuffer_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	e607a6be2b	gallium/radeon: add flag R600_QUERY_HW_FLAG_BEGIN_RESUMES Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Chad Versace	a2ae888929	i965: Use intel_get_param() more often Replace some open-coded ioctls with intel_get_param(). This is just a cleanup. No change in behavior. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-29 09:34:21 -07:00
Chad Versace	844e0bd946	i965: Refactor intel_get_param() Replace the function's __DRIscreen parameter with struct intel_screen. The callsites feel more natural that way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-29 09:34:21 -07:00
Marek Olšák	0c135a773f	radeonsi: don't advertise multisample shader images Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	eff81cbc81	radeonsi: enable distributed tess on multi-SE parts only ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	dd56d04568	radeonsi: set optimal VGT_HS_OFFCHIP_PARAM ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	9a71bf8858	radeonsi: enable CU0 in each SE for LS-HS execution Offchip-only tessellation allows this. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	4b11ef23b4	radeonsi: use conformant line rasterization AA lines are not completely correct (see TODO), but everything else should be. + 3 linestipple piglits Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Rob Herring	789ed13284	Android: add missing u_math.h include path for libmesa_isl Commit `87d062a940` ("i965: Fix shared local memory size for Gen9+.") added u_math.h include which broke the Android build: In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25: In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29: external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found ^ Add the missing include paths for libmesa_isl. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-28 12:48:46 -07:00
Charmaine Lee	6397c12f32	svga: force direct map for transferring multiple slices With commit `fb9fe35`, we start using transfer_inline_write for memcpy of TexSubImage. But the SurfaceDMA command does not work well with texture arrays. This patch forces direct map when transferring multiple slices of a texture array. Fixes piglit regression "texelFetch fs sampler1DArray" Tested with MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-06-28 13:43:23 -06:00
Brian Paul	d65c4e22a8	svga: whitespace, line wrapping fixes in svga_surface.c	2016-06-28 13:43:23 -06:00
Samuel Pitoiset	cc97b6a34a	gm107/ir: make sure that flagsDef is set when emitting setcond Relying on the existence of a second destination when emitting a setcond flag is dangerous, because this doesn't mean that the flag has been correctly set. Instead rely on flagsDef like what emitX() does for flagsSrc. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 18:38:56 +02:00
Grazvydas Ignotas	234323558d	doc: improve INTEL_DEBUG documentation Remove 'reg' option that does not actually exist, elaborate more about 'sync' and add the missing options. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-28 07:21:07 -07:00
Marek Olšák	c1dbc563f4	radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris This was missing. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-06-28 15:47:13 +02:00
Boyuan Zhang	06f0a4d9ed	radeon/vce: use vce structure for vce 52 firmware Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-06-28 08:58:03 -04:00
Boyuan Zhang	533bd6ae17	radeon/vce: add vce structures Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-06-28 08:58:00 -04:00
Leo Liu	05d302ffe2	st/omx: fix decoder fillout for the OMX result buffer The call to vl_video_buffer_adjust_size passes its arguments in the wrong order, which apparently causes problems when interlaced is false. The size of the OMX result buffer depends on the real size of the clip, while the vl buffer dimensions are aligned to 16, so a 1080p (1920*1080) video will overflow the OMX buffer. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-06-28 08:57:56 -04:00
Hans de Goede	459cc94507	pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms Make pipe_loader_sw_probe_kms take ownership of the passed in fd, like pipe_loader_drm_probe_fd does. The only caller is dri_kms_init_screen which passes in a dupped fd, just like dri2_init_screen passes in a dupped fd to pipe_loader_drm_probe_fd. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-28 12:29:54 +02:00
Jan Vesely	87787e9079	clover: Fix kernel metadata retrieval after clang r273425 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-06-27 23:12:37 -07:00
Francisco Jerez	a8a966ddb5	clover/llvm: Fix copyright attribution of invocation.cpp. This file still only has my name on the copyright notice even though most of the code (likely more than 90% of it) was authored by various contributors -- It doesn't seem right to have the whole file attributed to myself. Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Serge Martin <edb+mesa@sigluy.net>	2016-06-27 23:12:35 -07:00
Kenneth Graunke	034bd25327	i965: Print EOT in fs_visitor::dump_instruction(). This was useful when debugging the previous commit's issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-27 16:36:57 -07:00
Kenneth Graunke	7e7e501acf	i965: Make emit_urb_writes() not produce an EOT message for GS. emit_urb_writes() contains code to emit an EOT write with no actual data when there are no output varyings. This makes sense for the VS and TES stages, where it's called once at the end of the program. However, in the geometry shader stage, emit_urb_writes() is called once for every EmitVertex(). We explicitly emit a URB write with EOT set at the end of the shader, separately from this path. So we'd better not terminate the thread. This could get us into trouble for shaders which do EmitVertex() with no varyings followed by SSBO/image/atomic writes. It also caused us to emit multiple sends with EOT set, which apparently confuses the register allocator into not using g112-g127 for all but the first one. This caused EU validation failures in OglGSCloth shaders in shader-db. (The actual application was fine, but shader-db thinks there are no outputs because it doesn't understand transform feedback.) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-27 16:36:51 -07:00
Kenneth Graunke	a36a73a7b8	glsl: Ignore ir_texture in lower_const_arrays_to_uniforms. The only part of an ir_texture which can be an array is the offsets array in textureGatherOffsets() calls. We don't want to lower those, because they're required to remain constants. Fixes textureGatherOffsets with Gallium drivers such as llvmpipe, which commit `ef78df8d3b` regressed. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-27 16:36:30 -07:00
Samuel Pitoiset	7b9b096775	gm107/ir: add missing setcond flags for LOP variants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 00:30:01 +02:00
Samuel Pitoiset	83a4f28dc2	gm107/ir: make use of LOP32I for all immediates LOP only allows emitting 19-bit immediates. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-28 00:29:53 +02:00
Dave Airlie	c7cc264ca9	virgl: reduce some limits for now These need to be passed from the host in caps structure if they are larger, this fixes a bunch of tests on Intel hw, that I'd put the limits too high for. Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-28 06:49:26 +10:00
Julien Isorce	6e4cf937f8	st/omx: count number of slices Used by nouveau driver. Similar patch was done for st/va: `851e7e12aa` Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Julien Isorce	e10f1fcebe	st/omx: add support for nouveau / interlaced Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Julien Isorce	23b7a83cc1	st/omx: retrieve preferred interlaced and buffer_formats Interlaced can be true for nouveau driver. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-27 17:52:15 +01:00
Marek Olšák	f6ff483646	radeonsi: use optimal WD settings for primitive restart on Polaris ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-27 13:54:39 +02:00
Gurkirpal Singh	46dba701d8	st/va: Check NULL pointer Call to handle_table_get in vlVaDestroySurfaces can return NULL on failure. CID: 1243522 Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2016-06-27 08:09:08 +01:00
Eric Anholt	d20b89e928	nir: Fix copy_prop_src when src is an indirect access on a reg. The intent was to continue down the indirect chain, not to call ourselves with unchanged input arguments. Found by code inspection, and comparison to copy_prop_alu_src(). We haven't hit this because callers of NIR's copy prop are doing so in SSA, before indirect variable dereferences have been lowered to registers. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-26 15:38:09 -07:00
Samuel Pitoiset	c7fa3c92f8	gm107/ir: make use of MOV32I for all immediates MOV only allows emitting 19-bit immediates. This is similar to the previous fix I did for IMUL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-27 00:28:02 +02:00
Jordan Justen	367cf3a2e3	i965: Use miptree to decide format on multi-plane images for gen < 7 This wasn't handled correctly for multi-plane images on gen < 7 in `727a9b2493`. Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-26 10:49:34 -07:00
Ilia Mirkin	1f5f64b91f	nvc0: update "derived" state function names derived_1/2/etc aren't too informative. Instead name them based on the state they're derived from. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-26 12:04:55 -04:00
Ilia Mirkin	89a7496b9d	nvc0: provide support for unscaled poly offset units On at least Kepler hardware, the units differ based on RT format. Emit a properly scaled value for Z16 depth buffers vs other formats, to help out st/nine. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-26 12:04:55 -04:00
Samuel Pitoiset	b84c97587b	gm107/ir: make use of IMUL32I for all immediates IMUL only allows emitting 19-bit immediates. This is similar to `d30768025a` which fixed the same thing for the GK110 emitter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-26 17:33:06 +02:00
Marek Olšák	d93bacc1fa	radeonsi: make si_is_format_supported static Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	3eacbc52d5	radeonsi: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	7db10093d3	gallium/radeon: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	1c5a10497a	gallium/radeon/winsyses: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	d5383a7d31	gallium/radeon: use r600_resource_reference Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Jason Ekstrand	81978c6feb	nir: Add a NIR_VALIDATE environment variable It defaults to true so default behavior doesn't change but it allows you to do NIR_VALIDATE=false if you don't want validation. Disabling validation can substantially speed up shader compiles so you frequently want to turn it off if compiler invariants aren't in question. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-25 07:34:20 -04:00
Axel Davy	b76fa56739	st/nine: Use offset_units_unscaled offset_units_unscaled enables proper support for depth bias for gallium nine. Use it if available. Solves issues with some games using depth bias. For example: https://github.com/iXit/Mesa-3D/issues/220 Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-06-25 10:16:15 +02:00
Axel Davy	f6704f2a4d	r600g: Implement POLYGON_OFFSET_UNITS_UNSCALED Empirical tests show that the polygon offset behaviour is entirely determined by the content of the PA_SU_POLY_OFFSET states, and not by the depth buffer format bound. PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (ie setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	be7957b156	radeonsi: Implement POLYGON_OFFSET_UNITS_UNSCALED Empirical tests show that the polygon offset behaviour is entirely determined by the content of the PA_SU_POLY_OFFSET states, and not by the depth buffer format bound. PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (ie setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled behaviour. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	c2b7b48a54	radeon: Remove useless pa_su_poly_offset_db_fmt_cntl pa_su_poly_offset_db_fmt_cntl usages were removed in previous patches. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	fe2ec50d75	r600g: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states for evergreen Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. v2: Increase the num_dw field for the poly offset atom Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	400e8d8c40	r600g: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states for r600 Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. v2: Increase the num_dw field for the poly offset atom Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	ff5abe9d90	radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with rasterizer poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Axel Davy	59a692916c	gallium: Add a cap for offset_units_unscaled D3D9 has a different behaviour for depth bias. For OGL/D3D1X, the depth bias unit is the minimal resolvable value for the depth buffer, which depends on the format (and has different behaviour for float depth buffers). For D3D9, the depth bias unit is 1.0f. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-25 10:16:15 +02:00
Jordan Justen	727a9b2493	i965: Skip update_texture_surface when the plane doesn't exist Reported-by: Grazvydas Ignotas <notasas@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96607 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-24 18:13:18 -07:00
Kenneth Graunke	c4a6b0d2d2	i965: Validate a few SEND-from-GRF requirements. We recently had a mistake where we emitted SEND instructions with EOT set, but from g107 rather than g112-g127. Adding validation code should prevent these sorts of problems from slipping back in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	192813e50e	i965: Delete send-from-GRF only opcodes from implied_mrf_writes(). These only exist post-Sandybridge, and always use send-from-GRF. So inst->base_mrf will be -1, and we will have already returned 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	255cff76d9	i965: Drop unnecessary inst->base_mrf = -1 assignments. These are now unnecessary, as base_mrf is -1 by default. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	3e04e3758e	i965: Set fs_inst::base_mrf = -1 by default. On MRF platforms, we need to set base_mrf to the first MRF value we'd like to use for the message. On send-from-GRF platforms, we set it to -1 to indicate that the operation doesn't use MRFs. As MRF platforms are becoming increasingly a thing of the past, we've forgotten to bother with this. It makes more sense to set it to -1 by default, so we don't have to think about it for new code. I searched the code for every instance of 'mlen =' in brw_fs*cpp, and it appears that all MRF-based messages correctly program a base_mrf. Forgetting to set base_mrf = -1 can confuse the register allocator, causing it to think we have a large fake-MRF region. This ends up moving the send-with-EOT registers earlier, sometimes even out of the g112-g127 range, which is illegal. For example, this fixes illegal sends in Piglit's arb_gpu_shader_fp64-layout-std430-fp64-shader, which had SSBO messages with mlen > 0 but base_mrf == 0. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:55 -07:00
Kenneth Graunke	3e258f7e31	i965: Drop unused return value from intel_finalize_mipmap_tree(). The old return type of GLuint was wonky - it should have been bool. But nothing actually uses the return value anyway, so we can just drop that and make it a void function. In theory, it might make sense to ask whether the texture validated successfully, but just checking intel_obj->mt != NULL works for that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:44 -07:00
Kenneth Graunke	8ee23d6866	i965: Move contents of brw_tex.c into intel_tex_validate.c. brw_tex.c is a tiny file containing a single function. It's closely tied to the validation logic in intel_tex_validate.c, so it makes sense to put both in the same file. While we're at it, update the function to our modern style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-24 15:03:44 -07:00
Marek Olšák	28d0d0c5b4	radeonsi: fix fractional odd tessellation spacing for Polaris ported from Vulkan (and no source explains why this is needed) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 17:36:43 +02:00
Marek Olšák	0d638f4b3d	radeonsi: set some VGT context registers on SI-CI the kernel sets them, but other UMDs can change them Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	8f3ef4e8b8	radeonsi: optimize rendering to linear color buffers loosely ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	e4b22c9fa1	radeonsi: set almost optimal settings in SC_MODE_CNTL_1 ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	603c073ec2	gallium/radeon: let drivers specify SC_MODE_CNTL_1 fields radeonsi will set more fields Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	ae0d2d15cc	gallium/radeon: disable complicated point clipping against user clip planes Nothing in the GL spec says that we should expand points to triangles. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Marek Olšák	1e8adb0ee4	radeonsi: fix a compute shader hang with big threadgroups on SI & CI ported from Vulkan Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Ilia Mirkin	b433cb51e5	nvc0: when mapping directly, provide accurate xfer info + start We were ignoring the incoming box parameters, and were providing totally bogus stride/layer stride, and other bits, for when a non-full-surface map was requested. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-24 09:53:13 -04:00
Ilia Mirkin	3f0fa3b32d	st/mesa: don't assume that the whole surface gets mapped Under some circumstances, the driver may choose to return a temporary surface instead of a pointer to the original. Make sure to pass the actual view volume to be mapped to the transfer function rather than adjusting the map pointer after-the-fact. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 09:53:13 -04:00
Nicolai Hähnle	0da890e62c	radeonsi: drop the DRAW_PREAMBLE packet on Polaris It will be removed from the firmware for Polaris. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 13:28:46 +02:00
Nicolai Hähnle	2aa0485902	radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris The non-MULTI variants will be removed in Polaris firmware. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 13:28:32 +02:00
Francesco Ansanelli	82ab3f27ff	st/mesa: handle negative _ColorDrawBufferIndexes values correctly Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:41:22 +02:00
Nicolai Hähnle	bc4b7ebbfd	winsys/radeon: add guard pages when R600_DEBUG=check_vm is enabled This should help flush out GPU VM faults. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	49c0b4a0db	winsys/amdgpu: add guard pages when R600_DEBUG=check_vm is enabled This should help flush out GPU VM faults. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	dbac88a839	radeonsi: report a failure to parse dmesg instead of asserting Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	d46a9db840	radeon: check VM faults from DMA flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	80dd7870fe	radeonsi: move gfx fence wait out of si_check_vm_faults Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	ad8438403b	radeonsi: extract IB and bo list saving into separate functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:02 +02:00
Nicolai Hähnle	b3de274b05	st/mesa: fix readpixels regression with MESA_pack_invert Fixes an error introduced in commit `3948cd3797`. Reported-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:02 +02:00
Marek Olšák	05e741c6d6	radeonsi: set LLVM denormal flags - make sure FP32 denormals will stay disabled in LLVM in the future (the current default is disabled) - tell LLVM that FP64 denormals are enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-24 12:31:03 +02:00
Marek Olšák	0e1fefa722	radeonsi: emit 1/sqrt for RSQ We don't need the clamped version and we don't have to use any intrinsic. Stats on Tonga: 15382 shaders in 9128 tests Totals: SGPRS: 1230560 -> 1230560 (0.00 %) VGPRS: 469577 -> 462504 (-1.51 %) Code Size: 22089908 -> 21730052 (-1.63 %) bytes LDS: 598 -> 598 (0.00 %) blocks Scratch: 283648 -> 281600 (-0.72 %) bytes per wave Max Waves: 125664 -> 126969 (1.04 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 547280 -> 547280 (0.00 %) VGPRS: 269132 -> 262059 (-2.63 %) Code Size: 15709604 -> 15349748 (-2.29 %) bytes LDS: 198 -> 198 (0.00 %) blocks Scratch: 74752 -> 72704 (-2.74 %) bytes per wave Max Waves: 47840 -> 49145 (2.73 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-24 12:31:03 +02:00
Jan Vesely	54c4d525da	r600g: Enable FMA on chips that support it v2: Merge with PIPE_SHADER_CAP_DOUBLES Add CHIP_HEMLOCK v3: only set the instruction on EG and CM Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:30:59 +02:00
Marek Olšák	cbb5adb908	gallium/u_queue: allow the execute function to differ per job so that independent types of jobs can use the same queue. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4a06786efd	gallium/u_queue: reduce the number of mutexes by 2 by converting semaphores to condvars and using the main mutex Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	2fba0aaa70	gallium/u_queue: add an option to name threads for debugging v2: correct the snprintf use Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	404d0d50d8	gallium/u_queue: add an option to have multiple worker threads independent jobs don't have to be stuck on only one thread v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	4358f6dd13	gallium/u_queue: rewrite util_queue_fence to allow multiple waiters Checking "signalled" is first done without a mutex, then with a mutex. Also, checking without waiting doesn't lock the mutex. This is racy, but should be safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Marek Olšák	d8367e91f2	gallium/u_queue: use a ring instead of a stack and allow specifying its size in util_queue_init. v2: use CALLOC & FREE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 12:24:40 +02:00
Jordan Justen	c36a363a2d	i965: Preserve the internal format of the dri image Since the OpenGLES API is strict about the internal format matching for many operations, we need to preserve it. See _mesa_es3_error_check_format_and_type in src/mesa/main/glformats.c. Fixes ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96351 Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 20:44:00 -07:00
Chad Versace	a0f3c3c9d4	anv: Add anv_render_pass_attachment::store_op Will be needed for resolving auxiliary surfaces. I didn't add anv_render_pass_attachment::stencil_store_op, as the driver would likely never use it, as stencil surfaces never have auxiliary surfaces. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 16:10:25 -07:00
Gurkirpal Singh	15d3777b74	gbm: Fix comments Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Eric Engestrom	b293e8b470	gbm: doc fixes Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Giuseppe Bilotta	60a27ad122	Remove wrongly repeated words in comments Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments. This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by v2: * proper commit message and non-joke title; * replace two 'as is' followed by 'is' to 'as-is'. v3: * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it) Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-23 13:55:03 -07:00
Brian Paul	5d07998317	svga: update some comments in svga_buffer_handle() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	fe76212873	svga: add a const qualifier in svga_buffer_upload_piecewise() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	e82fa96d19	svga: minor code refactor for svga_buffer_upload_command() Put the HBS code into a separate function. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Brian Paul	db721da5a3	svga: minor code simplification in svga_context_finish() Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 13:02:28 -06:00
Kenneth Graunke	b0629e6894	i965: Implement rasterizer discard via SOL unless required for queries. We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED query, which involves passing all primitives to the clipper. When rasterizer discard is enabled, we program the clipper in REJECT_ALL mode, rather than using the SOL stage's "Rendering Disable" feature. See commit `f09b91f782` for an explanation of why we implement GL_PRIMITIVES_GENERATED this way. Apparently the SOL stage's "Rendering Disable" feature is a lot faster than having the clipper reject all primitives. It's safe to use when no GL_PRIMITIVES_GENERATED query is active, as we don't care about CL_INVOCATION_COUNT incrementing. This patch makes us use SO_RENDERING_DISABLE when no query is active, but continues falling back to the clipper in REJECT_ALL mode when the queries are enabled. It brings back the perf_debug for the clipper case (which I removed in commit `1f9445ff57`, thinking it wasn't useful). Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10) on my Broadwell GT2 laptop. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	4db98f8beb	i965: Combine 3DSTATE_STREAMOUT emitters and genX_sol_state atoms. They're basically the same. Let's avoid the code duplication. v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught by Jason Ekstrand). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	fb857b5eea	glsl: Don't constant propagate arrays. Constant propagation on arrays doesn't make a lot of sense. If the array is only accessed with constant indexes, then opt_array_splitting would split it up. Otherwise, we have variable indexing. If there are multiple accesses, then constant propagation would end up replicating the data. The lower_const_arrays_to_uniforms pass creates uniforms for each ir_constant with array type that it encounters. This means that it creates redundant uniforms for each copy of the constant, which means uploading too much data. It can even mean exceeding the maximum number of uniform components, causing link failures. We could try and teach the pass to de-duplicate the data by hashing constants, but it makes more sense to avoid duplicating it in the first place. We should promote constant arrays to uniforms, then propagate the uniform access. Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum number of uniform components by a huge margin and failed to link. On Broadwell: total instructions in shared programs: 9067702 -> 9068202 (0.01%) instructions in affected programs: 10335 -> 10835 (4.84%) helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent) HURT: 20 (Natural Selection 2) loops in affected programs: 4 -> 0 The hurt programs appear to no longer have a constarray uniform, as all constants were successfully propagated. Apparently before this patch, we successfully unrolled a loop containing array access, but only after promoting constant arrays to uniforms. With this patch, we unroll it first, so all array access is direct, and the array is split up, and individual constants are propagated. This seems better. Cc: mesa-stable@lists.freedesktop.org Reported-by: Karol Herbst <nouveau@karolherbst.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	ef78df8d3b	glsl: Make lower_const_arrays_to_uniforms work directly on constants. There's really no point in looking at ir_dereference_array of a constant. It also misses cases like: (assign () (var_ref tmp) (constant (array ...) ...)) No changes in shader-db, but keeps it working after the next commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	f7741c5211	i965: Copy propagate before doing variable index lowering. The scalar backend currently doesn't support variable indexing on temporary arrays, but it does support it on uniform arrays, and some stages support it for input arrays. Make sure these are propagated through before exploding indirects into piles of if-ladders unnecessarily. On Broadwell, no instruction count change in shader-db. total cycles in shared programs: 80675652 -> 80674928 (-0.00%) cycles in affected programs: 649972 -> 649248 (-0.11%) helped: 386 HURT: 165 This will help avoid code quality regressions in a future commit. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	586f4a42e7	glsl: Propagate invariant/precise after lowering const arrays. The new uniform may need precise as well. Fixes copy propagation of constant array uniforms in Tomb Raider shaders. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	c264fdbc07	glsl: Split arrays even in the presence of whole-array copies. Previously, we failed to split constant arrays. Code such as int[2] numbers = int[](1, 2); would generate a whole-array assignment: (assign () (var_ref numbers) (constant (array int 4) (constant int 1) (constant int 2))) opt_array_splitting generally tried to visit ir_dereference_array nodes, and avoid recursing into the inner ir_dereference_variable. So if it ever saw an ir_dereference_variable, it assumed this was a whole-array read and bailed. However, in the above case, there's no array deref, and we can totally handle it - we just have to "unroll" the assignment, creating assignments for each element. This was mitigated by the fact that we constant propagate whole arrays, so a dereference of a single component would usually get the desired single value anyway. However, I plan to stop doing that shortly; early experiments with disabling constant propagation of arrays revealed this shortcoming. This patch causes some arrays in Gl32GSCloth's geometry shaders to be split, which allows other optimizations to eliminate unused GS inputs. The VS then doesn't have to write them, which eliminates the entire VS (5 -> 2 instructions). It still renders correctly. No other change in shader-db. v2: Drop !AOA check and improve a comment (feedback from Tim Arceri). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Kenneth Graunke	acf5444044	glsl: Make constant propagation's folder not propagate into an LHS. opt_constant_propagation.cpp contains constant folding code which can actually do constant propagation in some cases. It was happily propagating constants into the left-hand-side of assignments. For example, (assign () (var_ref temp) (constant ...)) would brilliantly be turned into: (assign () (constant ...) (constant ....)) This is a bigger hammer than necessary - it prevents propagation into the left-hand-side altogether. We could certainly do better someday. Notably, the constant propagation pass itself already takes this approach - it's just the constant propagation pass's built-in constant folding code (which actually propagates, too) that was broken. No change in shader-db, but prevents regressions after future commits. It seems plausible that this could be hit today, but I haven't seen it happen. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-23 11:58:50 -07:00
Topi Pohjolainen	3487d2e7bf	i965/blorp: Disable vertex element swizzling Without this, vertex elements originating directly from the vertex fetcher are not passed to wm-state correctly. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	12783aac50	i965/blorp: Let program data tell if push constants are needed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	874f2e9523	i965/blorp: Use prog data counters to guide wm/ps setup just as core upload logic does. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Topi Pohjolainen	f5e8575ab4	i965/blorp: Use prog data counters to guide sf/sbe setup just as core upload logic does. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-23 21:39:09 +03:00
Ardinartsev Nikita	01c89ccc5d	i965: Avoid division by zero. Fixes regression introduced by `af5ca43f26` Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419	2016-06-23 10:08:58 -07:00
Tim Rowley	a16d274032	swr: [rasterizer core] fix dependency bug Never be dependent on "draw 0", instead have a bool that makes the draw dependent on the previous draw or not dependent at all. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:11 -05:00
Tim Rowley	73a9154bde	swr: [rasterizer core] use wrap-around safe compares for dependency checking Move drawIDs from 64-bit to 32-bit to increase perf. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:06 -05:00
Tim Rowley	dd189536dc	swr: [rasterizer jitter] add support for component packing for 'odd' formats Add early-out if no components are enabled. Add asserts. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:51:00 -05:00
Tim Rowley	35935ca4f2	swr: [rasterizer core] track whether GS outputs viewport array index So we can skip the index gather in PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:55 -05:00
Tim Rowley	2d80295a6e	swr: [rasterizer core] GS viewport array index attribute Only adds the attribute mapping to the jitter; no implementation yet. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:47 -05:00
Tim Rowley	c7cd33b605	swr: [rasterizer core] conservative rasterization frontend support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:41 -05:00
Tim Rowley	c867c22d85	swr: [rasterizer core] stop single-threaded crash on exit Function static destructors were getting called by exit handlers before context teardown. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:36 -05:00
Tim Rowley	0f025eb478	swr: [rasterizer jitter] small fetch jit cleanup Handle SGV stores separate from the stream fetch code. Because of this change, there is a potential to jit an extra unused store. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:30 -05:00
Tim Rowley	eca877f27b	swr: [rasterizer core] remove old comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:25 -05:00
Tim Rowley	d3d97f8395	swr: [rasterizer jitter] cleanup supporting different llvm versions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:19 -05:00
Tim Rowley	42215e6116	swr: [rasterizer jitter] uninitialized component fix in fetch jit Was trying to store an extra uninitialized component. Only affects component packing, which isn't enabled (yet). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:12 -05:00
Tim Rowley	b6d2c96851	swr: [rasterizer] add support for building avx512 version Currently, most code paths between AVX2 and AVX512 are identical (see changes to knobs.h). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:50:05 -05:00
Tim Rowley	695af2a7e2	swr: [rasterizer common] fix include for Intel compiler Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:49:59 -05:00
Tim Rowley	95f21a9766	swr: [rasterizer common] workaround clang for windows __cpuid() bug Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 10:49:46 -05:00
Tim Rowley	9ca741c645	swr: push/pop DEBUG macro around llvm includes llvm redefines DEBUG; adding push/pop prevents an undefined reference to debug_refcnt_state in llvm-3.7+. v2: add undef DEBUG Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-23 09:58:08 -05:00
Jose Fonseca	805dbdf06d	include: Require MSVC 2013 Update 4. Earlier MSVC 2013 releases have troubles compiling some of our C99 code, so make sure we have Update 4 to avoid confusion. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 15:07:19 +01:00
Brian Paul	4f5d513755	svga: rename svga_surface_copy() to svga_resource_copy_region() To be consistent with the pipe_context function name. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Brian Paul	743ff588f2	svga: don't copy blit_info into local var There's no reason for doing so. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Brian Paul	e0dc3c5f19	gallium/util: fix some 4-space indentation in blitter code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	2aa9ff0cda	svga: fix texture array update regression With commit `fb9fe35`, we started using transfer_inline_write for the memcpy TexSubImage path, but that triggers a regression with texture arrays in the svga driver. With this patch, the direct map code will update the texture array correctly. Fixes VMware bug 1679293. Tested with MTT piglit, glretrace, conform. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	d4a77254cb	svga: fix index/vertex buffer surface reference at draw Currently with the SetVertexBuffers optimization, we avoid emitting redundant DXSetVertexBuffers commands. However, these buffer surfaces will still need to be referenced; otherwise, in the case of Linux, the subsequent surface discard map will map to the existing mob instead of a new one, causing rendering artifacts. With this patch, we'll call resource_rebind() to reference the resources even if we are avoiding the actual set command. This fixes the rendering artifacts in the window title area running with Unity in Ubuntu 14.04. Tested with piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	2b81e31d44	svga: fix vertex buffer references in the hw state This patch fixes three issues with vertex buffer references: (1) Instead of copying the vertex buffer resource handles to the hw state in the context structure, use pipe_resource_reference to properly reference the vertex buffer resources in the context. (2) Make sure to unbind unused vertex buffer resources. (3) Force a rebind of the vertex buffer resources at the first draw of each command buffer to make sure the vertex buffer resources are paged in. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:20 -06:00
Charmaine Lee	a1d74f5528	svga: fix index buffer reference in the hw state Instead of copying the index buffer resource handle to the hw state in the context structure, use pipe_resource_reference to properly reference the index buffer resource in the context. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-23 07:31:19 -06:00
Timothy Arceri	ab99196b6b	glsl/mesa: stop duplicating geom and tcs layout values We already store these in gl_shader and gl_program, so here we remove them from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-23 11:01:46 +10:00
Timothy Arceri	24b3be0938	glsl/mesa: stop duplicating tes layout values We already store these in gl_shader and gl_program, so here we remove them from gl_shader_program and just use the values from gl_shader. This will allow us to keep the shader cache restore code as simple as it can be while making it somewhat clearer where these values originate from. V2: remove unnecessary NULL check Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral <itoral@igalia.com>	2016-06-23 11:01:36 +10:00
Edward O'Callaghan	f3ae370a36	.mailmap: Fixup my email address Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-06-23 00:00:46 +02:00
Christian Gmeiner	22304554a2	st/mesa: expose EXT_vertex_array_bgra when supported by backend Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-22 12:46:08 -07:00
Jason Ekstrand	c2f2c8e407	anv: Use different BOs for different scratch sizes and stages This solves a race condition where we can end up having different stages stomp on each other because they're all trying to scratch in the same BO but they have different views of its layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:45 -07:00
Jason Ekstrand	45c0f60999	genxml: Make ScratchSpaceBasePointer an address instead of an offset While we're here, we also fixup MEDIA_VFE_STATE and rename the field in 3DSTATE_VS on gen6-7.5 to be consistent with the others. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:42 -07:00
Jason Ekstrand	966bed17c1	anv: Add an allocator for scratch buffers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:39:20 -07:00
Jason Ekstrand	89ded099f8	genxml: Put append counter fields before MCS in RENDER_SURFACE_STATE on gen7 The pack header generation scripts can't handle the case where you have two addresses in the same dword; they just take whatever is the last one. This meant that the MCS address wasn't properly getting handled. Since we don't care about append counters, we can just re-arrange the XML for now. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	d82322eb18	anv,isl: Lower storage image formats in anv ISL was being a bit too clever for its own good and lowering the format for us. This is all well and good if we always want to lower it. However, the GL driver selectively lowers the format depending on whether the surface is write-only or not. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	97f12773b8	isl/state: Allow for full 31-bit buffer texture sizes Ivy Bridge and above can handle up to 2^31 elements for RAW buffer surfaces. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	bb64e666ba	isl/state: Don't use designated initializers for buffer surface state Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	4061fde66e	isl/state: Add assertions for buffer surface restrictions Acked-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	ce24097abe	isl/state: Don't set SurfacePitch for gen9 1-D textures This field is ignored by the hardware in this case and, on very large 1-D textures, it can end up being larger than the maximum allowed value. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	f47e23a8b6	isl/state: Use TILEWALK_XMAJOR for linear surfaces on gen7 This matches better what happens on gen8 where the "Tiled Surface" and "Tile Walk" bits are combined into a single two-bit value. This is also more consistent with what the GL driver does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	96706bad5f	isl/state: Emit no-op mip tail setup on SKL This hasn't ever been a problem in the past but it is recommended by the hardware docs. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	14d7c16e50	isl/state: Only set cube face enables if usage includes CUBE_BIT It seems safe to set it all the time, but this reduces the diff between the way i965 does it and what ISL does. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	5d24e9cfa1	isl/state: Use the layout for computing qpitch rather than dimensions For depth/stencil 1-D textures on SKL, we want them laid out in the old format that has been used since gen4. In order for the surface state fill-out code to handle this, it needs to distinguish based on layout rather than just dimensionality. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	6a43204afa	isl/state: Set the IntegerSurfaceFormat bit on Haswell This fixes 688 Vulkan CTS tests on Haswell. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	324103da75	isl/format: Mark R9G9B9E5 as containing 9-bit unsigned float channels Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	215282c9f4	isl/state: Don't set RenderTargetViewExtent for texture surfaces The docs specify that this only matters for render targets and surfaces used with typed dataport messages. On some platforms (gen4-6) the Depth field has more bits than RenderTargetViewExtent so we can have textures with more levels than we can render to. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	bb326f7b01	isl/state: Set SurfaceArray based on the surface dimension According to the PRM, you can't set SurfaceArray for 3D or buffer textures. There doesn't seem to be a good reason not to set it when we can. On the other hand, if we don't set it we can end up getting strange results for 1-layer array textures such as textureSize() returning the wrong results. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	d050ffbce9	isl/state: Don't force-disable L2 bypass for everything We already set the bit in the few cases where it's required by the docs so there's no need to set it all the time. This has no noticeable perf impact for Dota 2 on Vulkan with the time demo I have. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	87f0ffa646	isl/state: Refactor the setup of clear colors This commit switches clear colors to use #if's instead of a C if. This lets us properly handle SNB where the clear color field doesn't exist. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	62a5e6e031	isl/state: Refactor the per-gen isl_to_gen_h/valign tables This moves the #if's around so that halign and valign have different sets of #if conditions. This also prepares us for SNB because isl_to_gen_halign is not defined at all on gen6. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	b1b0d6fb54	isl/state: Return an extent3d from the halign/valign helper Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	a60ae9e10a	isl/state: Put pitch calculations together This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	70c8afc0c8	isl/state: Put all dimension setup together and towards the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	e66e70ef47	isl/state: Put surface format setup at the top This is purely cosmetic, but it makes things look a bit more readable. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	39baea551f	isl/state: Remove some unused fields They're already zero-initialized and we have no plans of doing anything more interesting with them. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	caf2af4181	isl/state: Don't use designated initializers for the surface state While designated initializers are nice, they also force us to put some things in the initializer and some things later. Surface state setup is complicated enough that this really hurts readability in the long run. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	de1d194856	genxml/gen8,9: Prefix the multisample format enum with MSFMT This is what gen7 does and it's nice to have a prefix Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	320de71858	i965/blorp: Only set src_z for gen8+ 3D textures Otherwise, we end up with a bogus value in the third component. On gen6-7 where we always use 2D textures, this can cause problems if the SurfaceArray bit is set in the SURFACE_STATE. Acked-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	664dc89a1b	i965/gen7,8: Set SURFACE_IS_ARRAY for all non-3D texture types There's no real reason why we shouldn't set this bit. It does affect how the sampler operates a bit but since you can have a 2D non-array view of a 2D_ARRAY texture that distinction is very weak. Also, this is what ISL will do and we would like this change to be isolated from using ISL. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	2a1cc94d27	i965/gen4: Subtract 1 from buffer sizes The PRM states that the values put in Width, Height, and Depth should be various bits from the value size - 1. We seem to have done this wrong more-or-less from the start. Reviewed-by: Chad Versace <chad.versace@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	e8580b8f98	i965: Remove fake W-tiled render target support This hasn't been used since `1cfb4bc890` where we deleted the meta stencil blit path. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	0195299c86	i965/fs: Use a default Y coordinate of 0 for TXF on gen9+ Previously, we were incrementing length but not actually putting anything in the Y coordinate. This meant that 1-D TXF operations had a garbage array index. If the surface is emitted as 1-D non-array, the coordinate gets discarded and it works fine. If it happens to be bound as an array surface, it may count as an out-of-bounds array access and you get zero. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	1436238b75	i965/gen8: Use the qpitch from the aux_mt for AUX_QPITCH Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	620f81d2ed	i965/blorp/gen8: Use the correct max level and layer in emit_surface_states We were adding in the base which is wrong because the values given in the miptree are relative to zero and not the base layer/level. Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	6ba88bce64	i965: Drop the maximum 3D texture size to 512 on Sandy Bridge The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be set to the depth of a 3-D texture when rendering. Unfortunately, that field is only 9 bits on Sandy Bridge and prior so we can't actually bind a 3-D texture for rendering if it has depth > 512. On Ivy Bridge, this field was bumped to 11 bits so we can go all the way up to 2048. On Iron Lake and prior, we don't support layered rendering and we use OffsetX/Y hacks to render to particular layers so 2048 is ok there too. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	0f9cd74aab	i965/gen4-6: Handle gl_texture_object::BaseLevel and MinLayer correctly This is basically a direct translation of what we do for gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Jason Ekstrand	ee39d3ba91	i965/gen4: Pull texture formats from the texture object not the miptree This makes texture views sort-of work. It doesn't add full texture view support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats piglit test on Iron Lake. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-22 12:26:43 -07:00
Kenneth Graunke	77d6add00d	i965: Fix point size with tessellation/geometry shaders in GLES. Our previous code worked for desktop GL, and ES without geometry or tessellation shaders. But those features require fancier point size handling. Fortunately, we can use one rule for all APIs. Fixes a number of dEQP tests with EXT_tessellation_shader enabled: dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.* Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-22 12:22:50 -07:00
Marek Olšák	5d85a21fee	.mailmap: fix my main address	2016-06-22 14:45:52 +02:00
Timothy Arceri	356ea9a8da	i965: move vs outputs written into a helper We will reuse this for fs key generation for the on disk shader cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-22 20:59:26 +10:00
Nicolai Hähnle	3948cd3797	st/mesa: use a single memcpy in st_ReadPixels when possible This avoids costly address recomputations, function overhead, and may trigger large copy optimizations. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-22 11:44:03 +02:00
Ilia Mirkin	36ed1b695e	glsl: only match gl_FragData and not gl_SecondaryFragDataEXT There's special logic around finding gl_FragData. It latches onto any array with FRAG_RESULT_DATA0. However gl_SecondaryFragDataEXT[], added by GL_EXT_blend_func_extended, fits those parameters as well. The real frag data array should have index 0 though, so we can use that to distinguish them. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96617 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-21 21:58:34 -04:00
Ilia Mirkin	1f4bca798d	nv50,nvc0: fix start_instance in manual push path The start instance is applied as an offset into the buffer directly, ignoring the divisor, not as an instance id offset that respects the divisor. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-21 21:50:16 -04:00
Ilia Mirkin	5b0d64886d	translate: fix start_instance parameter in sse version The generic version gets this right already, but this was using an incorrect formula in SSE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-21 21:50:16 -04:00
Jason Ekstrand	35b53c8d47	anv/cmd: Dirty descriptor sets when a new pipeline is bound Ever since `c2581a9375`, the binding table layout has depended on the pipeline. This means that whenever we change pipelines we also need to re-emit binding tables for the new layout. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	2bfe0c3374	anv/cmd: Move emit_descriptor_pointers to genX_cmd_buffer.c It's tiny and fully generic so there's really no reason for it to be in a gen7-specific file. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	9df4d6bb36	anv/cmd: Move flush_descriptor_sets to anv_cmd_buffer.c There's no good reason for recompiling it Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Jason Ekstrand	295e03c980	spirv: Use the system value version of gl_FrontFace SPIR-V treats it as an input but NIR wants the system value. This shouldn't have been too much of a surprise given that we have to do the same conversion in the GLSL IR to NIR pass. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-21 16:45:25 -07:00
Kenneth Graunke	40013c5033	i965: Reorganize prog_data->total_scratch code a bit. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-21 10:24:45 -07:00
Marek Olšák	b16d21270f	radeonsi: add a debug flag for unsafe math LLVM optimizations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Marek Olšák	70a25478fe	radeonsi: use u_blitter for mipmap generation This reduces time spent in glGenerateMipmap by half. v2: don't decompress the levels to be overwritten Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Marek Olšák	5fed1122e8	gallium/u_blitter: implement mipmap generation for pipe_context::generate_mipmap first move some of the blit code from util_blitter_blit_generic to a separate function, then use it from util_blitter_generate_mipmap Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Nicolai Hähnle	3735a925ef	st/mesa: cache staging texture for glReadPixels v2: add ST_DEBUG flag for disabling (suggested by Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-06-21 11:02:41 +02:00
Nicolai Hähnle	a571859fc4	st/mesa: invalidate readpixels cache Whenever a draw happens or some other function call might change the result of future glReadPixels calls, we must invalidate the cache. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:19 +02:00
Nicolai Hähnle	615ba11563	st/mesa: add readpix_cache structure Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:16 +02:00
Nicolai Hähnle	b74c23138c	st/mesa: move ReadPixels blit into a separate function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:12 +02:00
Nicolai Hähnle	f9ddd52317	st/mesa: flush bitmap cache before CopyImageSubData Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:10 +02:00
Nicolai Hähnle	e7fff3cfe1	st/mesa: flush bitmap cache before texture functions As far as I can tell, a sequence of glBitmap followed by texture functions that refer to a texture bound as the framebuffer is well within what should be allowed. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:08 +02:00
Nicolai Hähnle	c542b7e43d	st/mesa: flush bitmap cache before compute dispatch In the unlikely case that a program uses glBitmap to render to a framebuffer whose texture is bound in a compute shader. Found by inspection. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-21 10:54:00 +02:00
Timothy Arceri	644e015f0b	i965: get PrimitiveMode from the program rather than the shader struct This is more consistent with what we do elsewhere and will allow us to only cache one of the values in the shader cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-21 12:43:18 +10:00
Vedran Miletić	82e0bbd01a	clover: Fix build against clang SVN >= r273191 setLangDefaults() now requires PreprocessorOptions as an argument. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-21 10:08:57 +09:00
Kenneth Graunke	cd89c834a8	i965: Fix multiplication of immediates on Cherryview/Broxton. Cherryview and Broxton don't support DW x DW multiplication. We have piles of code to handle this, but apparently weren't retyping in the immediate case. For example, tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes makes the simulator angry about instructions such as: mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D Just retype to W or UW. It should be safe on all platforms. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-20 17:48:03 -07:00
Jason Ekstrand	eb6764c4a7	anv: Add proper support for depth clamping Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:04:08 -07:00
Jason Ekstrand	8a46b505cb	anv/cmd_buffer: Split emit_viewport in two Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:09 -07:00
Jason Ekstrand	20e95a746d	anv/cmd_buffer: Set depth/stencil extent based on the image It used to be based on the framebuffer which isn't quite right. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:05 -07:00
Jason Ekstrand	b65f2e4163	anv/cmd_buffer: Don't crash if push constants are provided for missing stages Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:03:02 -07:00
Jason Ekstrand	e6c2fe4519	anv/pipeline: Do invariance propagation on SPIR-V shaders Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:58 -07:00
Jason Ekstrand	bec07b7292	nir/alu_to_scalar: Respect the exact ALU operation qualifier Just setting builder->exact isn't sufficient because that only applies to instructions that are built with the builder but instructions created manually and only inserted using the builder are left alone. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:55 -07:00
Jason Ekstrand	202751fbb7	nir: Add a pass for propagating invariant decorations This pass is similar to propagate_invariance in the GLSL compiler. The real "output" of this pass is that any algebraic operations which are eventually consumed by an invariant variable get marked as "exact". Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 12:02:45 -07:00
Jason Ekstrand	68e308d853	nir/algebraic: Remove imprecise flog2 optimizations While mathematically correct, these two optimizations result in an expression with substantially lower precision than the original. For any positive finite floating-point value, log2(x) is well-defined and finite. More precisely, it is in the range [-150, 150] so any sum of logarithms log2(a) + log2(b) is also well-defined and finite as long as a and b are both positive and finite. However, if a and b are either very small or very large, their product may get flushed to infinity or zero causing log2(a * b) to be nowhere close to log2(a) + log2(b). This imprecision was causing incorrect rendering in Talos Principle because part of its HDR rendering process involves doing 8 texture operations, clamping the result to [0, 65000], taking a dot-product with a constant, and then taking the log2. This is done 6 or 8 times and summed to produce the final result which is written to a red texture. In cases where you have a region of the screen that is very dark, it can end up getting a result value of -inf which is not what is intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96425 Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-20 11:56:57 -07:00
Ian Romanick	895f7ddfb5	i965: Delete redundant extension enables A nearly identical block already exists in the gen >= 6 block above. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-20 11:18:39 -07:00
Ian Romanick	d3a5cae60a	mesa: Fix incorrect "see also" comments Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-20 11:18:39 -07:00
Ian Romanick	08cd234db8	mesa: Silence unused parameter warning main/pipelineobj.c: In function ‘delete_pipelineobj_cb’: main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter] delete_pipelineobj_cb(GLuint id, void *data, void *userData) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-20 11:18:38 -07:00
Rob Clark	64180de1bf	gallium: make image_view const Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Rob Clark	ef534b9389	gallium: make constant_buffer const Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Rob Clark	e1c1c40cbc	gallium: make shader_buffers const Be consistent with the rest of the "set_xyz" state interfaces. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-20 12:36:20 -04:00
Nicolai Hähnle	1167905c41	radeonsi: use trapezoid distribution for tess on Fiji and Polaris This yields a small performance improvement in Unigine Heaven. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:29:55 +02:00
Nicolai Hähnle	650137a9c8	radeonsi/sid: add Fiji+ tessellation distribution mode Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:29:15 +02:00
Nicolai Hähnle	32fd92e028	radeonsi: emit PA_SC_RASTER_CONFIG_1 only once It is the same for all SEs. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:34 +02:00
Nicolai Hähnle	c95175581e	radeonsi: fix calculation of valid RB mask per SE The old calculation treated too many RBs as disabled. Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:31 +02:00
Nicolai Hähnle	6c2e636982	radeonsi: raise SI_PM4_MAX_DW The old limit, introduced in commit `afa752d3f0`, was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-20 18:28:17 +02:00
Roland Scheidegger	b0cf99165a	gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9 Apparently, these are deprecated. There's some AutoUpgrade feature which is supposed to promote these to cmp/select, which apparently doesn't work with jit code. It is possible it's not actually even meant to work (see the bug filed against llvm which couldn't provide an answer either) but in any case this is meant to be only temporary unless the intrinsics are really illegal. So, just use the fallback code (which should be cmp/select, we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages to optimize this back to pmin/pmax in the end). This addresses https://llvm.org/bugs/show_bug.cgi?id=28176 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Aaron Watry <awatry@gmail.com>	2016-06-20 17:19:03 +02:00
Ilia Mirkin	154c0a42a2	nvc0: don't make use of push hint if there are no non-const user vbos This makes the check match up with what we do on nv50 as well - there's no point in switching over to the push path if everything's in managed buffers. This can happen when a shader uses a vertex without an enabled array - we end up passing it a constant attribute. This also has the effect of "fixing" some flickering in Talos. I have no idea why. I've stared at the push logic forwards, backwards, and sideways. By always forcing the push path (which is slow), the flickering also goes away, but other rendering is still wrong (specifically draw 383068 as identified in the bug). However by not switching over to the push path, draw 383068 is correct. Note that other flickering remains in Talos, like the red/green walls/floors. This takes care of the shadow flickering though. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-19 10:14:57 -04:00
Ilia Mirkin	1804aa0b80	gk104/ir: fix tex use generation to be more careful about eliding uses If we have a loop, instructions before the tex might be added as tex uses, and those may in fact dominate all other uses of the tex results. This however doesn't mean that we don't need a texbar after the tex. Only check if uses dominate each other if they are dominated by the tex. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565 Fixes: `7752bbc44` (gk104/ir: simplify and fool-proof texbar algorithm) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-19 10:14:46 -04:00
Ilia Mirkin	194bcb49d1	nv50: add support for GL_EXT_window_rectangles Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	b21a00d129	nvc0: add support for GL_EXT_window_rectangles Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	d1bdc1238a	st/mesa: add support for GL_EXT_window_rectangles Make sure to pass the requisite information in draws, blits, and clears that work on the context's draw buffer. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-18 13:38:30 -04:00
Ilia Mirkin	07fcb06fe0	gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all drivers This says how many window rectangles are supported by the implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 13:38:29 -04:00
Ilia Mirkin	82fab73246	gallium: add API for setting window rectangles Window rectangles apply to all framebuffer operations, either in inclusive or exclusive mode. They may also be specified as part of a blit operation. In exclusive mode, any fragment inside any of the specified rectangles will be discarded. In inclusive mode, any fragment outside every rectangle will be discarded. The no-op state is to have 0 rectangles in exclusive mode. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-18 12:59:12 -04:00
Ilia Mirkin	d68c1e2ac2	mesa: add GL_EXT_window_rectangles state storage/retrieval functionality Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 12:51:55 -04:00
Ilia Mirkin	78506ad246	glapi: add GL_EXT_window_rectangles entrypoints Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-18 12:51:55 -04:00
Samuel Pitoiset	b214e0d2fb	nv50/ir: add missing strings for some recent sysvals This is pretty useful for debugging purposes and those should not be omitted. Fixes: `517a93b3` ("nvc0: add ARB_shader_draw_parameters support") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-18 18:34:50 +02:00
Bruce Cherniak	6b0ac95c28	swr: Update screen->context pointer with multiple contexts. A pipe pointer in the screen allows for access to current device context in flush_frontbuffer and resource_destroy. This wasn't tracking current context in multi-context situations. v2: More caffeine. Corrected compare, removed unnecessary set of screen-pipe in create_context, and added a few comments.	2016-06-17 13:56:03 -05:00
Brian Paul	ace3124f22	scons: put the generated git_sha1.h file in top-level src/ directory To match what's done in the automake build. v2: Use git rev-parse to get a 10-character hash ID Fix Python imports Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-06-17 10:33:00 -06:00
Tim Rowley	5a64549f54	swr: switch from overriding -march to selecting features Acked-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-17 10:34:17 -05:00
Timothy Arceri	481e924951	mesa: remove remaining tabs in api_validate.c Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-06-17 22:07:21 +10:00
Samuel Iglesias Gonsálvez	bdab572a86	i965/fs: indirect addressing with doubles is not supported in CHV/BSW/BXT From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions, page 844: "When source or destination datatype is 64b or operation is integer DWord multiply, indirect addressing must not be used." v2: - Fix it for Broxton too. v3: - Simplify code by using subscript() and not creating a new num_components variable (Kenneth). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-17 11:33:18 +02:00
Iago Toral Quiroga	0177dbb6c2	i965/fs: Fix single-precision to double-precision conversions for CHV/BSW/BXT From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine, Register Region Restrictions: "When source or destination is 64b (...), regioning in Align1 must follow these rules: 1. Source and destination horizontal stride must be aligned to the same qword. (...)" v2: - Fix it for Broxton too. v3: - Remove inst->regs_written change as it is not necessary (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-17 08:46:02 +02:00
Kenneth Graunke	48593eaf2d	docs: Mention GL_ARB_ES3_1_compatibility in release notes. Ilia reminded me that I forgot this.	2016-06-16 17:10:35 -07:00
Kenneth Graunke	a08a16541b	i965: Fix comment about CS scratch space encodings on Broadwell+. I typo'd this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 16:11:35 -07:00
Kenneth Graunke	93d8f80a9a	docs: Update ARB_ES3_1_compatibility status for i965.	2016-06-16 14:39:44 -07:00
Kenneth Graunke	1f9445ff57	i965: Drop perf_debug about rasterizer discard in SOL vs. clipper. I recently experimented with performing rasterizer discard in the SOL unit instead of the clipper, and as far as I can tell, it's basically the same performance. The clipper comes directly after SOL anyway, and setting the clipper to REJECT_ALL should be pretty darn cheap. Keep the perf_debug on Sandybridge, where the GS actually does work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 14:37:07 -07:00
Kenneth Graunke	32b1c0b694	i965: Enable GL_ARB_ES3_1_compatibility on Gen8+ if CS are available. There are almost no tests in any test suite, but what little I've found seems to work. Ilia believes everything is in place. v2: Predicate the enable on ES 3.1 being available (Gen8+) and also ARB_compute_shader being available (requested by Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-16 14:33:24 -07:00
Ian Romanick	6bec55a780	mesa: If validation fails in a debug context just emit a debug message There are quite a few pipelines that desktop applications (including a bunch of piglit tests) can expect to have run but don't meet the GLES requirements. Instead of failing validation, just emit a debug message. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-16 09:33:54 -07:00
Ian Romanick	9c87282041	glsl: Always strip arrayness in precision_qualifier_allowed Previously some callers of precision_qualifier_allowed would strip the arrayness from the type and some would not. As a result, some places would not notice that float[6], for example, needed a precision qualifier. Fixes the new piglit test no-default-float-array-precision.frag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-06-16 09:33:53 -07:00
Jose Fonseca	d04f652b75	mesa/main: Update _mesa_new_shader. Left over from `31dee99e05`. It should fix Clang Windows build. Trivial.	2016-06-16 15:22:37 +01:00
Christian König	6d877d7121	st/vdpau: we support lumakeying now Signed-off-by: Christian König <christian.koenig@amd.com>	2016-06-16 09:41:13 +02:00
Christian König	bf89e672cf	vl: support luma keying for interlaced surfaces as well We had the CSC code twice in there, factor it out into a separate function. Signed-off-by: Christian König <christian.koenig@amd.com>	2016-06-16 09:41:12 +02:00
Timothy Arceri	456b5d9ac9	i965: remove remaining tabs in brw_link.cpp Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 16:24:19 +10:00
Mathias Fröhlich	0e73d9454d	vbo: Use a bitmask to track the active arrays in vbo_save*. The use of a bitmask makes functions iterating only active attributes less visible in profiles. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	bc4e0c4868	vbo: Use a bitmask to track the active arrays in vbo_exec*. The use of a bitmask makes functions iterating only active attributes less visible in profiles. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	22e5d4a1ee	mesa: Use bitmask/ffs to iterate the active_samplers bitmask. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	34f741b080	mesa: Use bitmask/ffs to iterate the enabled textures. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	11a5b776c2	mesa: Use designated bool value to check texture unit completeness. The change helps to use the bitmask/ffs in the next change. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	c14ec9aafa	mesa: Use bitmask/ffs to iterate SamplersUsed Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	53691b7cb1	i965: Use bitmask/ffs to iterate used vertex attributes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	b670f0d1d7	i965: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:55 +02:00
Mathias Fröhlich	a0fe569e53	radeon/r200: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	dc9e604ef1	mesa: Use bitmask/ffs to iterate enabled clip planes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	d8a3ac90df	mesa: Use bitmask/ffs to iterate color material attributes. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	d4eb2f9cda	mesa: Use bitmask/ffs to build ff fragment shader keys. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. The bitmask used here for iteration is a combination of different enabled masks present for texture units. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	3ee409bebf	mesa: Use bitmask/ffs to build ff vertex shader keys. Replaces an iterate and test bit in a bitmask loop by a loop only iterating over the bits set in the bitmask. The bitmask used here for iteration is a combination of different enabled masks present for texture units. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	b5820759de	mesa: Remove the linked list of enabled lights Clean up after conversion to bitmasks. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	21f7f67685	mesa: Switch to bitmask based enabled lights in gen_matypes.c Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	f0391ba6c1	radeon/r200: Use bitmask/ffs to iterate enabled lights Replaces a loop that iterates all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	f69a400513	nouveau: Use bitmask/ffs to iterate enabled lights Replaces a loop that iterates all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:54 +02:00
Mathias Fröhlich	9a3fcb010c	tnl: Use bitmask/ffs to iterate enabled lights Replaces loops that iterate all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	664aec4370	mesa: Use bitmask/ffs to iterate enabled lights for ff shader keys. Replaces a loop that iterates all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	ccb1be2fab	mesa: Use bitmask/ffs to iterate enabled lights Replaces loops that iterate all lights and test which of them is enabled by a loop only iterating over the bits set in the enabled bitmask. v2: Use _mesa_bit_scan{,64} instead of open coding. v3: Use u_bit_scan{,64} instead of _mesa_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	b60c730235	mesa: Track enabled lights in a bitmask This enables some optimizations afterwards. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	6749d77c69	mesa: Rename CoordReplaceBits back to CoordReplace. It used to be called like that and fits better with 80 columns. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	291f00fa12	mesa: Remove the now unused CoordsReplace array. Now that all users are converted, remove the array. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	d19c69659a	i965: Convert i965 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:53 +02:00
Mathias Fröhlich	97f67be0a7	i915: Convert i915 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	8e01fd6396	r200: convert r200 to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	da79d76503	gallium: Convert the state_tracker to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	664ba9ccc9	swrast: Convert swrast to use CoordsReplaceBits. Switch over to use the CoordsReplaceBits bitmask. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Mathias Fröhlich	1c78515d93	mesa: Add gl_point_attrib::CoordReplaceBits bitfield. The aim is to replace the CoordReplace array by a bitfield. Until all drivers are converted, establish the bitfield in parallel to the CoordReplace array. v2: Fix bitmask logic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-16 05:50:52 +02:00
Timothy Arceri	31dee99e05	mesa/glsl: stop using GL shader type internally Instead use the internal gl_shader_stage enum everywhere. This makes things more consistent and gets rid of unnecessary conversions. Ideally it would be nice to remove the Type field from gl_shader altogether but currently it is used to differentiate between gl_shader and gl_shader_program in the ShaderObjects hash table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-16 10:45:35 +10:00
Brian Paul	bb1292e226	auxiliary/os: allow appending to GALLIUM_LOG_FILE If the log file specified by the GALLIUM_LOG_FILE environment variable begins with '+', open the file in append mode. This is useful to log all gallium output for an entire piglit run, for example. v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-15 17:16:42 -06:00
Chad Versace	c99a0a8bce	anv: Fix a harmless overflow warning anv_pipeline_binding::index is a uint8_t, but some code assigned to it UINT16_MAX. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-15 15:34:13 -07:00
Rob Herring	067c5b10b6	vc4: fix vc4_resource_from_handle() stride calculation The expected stride calculation is completely wrong. It should ultimately be multiplying cpp and width rather than dividing. The width also needs to be aligned to the tiling width first before converting to stride bytes. The whole stride check here is possibly pointless. Any buffers which were allocated outside of vc4 may have strides with larger alignment requirements. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-06-15 14:54:38 -07:00
Kenneth Graunke	c319512e16	i965: Use a uniform for gl_PatchVerticesIn in the TCS on Gen8+. We still need to recompile the passthrough shader when this value changes, as it also affects the output vertex count. But otherwise, we can eliminate recompiles on Gen8+. We probably want to do this for Gen7 as well, but that requires rewriting the input release code to use a loop, which is a trade-off I'd need to consider in more detail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:47:37 -07:00
Kenneth Graunke	2b867264d2	glsl: Optionally lower TCS gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so the best way to implement this is to pass it in via a uniform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:47:37 -07:00
Kenneth Graunke	1bc194cd64	i965: Use a uniform for gl_PatchVerticesIn in the TES. Fixes three GL44-CTS.tessellation_shader subtests: - max_patch_vertices - single.max_patch_vertices - tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn These use gl_PatchVerticesIn in the TES, but don't link against a TCS (which would allow the linker to lower it to a constant). We had no handling for the system value in the backend, so it would just assert fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:44:44 -07:00
Kenneth Graunke	0be2105137	glsl: Optionally lower TES gl_PatchVerticesIn to a uniform. i965 has no special hardware for this, so we need to pass this value in as a uniform (unless the TES is linked against a TCS, in which case the linker can just replace this with a constant). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-15 12:44:09 -07:00
Marek Olšák	d794072b3e	winsys/radeon: use the common job queue for multithreaded command submission v2 v2: fixup after renaming to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Marek Olšák	562cb03d76	gallium/util: import the multithreaded job queue from amdgpu winsys (v2) v2: rename the event to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Nicolai Hähnle	44e0c0e6ec	radeonsi: fix undefined left-shift into sign bit Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-15 09:27:56 +02:00
Nicolai Hähnle	494e4b8976	st_glsl_to_tgsi: don't read potentially uninitialized buffer variable Found by -fsanitize=undefined. Note that this should be a harmless issue in practice because the inst->op check always dominates anyway. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-15 09:27:40 +02:00
Nicolai Hähnle	6510e07345	mesa/main: fix integer overflows in _mesa_image_offset Found using -fsanitize=undefined. Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-15 09:27:30 +02:00
Timothy Arceri	a8a9d1bf41	i965: remove type_size_vec4_times_4() type_size_vec4_times_4() was introduced as a fix in `8dcf807cb4` however since `3810c1561` we can just use type_size_scalar() and get the actual number of outputs we need. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-15 15:01:10 +10:00
Kenneth Graunke	8b408972ff	mesa: Pass gl_constant_value union into _mesa_fetch_state(). We've had some trouble in the past with copying integers around via float pointers, as the C compiler sometimes uses x87 floating point registers to load values on 32-bit systems. Passing the gl_constant_value union should be safer. To avoid churn, this patch creates a "GLfloat *value" variable so existing uses can stay the same. Not observed to fix anything, but I was in the area adding more integer state vars, and thought it'd be wise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-14 16:09:57 -07:00
Marek Olšák	6ef50efc10	gallium/radeon: num-cs-flushes query should display per-frame average Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	4140afd04b	gallium/radeon: add driver queries for compute/dma call stats and spills also print the average count per frame Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	8fc688c303	radeonsi: don't generate "ret void undef" Use LLVMBuildRetVoid in epilogs and the GS copy shader and si_llvm_build_ret otherwise. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-14 20:22:16 +02:00
Marek Olšák	4eea710b0d	radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clear We could also do MSAA resolve in a compute shader like Vulkan and remove these workarounds. v2: comment the magic numbers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	373060652c	radeonsi: clarify the MSAA resolve limitation with scanout this is the correct hw requirement Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	789618e3b4	gallium/radeon: add micro_tile_mode to radeon_surf for easier access Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Gurchetan Singh	63c5d5c6c4	Added pbuffer hooks for surfaceless platform This change enables the creation of pbuffer surfaces on the surfaceless platform. v3: Going back to single-buffered pbuffer plus additional code review changes Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-06-14 08:51:02 -07:00
Roland Scheidegger	afbf5888f5	gallium/util: don't use blocksize for minify for assertions The previous assertions required for texture sizes smaller than block_size that src_box.x + src_box.width still be block size. (e.g. for a texture with width 3, and src_box.x = 0, src_box.width would have to be 4 to not assert.) This caused some assertions with some other state tracker. It looks though like callers aren't expected to round up widths to block sizes (for sizes larger than block size the assertion would still have verified it wouldn't have been rounded up) so we simply shouldn't use a minify which rounds up to block size. (No piglit change with llvmpipe.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-14 17:03:34 +02:00
Roland Scheidegger	f4184d5450	llvmpipe: hack-fix bugs due to bogus bind flags The gallium contract would be that bind flags must indicate all possible bindings a resource might get used, but fact is the mesa state tracker does not set bind flags correctly, and this is more or less unfixable due to GL. This caused a bug with piglit arb_uniform_buffer_object-rendering-dsa since `6e6fd911da` - the commit is correct, but it caused us to miss updates to fs UBOs completely, since the corresponding buffer didn't have the appropriate bind flag set (thus we wouldn't check if it is indeed currently bound). See the discussion about this starting here: https://lists.freedesktop.org/archives/mesa-dev/2016-June/119829.html So, update the bind flags when we detect such usage. Note we update this value for now only in places which matter for us - that is creating sampler/surface view, or binding constant buffer. There's plenty more places (setting streamout buffers, vertex/index buffers, ...) where things can be set with the wrong bind flags, but the bind flags there never matter. While here also make sure we only set dirty constant bit when it's a fs constant buffer - totally doesn't matter if it's vs/gs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-14 17:03:34 +02:00
Rob Clark	243417810b	freedreno: support start param for sampler views/states Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Rob Clark	b8eb1493a9	freedreno: only do extra vertex-buffer state logic on a2xx Possibly this should move into an fd2 wrapper fxn, similar to the texture state tracking done for fd3/fd4 (clamp emulation, etc) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Rob Clark	26d0efa9ce	freedreno: use util_copy_constant_buffer() helper Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-14 11:00:59 -04:00
Nayan Deshmukh	fdec8f9e42	st/vdpau: replace 0.f and 1.f with 0.0f and 1.0f respectively Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-14 15:32:04 +01:00
Tomasz Figa	e7ab358e81	i965: Check return value of screen->image.loader->getBuffers (v2) The images struct is an uninitialized local variable on the stack. If the callback returns 0, the struct might not have been updated and so should be considered uninitialized. Currently the code ignores the return value, which (depending on stack contents) might end up in reading a non-zero value from images.image_mask and dereferencing further fields. Another solution would be to initialize image_mask with 0, but checking the return value seems more sensible and it is what Gallium is doing. v2: fix typos in commit message, fix indentation, remove unnecessary parentheses and pointer dereference to keep line length reasonable. Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-14 15:32:04 +01:00
Michel Dänzer	9ee3f097b6	st/dri: Clear drawable texture_mask in dri2_invalidate_drawable This makes sure that dri_set_tex_buffer2 -> dri_drawable_validate_att will re-create the front left attachment buffer after the drawable got invalidated. Fixes window contents not updating until the window is resized when using DRI2 PRIME. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-14 18:16:54 +09:00
Eduardo Lima Mitev	a93bb2e33f	glsl/builtin_variables: Populate MaxCombinedShaderStorageBlocks on GLSL 4.40 Built-in variable "MaxCombinedShaderStorageBlocks" was added to GLSL 4.40 revision 9. Section "1.2.1 Changes since revision 8 of GLSL version 4.40", page 3 of the PDF states: "Bug 11734: Add gl_MaxCombinedShaderOutputResources and mark gl_MaxCombinedImageUnitsAndFragmentOutputs as deprecated." Fixes: GL44-CTS.shader_image_load_store.basic-glsl-const Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-14 10:21:26 +02:00
Julien Isorce	1cdb4da1d6	st/va: ensure linear memory for dmabuf In order to do zero-copy between two different devices the memory should not be tiled. Tested with GStreamer on a laptop that has 2 GPUs: 1- gstvaapidecode: HW decoding and dmabuf export with nouveau driver on Nvidia GPU. 2- glimagesink: EGLImage imports dmabuf on Intel GPU. TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-14 08:40:33 +01:00
Dylan Baker	5a87bc7181	isl: Replace bash generator with python generator This replaces the current bash generator with a python based generator using mako. It's quite fast and works with both python 2.7 and python 3.5, and should work with 3.3+ and maybe even 3.2. It produces an almost identical file except for minor layout changes and the addition of a "generated file, do not edit" warning. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 22:40:52 -07:00
Mathias Fröhlich	ed2dae86ae	mesa: Make use of u_bit_scan{,64}. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-14 05:19:10 +02:00
Mathias Fröhlich	c3b6656676	mesa/gallium: Move u_bit_scan{,64} from gallium to util. The functions are also useful for mesa. Introduce src/util/bitscan.{h,c}. Move ffs function implementations from src/mesa/main/imports.{h,c}. Move bit scan related functions from src/gallium/auxiliary/util/u_math.h. Merge platform handling with what is available from within mesa. v2: Try to fix MSVC compile. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2016-06-14 05:19:10 +02:00
Aaron Watry	fafe026dbe	clover: Include generated sources in AM_CPPFLAGS git_sha1.c is generated in $(top_builddir)/src. Fixes out-of-tree builds since `4825264f75`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-14 12:04:42 +09:00
Stephan Bergmann	0140938b26	nv50/ir: make Graph destructor virtual Avoid ASan new-delete-type-mismatch when Function::domTree is created as DominatorTree in Function::convertToSSA but destroyed only as base Graph in ~Function. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-13 22:55:11 -04:00
Jason Ekstrand	be32a21327	i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable This was removed in `d9546b0c5d` and replaced with the precise_trig driconf option. However, we still need precise trig in the Vulkan driver so this commit brings back the environment variable and compiler->precise_trig is effectively the logical OR of the two. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96484 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 19:54:47 -07:00
Samuel Iglesias Gonsálvez	a0ed8503b7	i965: Defeat the register stride checker in pull uniform messages. Pulling DF uniforms from pull constant buffer generates messages like: send(4) g12<1>DF g12<0,1,0>F sampler ld SIMD4x2 Surface = 1 Sampler = 0 mlen 1 rlen 1 which produces GPU hangs in Cherryview/Braswell: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 843: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." We should set the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:36:59 -07:00
Kenneth Graunke	ed3ba651f6	i965: Defeat the register stride checker in URB reads. Pulling DF inputs from the URB generates messages like: send(8) g23<1>DF g1<8,8,1>UD urb 3 SIMD8 read mlen 1 rlen 2 { align1 1Q }; which makes the simulator angry: "For 64-bit Align1 operation or multiplication of dwords in CHV, source horizontal stride must be aligned to qword." This seems to be documented in the Cherryview PRM, Volume 7, Page 823: "When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: 1. Source and Destination horizontal stride must be aligned to the same qword." Setting the source horizontal stride to QWord is insane, as it's the message header containing 8 URB handles in a single 32-bit DWord. Instead, we should whack the destination type to UD, D, or F so that the register stride checker doesn't notice. The destination type of send messages is basically irrelevant anyway. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:36:46 -07:00
Kenneth Graunke	9f37df06da	i965: Fix issues with number of VS URB entries on Cherryview/Broxton. Cherryview/Broxton annoyingly have a minimum number of VS URB entries of 34, which is not a multiple of 8. When the VS size is less than 9, the number of VS entries has to be a multiple of 8. Notably, BLORP programmed the minimum number of VS URB entries (34), with a size of 1 (less than 9), which is invalid. It seemed like this could be a problem in the regular URB code as well, so I went ahead and updated that to be safe. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 19:35:52 -07:00
Timothy Arceri	b010fa8567	glsl: make sure UBO arrays are sized in ES This check was removed in `5b2675093e`; add it back in. Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> https://bugs.freedesktop.org/show_bug.cgi?id=96349	2016-06-14 11:33:24 +10:00
Vedran Miletić	4825264f75	clover: Update OpenCL version string to match OpenGL Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For both, always append git version suffix from git_sha1.h. v5: move semicolon to same line as MESA_GIT_SHA1. v4: drop #ifdef guards. v3: add missing include. v2: change CL_DEVICE_VERSION as well. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-06-13 15:55:59 -07:00
Francisco Jerez	bd9f972651	i965/fs: Fix regs_written for SIMD-lowered instructions some more. ISTR having suggested this during review of the recent FP64 changes to the SIMD lowering pass, but it doesn't look like it was taken into account in the end. Using the fs_reg::component_size helper instead of this open-coded variant makes sure that the stride is taken into account correctly. Fixes at least the following piglit tests with spilling forced on (since otherwise regs_written would be calculated incorrectly and the spilling code would be rather confused about how much data needs to be spilled): spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-13 15:55:59 -07:00
Francisco Jerez	a84b5d43e2	i965: Fix cross-primitive scratch corruption when changing the per-thread allocation. I haven't found any mention of this in the hardware docs, but experimentally what seems to be going on is that when the per-thread scratch slot size is changed between two pipelined draw calls, shader invocations using the old and new scratch size setting may end up being executed in parallel, causing their scratch offset calculations to be based in a different partitioning of the scratch space, which can cause their thread-local scratch space to overlap leading to cross-thread scratch corruption. I've been experimenting with alternative workarounds, like emitting a PIPE_CONTROL with DC flush and CS stall between draw (or dispatch compute) calls using different per-thread scratch allocation settings, or avoiding reuse of the scratch BO if the per-thread scratch allocation doesn't exactly match the original. Both seem to be as effective as this workaround, but they have potential performance implications, while this should be basically for free. Fixes over 40 failures in our CI system with spilling forced on (including CTS, dEQP and Piglit failures) on a number of different platforms from Gen4 to Gen9. The 'glsl-max-varyings' piglit test seems to be able to reproduce this bug consistently in the vertex shader on at least Gen4, Gen8 and Gen9 with spilling forced on. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
Francisco Jerez	d960284e44	i965: Keep track of the per-thread scratch allocation in brw_stage_state. This will be used to find out what per-thread slot size a previously allocated scratch BO was used with in order to fix a hardware race condition without introducing additional stalls or memory allocations. Instead of calling brw_get_scratch_bo() manually from the various codegen functions, call a new helper function that keeps track of the per-thread scratch size and conditionally allocates a larger scratch BO. v2: Handle BO allocation manually instead of relying on brw_get_scratch_bo (Ken). Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
Francisco Jerez	013ae4a70a	i965: Fix scratch overallocation if the original slot size was already a power of two. The bitwise arithmetic trick used in brw_get_scratch_size() to clamp the scratch allocation to 1KB has the unintended side effect that it will cause us to allocate 2x the required amount of scratch space if the original per-thread scratch size happened to be already a power of two. Instead use the obvious MAX2 idiom to clamp the scratch allocation to the expected range. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 15:55:58 -07:00
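The overallocation described in the commit above can be illustrated with a small sketch. The helper names (`next_pow2`, `scratch_size_bitwise`, `scratch_size_max`) are hypothetical, and the `size | 1023` expression is only an assumed example of a bitwise clamp to 1KB, not necessarily the exact code that was in brw_get_scratch_size().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round x up to the next power of two.
 * Returns x unchanged if it is already a power of two. */
static uint32_t next_pow2(uint32_t x)
{
    uint32_t p = 1;
    assert(x <= 0x80000000u); /* keep the shift below from overflowing */
    while (p < x)
        p <<= 1;
    return p;
}

/* Assumed form of the bitwise trick: OR-ing in the low 10 bits forces a
 * minimum of 1KB, but it also pushes an exact power of two such as 2048
 * up to 3071, which then rounds to 4096. */
static uint32_t scratch_size_bitwise(uint32_t size)
{
    return next_pow2(size | 1023);
}

/* The MAX2-style clamp: round first, then apply the 1KB minimum. */
static uint32_t scratch_size_max(uint32_t size)
{
    uint32_t p = next_pow2(size);
    return p > 1024 ? p : 1024;
}
```

Under these assumptions the two variants agree for sizes that are not powers of two and differ by a factor of two for sizes that already are (1024 and above).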
Kenneth Graunke	2df8f4a253	mesa: Make TexSubImage check negative dimensions sooner. Two dEQP tests expect INVALID_VALUE errors for negative width/height parameters, but get INVALID_OPERATION because they haven't actually created a destination image. This is arguably not a bug in Mesa, as there's no specified ordering of error conditions. However, it's also really easy to make the tests pass, and there's no real harm in doing these checks earlier. Fixes: dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_width_height dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texsubimage3d_neg_width_height v2: Drop redundant check (caught by Anuj Phogat). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-06-13 15:38:47 -07:00
Brian Paul	cf9bb9acac	util: update some assertions in util_resource_copy_region() To cope with copies of compressed images which are not multiples of the block size. Suggested by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-13 13:30:19 -06:00
Kenneth Graunke	5a0d294d38	i965: Fix encode_slm_size() to take a generation, not a device info. In the Vulkan driver, we have the generation number (a compile time constant) but not necessarily the brw_device_info struct. I meant to rework the function to take a generation number instead of a brw_device_info pointer to accommodate this. But I forgot, and left it taking a brw_device_info pointer, while making Vulkan pass the generation number (8, 9, ...) directly. This led to crashes. Brown paper bag fix for commit `87d062a940`. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96504 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 12:23:11 -07:00
Kenneth Graunke	667e5cec76	i965: Don't leak scratch BOs for TCS/TES. These need to be freed too. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-13 12:22:06 -07:00
Nanley Chery	a4a5917248	anv/pipeline: Don't dereference NULL dynamic state pointers Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts of pCreateInfo members are moved to the earliest points at which they should not be NULL. This fixes a segfault seen in the McNopper demo, VKTS_Example09. v3 (Jason Ekstrand): - Fix disabled rasterization check - Revert opaque detection of color attachment usage Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 11:35:45 -07:00
Nanley Chery	a0d84a9ef9	anv: Document and rename anv_pipeline_init_dynamic_state() To reduce confusion, clarify that the state being copied is not dynamic. This agrees with the Vulkan spec's usage of the term. Various sections specify that the various pipeline state which have VkDynamicState enums (e.g. viewport, scissor, etc.) may or may not be dynamic. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 11:35:45 -07:00
Samuel Pitoiset	7f257abc1b	nvc0/ir: clamp the UBO index for compute on Kepler We already check that the address is not "too far", but we should also clamp the UBO index in order to avoid looking at the wrong place in the driver cb. This is a pretty rare situation though. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-13 20:12:48 +02:00
Marek Olšák	6e1b12c788	radeonsi: enable scratch coalescing This makes one particular compute shader 8x faster. Latest LLVM git is required. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-13 18:13:51 +02:00
Jimmy Berry	0c0f841e5d	st/va: hardlink driver instances to gallium_drv_video.so Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is consistent with vdpau and general gallium drivers. Note: some versions of libva can detect the gallium name and use the backend. Although that behaviour seems inconsistent since it only works for some platforms/backends. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Jan Vesely	1fb4179f92	vl: Fix trivial sign compare warnings v2: add whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Jose Fonseca <jfonseca@vmware.com> [Emil Velikov: squash a few more whitespace issues] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Rob Herring	112e988329	Android: move libdrm settings to top-level Android.common.mk Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Rob Herring	54e550ab8a	Android: disable some noisy warnings Turn off warnings for -Wpointer-arith, -Wno-missing-field-initializers, -Wno-initializer-overrides, and -Wno-mismatched-tags. These are all deemed pointless, on purpose or no plans to fix. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:29 +01:00
Emil Velikov	db8790c0da	st/mesa: inline _mesa_create_context() into its only caller Inline the function into its only caller. This way it's more obvious how the classic and gallium drivers (st/mesa) use _mesa_initialize_context. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:29 +01:00
Emil Velikov	a4fa8bf819	st/mesa: remove unneeded break from st_api_create_context() We have return on the previous line, thus the break will never be reached. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	6406bc1592	st/mesa: use c99 initializer for st_gl_api Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	15bc7856bf	gallium: remove st_api::get_proc_address hook It has been unused for a long time, plus makes the gallium dri modules require an extra glapi symbol relative to their classic counterparts. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	23a7fca6aa	mesa: remove _mesa_init_get_hash() The actual code of the function print_table_stats() is guarded by an ifdef GET_DEBUG, which has not been defined in years. The last fix in 2013 (`7db6b5aa91`) indicates that it's rarely used/tested. Since the issue has gone unnoticed for a whole year (broken with `2ad4a47547`), let's remove it for now. We can always revive it at a later stage. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	b81685eb32	mesa: kill off _mesa_do_init_remap_table() ... and inline its contents in _mesa_init_remap_table(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	bfbf286f7d	mesa: use native types when possible All of the functions and related data are internal, so there's no point in using the GL types. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	3f80c95f35	mesa: make _mesa_map_function_spec() static Used only locally. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	390678f27d	mesa: remove unused _mesa_get_function_spec() and gl_function_remap Final user was killed with last commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	5b700059a8	mesa: remove unused _mesa_map_function_array() Unused as of commit `5a175127f3` ("dri: Remove all extension enabling utility functions") and the patch before the previous patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	5378ee8187	glapi: remap_helper.py: remove MESA_alt_functions The final user was nuked with last commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	b5dd8e0cf8	mesa: remove unused function _mesa_map_static_functions() Unused as of commit `5a175127f3` ("dri: Remove all extension enabling utility functions") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-13 15:31:28 +01:00
Emil Velikov	07ae8c7df7	dri/common: remove unused libdri_test_stubs.la ... and associated file(s). No longer needed since commit `057259655e` ("i965: Don't link libmesa or libdri_test_stubs into tests") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-13 15:31:27 +01:00
Emil Velikov	fcb5a75a66	swr: automake: add missing -I flag When building from a release tarball (where the generated/built files are in srcdir) in an OOT fashion we need to have both builddir and srcdir in the includes list. Otherwise we'll error out, as the file (header gen_knobs.h in this case) won't be in the location where we are looking. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:31:24 +01:00
Emil Velikov	f4d26856df	automake: add SWR to `make distcheck' gallium drivers This will allow us to catch missing files and build issues before getting the tarball out for general consumption. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:24:44 +01:00
Emil Velikov	bab5ab6940	configure.ac: strip out the llvm-config -march/mtune flags Otherwise drivers such as SWR that depend on providing their own values will fail to build. v2: Add -mcpu for good measure (Chuck) Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-06-13 15:24:44 +01:00
Chuck Atkins	c86fcaca72	swr: Add missing headers for package inclusion CC: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-13 15:24:44 +01:00
Emil Velikov	8229fe68b5	automake: get in-tree `make distclean' working again. With earlier commit we've handled the `make distclean' out of tree build, yet we failed to attribute that for in-tree builds the test condition will return 1. Thus effectively the target will be considered as "failed". Fixes: `b7f7ec7843` ("mesa: automake: distclean git_sha1.h when building OOT") Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Andy Furniss <adf.lists@gmail.com> Reported-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-13 15:24:44 +01:00
Jan Vesely	ace70aedcf	gallivm: Fix trivial sign warnings v2: include whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-06-13 09:23:09 -04:00
Julien Isorce	a04804746f	st/va: use proper temp pipe_video_buffer template Instead of changing the format on the existing template which makes error handling not nice and confuses coverity. CoverityID: 1337953 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-13 09:14:32 +01:00
Julien Isorce	6c43e0016e	st/va: it is valid to release the VABuffer of an exported resource pipe_resource_reference(&res, NULL) will decrement reference counting, i.e. p_atomic_dec(res->count). But the va surface still has the initial reference since it has created the resource. So calling vaDestroyImage on a derived image calls VaDestroyBuffer but the decrementation won't reach 0. It is just wrong for vlVaDestroyBuffer to rely on the export_refcount flag. Finally the vaapi intel driver has the same logic. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-13 09:14:32 +01:00
Timothy Arceri	30df78236c	glsl: fix component overlap validation for doubles This change makes sure to remove arrays when checking if type is a double. The check for the end of the first slot of a multi-slot double is also fixed by bumping the check to 4 rather than 3. Previously we were not reserving the last component. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-12 21:56:32 +10:00
Timothy Arceri	ad3def919e	glsl: fix max varyings count for ARB_enhanced_layouts Since this extension allows more than one varying to share a single location we can't just count the number of slots a varying takes and add it to the total. Instead we now reuse the reserved varyings bitfield to determine how many slots are reserved for explicit locations instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-12 21:56:28 +10:00
Kenneth Graunke	0fb85ac08d	i965: Use the correct number of threads for compute shaders. We were programming the number of threads per subslice, when we should have been programming the total number of threads on the GPU as a whole. Thanks to Curro and Jordan for helping track this down! On Skylake GT3e: - Improves performance in Unreal's Elemental Demo by roughly 1.5-1.7x. - Improves performance in Synmark's Gl43CSDof by roughly 3.7x. - Improves performance in Synmark's Gl43GSCloth by roughly 1.18x. On Broadwell GT2: - Improves performance in Unreal's Elemental Demo by roughly 1.2-1.5x. - Improves performance in Synmark's Gl43CSDof by roughly 2.0x. - Improves performance in Synmark's Gl43GSCloth by 1.47035% +/- 0.255654% (n=25). On Haswell GT3e: - Improves performance in Unreal's Elemental Demo (in GL 4.3 mode) by roughly 1.10x. - Improves performance in Synmark's Gl43CSDof by roughly 1.18x. - Decreases performance in Synmark's Gl43CSCloth by -1.99484% +/- 0.432771% (n=64). On Ivybridge GT2: - Improves performance in Unreal's Elemental Demo (in GL 4.2 mode) by roughly 1.03x. - Improves performance in Synmark's Gl43CSDof by roughly 1.25x. - No change in Synmark's Gl43CSCloth (n=28). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:15 -07:00
Kenneth Graunke	1db37ebecf	i965: Assert that the scratch spaces are in range. I don't know that anything actually guarantees this, but if we exceed the limits, we may end up overflowing and trashing random buffers that happen to be nearby in the VMA space, leading to rendering corruption, hangs, or worse. We should really fix this properly. However, the pitfall has existed for ages, so for now we should at least detect it. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:15 -07:00
Kenneth Graunke	a42a93dc12	i965: Fix CS scratch size calculations on Ivybridge and Baytrail. These are linear, not powers of two, and much more limited. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:14 -07:00
Kenneth Graunke	147a90d82a	i965: Fix Haswell CS per-thread scratch space encoding. Most scratch stages use power of two sizes, in kilobytes, where 0 means 1kB. But compute shaders on Haswell have a minimum of 2kB, and use a representation where 0 = 2kB. This meant that we were effectively telling the hardware to allocate each thread twice as much space as we meant to, while simultaneously not allocating that much space in the buffer, leading to overflows. Note that the existing code is completely wrong for Ivybridge, but that will take additional work to sort out, so I've left it as is for now. A subsequent commit will take care of that. Together with the previous patches, this fixes rendering corruption on Synmark's Gl43CSDof on Haswell. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:40:14 -07:00
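The off-by-one in the Haswell compute encoding described above can be sketched as follows. This is a minimal illustration under the assumption that the field is simply the log2 of the per-thread size relative to the stage's minimum; the function names are hypothetical and are not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a value that is known to be a power of two. */
static uint32_t log2_pow2(uint32_t x)
{
    uint32_t n = 0;
    assert(x != 0 && (x & (x - 1)) == 0);
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Most stages: power-of-two sizes in kilobytes, where 0 means 1kB. */
static uint32_t scratch_field_generic(uint32_t bytes)
{
    assert(bytes >= 1024);
    return log2_pow2(bytes) - 10;
}

/* Haswell compute shaders: minimum of 2kB, where 0 means 2kB. */
static uint32_t scratch_field_hsw_cs(uint32_t bytes)
{
    assert(bytes >= 2048);
    return log2_pow2(bytes) - 11;
}
```

Writing the generic value into the Haswell compute field therefore asks the hardware for twice the per-thread space that the buffer was sized for, which matches the overflow the commit message describes.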
Kenneth Graunke	a7d029d3df	i965: Account for poor address calculations in Haswell CS scratch size. Curro figured this out by investigating the simulator. Apparently there's also a workaround in the Windows driver. I'm not sure it's actually documented anywhere. We were underallocating the scratch buffer by a factor of 128/70. v2: Rename threads_per_subslice to scratch_ids_per_subslice (suggested by Jordan Justen). Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:39:45 -07:00
Kenneth Graunke	2213ffdb4b	i965: Allocate scratch space for the maximum number of compute threads. We were allocating enough space for the number of threads per subslice, when we should have been allocating space for the number of threads in the entire GPU. Even though we currently run with a reduced thread count (due to a bug), we might still overflow the scratch buffer because the address calculation is based on the FFTID, which can depend on exactly which threads, EUs, and threads are executing. We need to allocate enough for every possible thread that could run. Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+. Earlier platforms need additional bug fixes. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:50 -07:00
Kenneth Graunke	9cd8f95809	i965: Set subslice_total on Gen7/7.5 platforms. We'll use this for compute shader thread counts and scratch space calculations shortly. Note that subslices are referred to as "half slices" on Ivybridge. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:47 -07:00
Kenneth Graunke	87d062a940	i965: Fix shared local memory size for Gen9+. Skylake changes the representation of shared local memory size: Size \| 0 kB \| 1 kB \| 2 kB \| 4 kB \| 8 kB \| 16 kB \| 32 kB \| 64 kB \| ------------------------------------------------------------------- Gen7-8 \| 0 \| none \| none \| 1 \| 2 \| 4 \| 8 \| 16 \| ------------------------------------------------------------------- Gen9+ \| 0 \| 1 \| 2 \| 3 \| 4 \| 5 \| 6 \| 7 \| The old formula would substantially underallocate the amount of space. This fixes GPU hangs on Skylake when running with full thread counts. v2: Fix the Vulkan driver too, use a helper function, and fix the table in the comments and commit message. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-12 00:38:26 -07:00
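The two representations in the table above can be sketched as a single function. This is an illustrative version only: `slm_size_field` and `round_up_pow2` are hypothetical names, and rounding small requests up to the platform minimum (4 kB before Gen9, 1 kB on Gen9+) is an assumption about how the "none" columns would be handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round x up to the next power of two. */
static uint32_t round_up_pow2(uint32_t x)
{
    uint32_t p = 1;
    while (p < x)
        p <<= 1;
    return p;
}

/* gen:   hardware generation number (7, 8, 9, ...)
 * bytes: requested shared local memory size, at most 64 kB
 * Returns the encoded size field following the table in the commit. */
static uint32_t slm_size_field(unsigned gen, uint32_t bytes)
{
    uint32_t size, field;

    if (bytes == 0)
        return 0;
    assert(bytes <= 64 * 1024);

    size = round_up_pow2(bytes);
    if (gen >= 9) {
        /* Gen9+: 1 kB -> 1, 2 kB -> 2, 4 kB -> 3, ..., 64 kB -> 7 */
        if (size < 1024)
            size = 1024;
        field = 1;
        while (size > 1024) {
            size >>= 1;
            field++;
        }
        return field;
    }

    /* Gen7-8: size in units of 4 kB, so 4 kB -> 1, ..., 64 kB -> 16 */
    if (size < 4096)
        size = 4096;
    return size / 4096;
}
```

For example, a 64 kB request encodes as 16 on Gen7-8 but as 7 on Gen9+, so reusing the older formula on Skylake would describe a much smaller allocation than intended.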
Ilia Mirkin	3f48548a6f	nv50: reinstate dedicated constbuf push path This was disabled due to occasionally incorrect behavior when trying to upload data. It later became apparent that nvc0 also had a similar but slightly different issue, which was resolved in commit `e50c01d5`. This takes the same logic as nvc0 and applies it to nv50 (which has somewhat different interfaces). Unfortunately I did not note down precisely what was broken with UBOs when removing the support from nv50, but I've tested a bunch of local traces, and none of them appear to regress. This should hopefully improve performance when UBOs are used, but this was not directly verified. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-11 12:18:43 -04:00
Ilia Mirkin	f47845596b	nv50: enable indirect addressing of fragment shader inputs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-11 11:50:42 -04:00
Ilia Mirkin	7d7e015381	mesa: add drawbuffer argument to ClearNamedFramebufferfi This was fixed in revision 47 of the ARB_dsa spec in Oct 22, 2015. Since it's horrible to have differing APIs across library versions, we should attempt to minimize the impact by backporting it as far as possible and hope no one notices. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:32:03 -04:00
Ilia Mirkin	92351a71a8	GL: update glcorearb.h to svn 32433 This brings in the fixed glClearNamedFramebufferfi definition, as well as a lot of GLsizei -> GLsizeiptr changes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:31:53 -04:00
Ilia Mirkin	f81374fd3e	GL: update glext to svn 32957 This brings in defines from GL_EXT_window_rectangles and fixes the glClearNamedFramebufferfi definition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 20:24:53 -04:00
Brian Paul	5cfc91624c	docs: GL_ARB_copy_image done for softpipe, llvmpipe Signed-off-by: Brian Paul <brianp@vmware.com>	2016-06-10 15:50:55 -06:00
Brian Paul	e9b86bb92c	llvmpipe: turn on pipe cap for GL_ARB_copy_image support Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	2db747cf26	llvmpipe: don't use 3-component formats, except 32-bit x 3 formats This basically disallows all 8-bit x 3 and 16-bit x 3 formats for textures and render targets. Some 3-component formats were already disallowed before. This avoids problems with GL_ARB_copy_image. v2: the previous version of this patch disallowed all 3-component formats Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	672e92a146	softpipe: turn on pipe cap for GL_ARB_copy_image support Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	d8fe6332d8	softpipe: don't use 3-component formats Mesa and gallium don't have a complete set of matching 3-component texture formats. For example, 8-bit sRGB unorm. To fully support the GL_ARB_copy_image extension we need to have support for all of these formats: RGB8_UNORM, RGB8_SNORM, RGB8_SRGB, RGB8_UINT, and RGB8_SINT using the same component order. Since we don't have that, disable the 3-component formats for now. v2: Simplify 3-component format check, per Marek. Also check that target != PIPE_BUFFER. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	e295b4e800	st/mesa: tweak surface format mapping table 1. Try to choose R8G8B8A8 unorm/srgb formats before others in an effort to try to match component ordering for UINT/SINT/etc. 2. If we can't get a format such as PIPE_FORMAT_A16_UNORM, try PIPE_FORMAT_R16G16B16A16_UNORM before shallower formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Brian Paul	dd4be2e19a	util: update util_resource_copy_region() for GL_ARB_copy_image This primarily means added support for copying between compressed and uncompressed formats. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-06-10 15:50:04 -06:00
Anuj Phogat	466b320163	gallium: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. ----------- \| \| ------- --- \| \| \| \| \| \| ------- --- Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 14:35:21 -07:00
Anuj Phogat	f8679badd4	mesa: Fix region overlap conditions for rectangles with a shared edge From OpenGL 4.0 spec, section 4.3.2 "Copying Pixels": "The pixels corresponding to these buffers are copied from the source rectangle bounded by the locations (srcX0, srcY0) and (srcX1, srcY1) to the destination rectangle bounded by the locations (dstX0, dstY0) and (dstX1, dstY1). The lower bounds of the rectangle are inclusive, while the upper bounds are exclusive." So, the rectangles sharing just an edge shouldn't overlap. ----------- \| \| ------- --- \| \| \| \| \| \| ------- --- Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 14:35:21 -07:00
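The half-open rule quoted from the spec in the two commits above can be sketched as an overlap test. This is a minimal illustration, not the Mesa or gallium implementation: `rects_overlap`, `min_int` and `max_int` are hypothetical names, and normalising each rectangle with min/max (so that flipped coordinates are accepted) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Lower bounds are inclusive and upper bounds are exclusive, so two
 * rectangles that only share an edge do not overlap. Using "<=" in the
 * rejection tests is what encodes the exclusive upper bound. */
static bool rects_overlap(int srcX0, int srcY0, int srcX1, int srcY1,
                          int dstX0, int dstY0, int dstX1, int dstY1)
{
    /* src entirely to the left of dst, or dst entirely to the left of src */
    if (max_int(srcX0, srcX1) <= min_int(dstX0, dstX1))
        return false;
    if (max_int(dstX0, dstX1) <= min_int(srcX0, srcX1))
        return false;

    /* src entirely below dst, or dst entirely below src */
    if (max_int(srcY0, srcY1) <= min_int(dstY0, dstY1))
        return false;
    if (max_int(dstY0, dstY1) <= min_int(srcY0, srcY1))
        return false;

    return true;
}
```

With strict "<" comparisons instead, a source of (0,0)-(4,4) and a destination of (4,0)-(8,4) would be reported as overlapping even though no pixel is shared.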
Dave Airlie	1584918996	gallivm: more 64-bit integer prep work. This converts one other place to using the new helper. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:30 +10:00
Dave Airlie	f550b6d296	radeonsi: convert to 64-bitness checks instead of doubles. This converts to testing for 64-bit types and renames some things in anticipation of 64-bit integer support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:21 +10:00
Dave Airlie	e5c57824ec	gallivm: make non-float return code bitcast consistent. This just uses the same form across the fetches. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:17 +10:00
Dave Airlie	3b97e50b9a	gallium/gallivm: use 64-bit test instead of doubles. This just makes some generic code that currently emits double suitable for emitting 64-bit values. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:44:13 +10:00
Dave Airlie	213ab8db87	gallium/tgsi: add 64-bitness type check function. Currently this just checks for doubles, but we'll convert users to this to make adding 64-bit integers easier. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-11 06:43:45 +10:00
Jason Ekstrand	8d37556ec9	anv/entrypoints: Rework #if guards This reworks the #if guards a bit. When Emil originally wrote them, he just guarded everything. However, part of what anv_entrypoints_gen.py generates is a hash table for looking up entrypoints based on their name. This table cannot get out of sync between C and python regardless of preprocessor flags. In order to prevent this, this commit makes us use void pointers in the dispatch table for those entrypoints which aren't available. This means that the dispatch table size and entry order is constant and it should never get out-of-sync with the python. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 13:21:07 -07:00
Jason Ekstrand	9ed0d9dd06	anv/entrypoints: Use the function pointer types provided by vulkan.h This is a bit cleaner than generating the types ourselves when making the table. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 13:21:07 -07:00
Nicolai Hähnle	42624ea837	st/mesa: use base level size as "guess" when available When an application specifies mip levels _before_ setting a mipmap texture filter, we will initially guess a single texture level. When the second level image is created, we try to allocate the full texture -- however, we get the base level size guess wrong if that size is odd. This leads to yet another re-allocation of the texture later during st_finalize_texture. Even worse, this re-allocation breaks a (reasonable) assumption made by st_generate_mipmaps, because the re-allocation in the finalization call will again allocate a single-level pipe texture (based on the non-mipmap texture filter!). As a result, mipmap generation fails in interesting ways. All of this can be avoided by just using the fact that we already know the size of the base level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95529 Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-10 20:20:39 +02:00
Jason Ekstrand	a1e69930e4	anv: Remove the PhysicalDeviceLimits FINISHME At this point, the limits are probably more-or-less correct. If there is an invalid limit, that's a bug not a FINISHME. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:45 -07:00
Jason Ekstrand	4f5bbf804b	anv/pipeline_cache: Allow for a zero-sized cache This gets ANV_ENABLE_PIPELINE_CACHE=false working again. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:10 -07:00
Jason Ekstrand	a1a25db699	anv/pipeline: Store the (set, binding, index) triple in the bind map This way the bind map (which we're caching) is mostly independent of the pipeline layout. The only coupling remaining is that we pull the array size of a binding out of the layout. However, that size is also specified in the shader and should always match so it's not really coupled. This fixes rendering issues in Dota 2. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:07 -07:00
Jason Ekstrand	c13c5ac561	anv/descriptor_set: Ensure that bindings are always in increasing order Since applications are allowed to specify some set of bindings which need not be dense they also need not be in order. For most things, this doesn't matter, but it could result in getting the wrong dynamic offsets. This adds a quick-and-dirty sort to ensure that everything is always in increasing order of binding index. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:43:03 -07:00
Jason Ekstrand	e2265926f2	anv/descriptor_set: Add a type field in debug builds This allows for some extra validation and makes it easier to see what's going on when poking around in gdb. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:42:59 -07:00
Jason Ekstrand	cd21015abd	anv/descriptor_set: Set array_size to zero for non-existent descriptors Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 09:42:45 -07:00
Leo Liu	2ad443e4cc	vl/dri3: support receiving new pixmap for front buffer With glx of gstreamer-vaapi, the temporary pixmap for front buffer gets renewed in each frame, so when we receive a new pixmap, we should get a new front buffer for it. This also fixes Totem player playback corruption. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:24 -04:00
Leo Liu	0ef8500aab	vl/dri3: get Makefile properly In the original commit, the macro "if HAVE_DRI3" was in Makefile.sources. This file is shared with SCons, which is not able to parse this macro, so the SCons build failed. Jose quickly proposed two approaches and pushed a quick fix using the second; thanks, Jose, for the solutions and fixes. This patch implements Jose's first approach, which is more proper because the dri3 C file should not be included in the build when DRI3 is not enabled. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-10 11:24:19 -04:00
Jose Fonseca	2b4cee0571	gallivm: Never emit llvm.fmuladd on LLVM 3.3. Besides the old JIT bug, it seems the X86 backend on LLVM 3.3 doesn't handle llvm.fmuladd and instead falls back to a C function, which in turn causes a segfault on Windows. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 16:17:04 +01:00
Jose Fonseca	320d1191c6	gallivm: Use llvm.fmuladd.*. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Jose Fonseca	9e8edfa190	util,gallivm: Explicitly enable/disable fma attribute. As suggested by Roland Scheidegger. Use the same logic as f16c, since fma requires VEX encoding. But disable FMA on LLVM 3.3 without MCJIT. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-10 13:47:35 +01:00
Bas Nieuwenhuizen	54f755fa0f	radeonsi: Reinitialize all descriptors in CE preamble. This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed. To illustrate suppose we have two graphics IB's 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2. The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not loaded into CE RAM. Fix this by always restoring the entire CE RAM. v2: - Just load all descriptor set buffers instead of load and store the entire CE RAM. - Leave the ce_ram_dirty tracking in place for the non-preamble case. v3: - Fixed parameter alignment. - Rebased to master (Nicolai's descriptor series). Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-10 12:18:29 +02:00
Jose Fonseca	f93c22109e	mesa: Wrap extensions.h declarations with extern "C". This should fix the MSVC linker failures that arose with commit `5e2d25894b`. Trivial.	2016-06-10 11:00:42 +01:00
Ilia Mirkin	f48f344700	st/mesa: fix type confusion with reladdrs The reality is that this doesn't matter, because we manually emit the ARL to the sampler reladdr, and those arguments don't get an extra load later, so it's effectively just a boolean. However having the types be wrong is confusing and could trigger very odd bugs should usage change down the line. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-09 21:01:53 -04:00
Dave Airlie	f140ed6d95	glsl/ir: remove TABs in ir_constant_expression.cpp Adding 64-bit integers support was going to make this file worse, just remove the tabs from it now. Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-10 10:30:18 +10:00
Anuj Phogat	73a54e4892	i965/gen9: Don't change halign and valign to fit in fast copy blit An update in graphics specs has deleted the halign and valign fields from XY_FAST_COPY_BLT command. See mesa commit `97f0f91`. Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-06-09 15:50:07 -07:00
Anuj Phogat	46c8967813	mesa: Add a helper function for shared code in get_tex_rgba_{un}compressed Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-09 15:50:07 -07:00
Samuel Pitoiset	5e2d25894b	mesa: Let compute shaders work in compatibility profiles The extension is already advertised in compatibility profile, but the _mesa_has_compute_shaders only returns true in core profile. If we advertise it, we should allow it to work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-06-09 21:03:28 +02:00
Tim Rowley	2c85128e01	swr: implement clipPlanes/clipVertex/clipDistance/cullDistance v2: only load the clip vertex once v3: fix clip enable logic, add cullDistance v4: remove duplicate fields in vs jit key, fix test of clip fixup needed v5: fix clipdistance linkage for slot!=0,4 v6: support clip+cull; passes most piglit clip (failures understood) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-09 13:28:35 -05:00
Daniel Czarnowski	cf804b4455	glx: fix crash with bad fbconfig GLX documentation states: glXCreateNewContext can generate the following errors: (...) GLXBadFBConfig if config is not a valid GLXFBConfig Function checks if the given config is a valid config and sets proper error code. Fixes currently crashing glx-fbconfig-bad Piglit test. v2: coding style cleanups (Emil, Topi) use DefaultScreen macro (Emil) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-06-09 17:55:44 +03:00
Nayan Deshmukh	2d140ae70a	st/vdpau: implement luma keying Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-09 14:23:24 +02:00
Nayan Deshmukh	f24eb5a178	vl: Apply luma key filter before CSC conversion Apply the luma key filter to the YCbCr values during the CSC conversion in video buffer shader. The initial values of max and min luma are set to opposite values to disable the filter initially and will be set when enabling it. Add extra parameters min and max luma for the luma key filter in vl_compositor_set_csc_matrix in va, xvmc. Setting them to opposite value 1.f and 0.f respectively won't affect the CSC conversion v2: -Squash 1,2 and 3 into one patch to avoid breaking build of other components. (Christian) -use ureg_swizzle. (Christian) -change name of the variables. (Christian) v3: -Squash all patches in one to avoid breaking of build. (Emil) -wrap functions properly. (Emil) -use 0.0f and 1.0f instead of 0.f and 1.f respectively. (Emil) v4: -Divide it in two patches one which introduces the functionality and assigns dummy values to the changed functions and second which implements the lumakey filter. (Christian) -use ureg_scalar instead ureg_swizzle. (Christian) Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-09 14:23:07 +02:00
Jason Ekstrand	037ce5d734	i965: Emit surface states for extra planes prior to gen8 When Kristian implemented GL_TEXTURE_EXTERNAL_OES, he hooked it up for gen8 but not for gen7 or earlier. It all works, we just need to emit the states for the extra planes. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-08 21:57:57 -07:00
Marc-André Lureau	dc81b3ad43	virgl: fix checking fences When calling virgl_fence_wait() with timeout=0, virgl_{drm,vtest}_resource_is_busy() is called. However, it returns TRUE for a busy resource, whereas virgl_fence_wait() should return TRUE for a completed (non-busy) resource. This fixes running supertuxkart in a VM (I could not reproduce locally with vtest though there is a similar fix) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 14:07:53 +10:00
Dave Airlie	15896a470b	glsl/types: rename is_dual_slot_double to is_dual_slot_64bit. In the future int64 support will have the same requirements. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 09:17:24 +10:00
Dave Airlie	45c901f7a3	st/glsl_to_tgsi: move to checking 64-bitness instead of double This uses the new types interfaces to check for 64-bit types, as futureproofing against int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:49 +10:00
Dave Airlie	bbbc45b8e1	st/glsl_to_tgsi: use enum glsl_base_type instead of unsigned This is just some better type safety that I noticed while working on 64-bit integer support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:49 +10:00
Dave Airlie	152f5eea62	mesa: use new 64-bit checks instead of explicit double checks. This just moves to the new interfaces in advance of int64. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:47 +10:00
Dave Airlie	2df46519e4	glsl/link_varyings: switch to 64bit check instead of double. This is prep work for int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:43 +10:00
Dave Airlie	35616a9e0e	glsl: use new interfaces for 64-bit checks. This is just prep work for int64 support, changing places where 64-bit matters, not doubles. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:19 +10:00
Dave Airlie	a82b8e8b36	compiler: use 64bit check for sizing instead of double check. This just moves code to the new check in advance of int64 support. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:15 +10:00
Dave Airlie	246518154e	compiler/types: add 64-bitness queries. This adds an inline and type query for if a type is 64-bit. For now this is equivalent to double, but int64 will change this. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-09 07:37:04 +10:00
Adam Jackson	a1c5cd426c	glapi/glx: Add overflow checks to the client-side indirect code Coverity complains that the computed sizes can lead to negative lengths passed to memcpy. If that happens we've been handed invalid arguments anyway, so just bomb out. The funky "0%s" is because the size string for the variable-length part of the request is of the form "+ safe_pad() ...", and a unary + would coerce the result to always be positive, defeating the overflow check. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-08 14:39:46 -04:00
Marek Olšák	26b69ad250	radeonsi: improve the computation and comment of scratch_waves 2% isn't much. If you think the number should be decreased, please speak up. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:28:25 +02:00
Marek Olšák	1d9c1d9386	radeonsi: print the number of spilled VGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:28:25 +02:00
Marek Olšák	2b18d67a1e	gallium/radeon: remove dead code creating LLVMTargetMachine This was for some old unsupported LLVM version. Only si_create_context creates the target machine now. r600g doesn't use this function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:23:42 +02:00
Marek Olšák	a343ab55f7	radeonsi: don't enable scratch just for SGPR spills Diff from shader-db: Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave v2: add "break;" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 19:23:41 +02:00
Marek Olšák	55b097d004	st/mesa: try not to compile compute shader on the first use Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-08 19:23:41 +02:00
Marek Olšák	95288277d5	Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" This reverts commit `ffd54d1936`. No, it doesn't work. The test case is "glxgears -samples 2".	2016-06-08 19:21:55 +02:00
Nicolai Hähnle	bd5c41fe5f	st/mesa: directly compute level=0 texture size in st_finalize_texture The width0/height0/depth0 on stObj may not have been set at this point. Observed in a trace that set up levels 2..9 of a 2d texture, and set the base level to 2, with height 1. This made the guess logic always bail. Originally investigated by Ilia Mirkin, this patch gets rid of the somewhat redundant storage of width0/height0/depth0 and makes sure we always compute pipe texture sizes that are compatible with the base level image of the GL texture. Fixes the gl-1.2-texture-base-level piglit test provided by Brian Paul. v2: - try to re-use an existing pipe texture when possible - handle a corner case where the base level is not level 0 and it is of size 1x1x1 v3: - ptHeight = ptWidth in cube map 1x1 case (suggested by Brian) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-08 19:12:07 +02:00
Timothy Arceri	8c3ecde0e1	glsl: stop allocating memory for SSBOs and builtins This just stops counting and assigning a storage location for these uniforms, the count is only used to create the uniform storage. These uniform types don't use this storage. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-08 13:19:32 +10:00
Ilia Mirkin	6e6fd911da	st/mesa: use buffer usage history to set dirty flags for revalidation We were previously unconditionally doing this for arrays and ubo's, and ignoring texture/storage/atomic buffers. Instead use the usage history to determine which atoms need to be revalidated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-07 22:27:04 -04:00
Gurchetan Singh	d9546b0c5d	i965: Integrate precise trig into configuration infrastructure With this change, to enable precise SIN and COS instructions on Intel hardware, one can put <option name="precise_trig" value="true"/> in the proper drirc file. V2: Make option name more generic Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com>	2016-06-07 15:42:21 -07:00
Marek Olšák	f39439d166	radeonsi: re-enable PBO ReadPixels acceleration disabled by `4f1cccf570` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-08 00:22:45 +02:00
Marek Olšák	7c6e88b643	radeonsi: allow MSAA resolving into a texture that has DCC enabled Since DCC is enabled almost everywhere now, it's important not to disable this fast path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	9a472a3e0b	gallium/radeon: move DCC clearing into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	ffd54d1936	radeonsi: allow direct hw MSAA resolve for scanout surfaces No idea why this was disabled, but it works fine. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	4be46c7d9d	radeonsi: don't allocate DCC for the temporary MSAA resolve surface Allocating it has no effect, but it adds overhead (useless DCC clear). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	c06246501e	radeonsi: don't enable DCC in the sampler if first_level doesn't have it If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	00389100b6	winsys/amdgpu: enable DCC for mipmapped textures Also add dcc_fast_clear_size for clearing only the necessary subset of DCC. For no AA, it's equal to the size of the whole DCC level. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	c65361763c	gallium/radeon: don't disable DCC because of SDMA We want to keep DCC enabled to save bandwidth. It was a bad idea to disable it here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	2fd74a05bb	radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	aa7fe70443	radeonsi: add per-level dcc_enabled flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	60e93ddd06	radeonsi: compute DCC register parameters in si_emit_framebuffer_state This will get more complicated with mipmapped DCC or when DCC is enabled after allocation. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	a01536a29f	gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	d4d733e39d	gallium/radeon: don't allocate DCC for non-renderable texture formats R9G9B9E5 is the only uncompressed one hopefully. This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Nicolai Hähnle	b42bc90b6a	radeonsi: enable WQM in PS prolog when needed WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives. Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Cc: 12.0 <mesa-dev@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 23:46:02 +02:00
Nicolai Hähnle	d3a584defe	tgsi/scan: add uses_derivatives (v2) v2: - TG4 does not calculate derivatives (Ilia) - also handle SAMPLE* instructions (Roland) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-07 23:45:17 +02:00
Nanley Chery	b7a0c0ec7f	docs/devinfo: Expound on helpful extension tips Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Nanley Chery	9e7de50cab	docs/devinfo: Update bullet in stale extension guide Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Nanley Chery	26b0f023d7	docs/devinfo: Add closing paragraph tag Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-07 11:16:23 -07:00
Tim Rowley	87f0a0448f	swr: fix provoking vertex Use rasterizer provoking vertex API. Fix rasterizer provoking vertex for tristrips and quad list/strips. v2: make provoking vertex tables static const Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-06-07 11:47:52 -05:00
Ilia Mirkin	c81b090c92	st/mesa: revalidate image atoms when a texture is updated A texture may be redefined with _NEW_TEXTURE, which might have been bound to a shader image slot. We have to revalidate the image atoms to pick up on the new resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-07 10:18:34 -04:00
Ilia Mirkin	71ad8a173f	gk104/ir: fix conditions for adding a texbar Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account. Based on an earlier patch by Samuel Pitoiset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>	2016-06-07 10:18:13 -04:00
Nicolai Hähnle	8239da28e8	radeonsi: keep track of dirty descriptor sets Reduces CPU load for draw calls that change none or few of the descriptors. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:10 +02:00
Nicolai Hähnle	d152c73712	radeonsi: move si_descriptors into a per-context array Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:07 +02:00
Nicolai Hähnle	a29c4f9ebd	radeonsi: pass shader stage to si_disable_shader_image Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:05 +02:00
Nicolai Hähnle	4e0fb72786	radeonsi: access descriptor sets via local variables This will simplify moving them to a per-context array. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:02 +02:00
Nicolai Hähnle	ba4a2840c7	radeonsi: add si_set_rw_buffer to be used for internal descriptors So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:59 +02:00
Nicolai Hähnle	c615a055f4	radeonsi: pass shader stage to si_set_shader_image Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:57 +02:00
Nicolai Hähnle	e6612a3e68	radeonsi: pass shader stage to si_set_sampler_view Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:55 +02:00
Nicolai Hähnle	c32cd4b78d	radeonsi: move descriptor set begin_new_cs handling into a separate function Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:39 +02:00
Nicolai Hähnle	031b57bc2f	radeonsi: move enabled_mask out of si_descriptors This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:23 +02:00
Jason Ekstrand	d1e141a661	anv/entrypoints: Stop using the C preprocessor Now that we emit guards for everything, we can just generate the files and trust build flags to keep us safe. This should also fix the tarball problems. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Jason Ekstrand	d1a53f91ee	anv/entrypoints: Emit #if guards for all platforms Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Haixia Shi	1ea233c6f3	platform_android: prevent deadlock in droid_swap_buffers To avoid blocking other EGL calls, release the display mutex before we enqueue buffer to android frameworks and re-acquire the mutex upon return. v2: moved lock/unlock inside droid_window_enqueue_buffer(). TEST=verify pinch zoom in Photos app no longer causes hangs Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:25 +01:00
Emil Velikov	b7f7ec7843	mesa: automake: distclean git_sha1.h when building OOT In the case of out-of-tree (OOT) builds, in particular when building from tarball, we'll end up with the file in both srcdir and builddir. We want the former to remain intact (since we need it on rebuild) while the latter should be removed otherwise `make distclean' gets angry at us. Ideally there'll be a solution that feels a bit less of a hack. Until then this does the job exactly as expected. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:30:23 +01:00
Emil Velikov	2c424e00c3	mesa: automake: ensure that git_sha1.h.tmp has the right attributes ... when copied from git_sha1.h. As the latter file can be lacking the write attribute, one should set it explicitly. Otherwise we'll get a warning/failure at cleanup stage. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:21:46 +01:00
Emil Velikov	359d9dfec3	mesa: automake: add directory prefix for git_sha1.h Otherwise the build will assume that we're talking about builddir, which is not the case in the else statement. Here the file is already generated and is part of the tarball. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-07 12:21:45 +01:00
Emil Velikov	1816c837c1	egl: android: don't add the image loader extension for !render_node With earlier commit we introduced support for render_node devices, which was coupled with the use of the image loader extension. As the work was inspired by egl/wayland we (erroneously) added the extension for the !render_node path as well. That works for wayland, as the implementations of the DRI2 and IMAGE loader extensions converge behind the scenes. As that is not yet the case for Android we shouldn't expose the extension. Fixes: `34ddef39ce` ("egl: android: add dma-buf fd support") Cc: <mesa-stable@lists.freedesktop.org> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-07 12:21:45 +01:00
Marek Olšák	095803a37a	gallium/radeon: add support for sharing textures with DCC between processes v2: use a function for calculating WORD1 of bo metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-07 11:12:26 +02:00
Marek Olšák	9e5b5fbde0	gallium/radeon: don't discard DCC if an external user can write to it We don't import textures with DCC now, but soon we will. v2: if we can't disable DCC for image writes, at least decompress DCC at bind time Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-07 11:12:26 +02:00
Dave Airlie	c6b14bafa4	i915: fix typo CAP. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 18:31:14 +10:00
Jakob Sinclair	b450f29073	glsl: initialise pointer to NULL Could cause issues if you tried to read from an uninitialised pointer. This just initialises the pointer to null to avoid that being a problem. Discovered by Coverity. CID: 1343616 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-07 08:13:25 +02:00
Dave Airlie	c295923d13	i965/gen8: fix cull distance emission for tessellation shaders. This fixes some cases of: GL45-CTS.cull_distance.functional on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 11:52:17 +10:00
Ilia Mirkin	704bc0f0e9	nvc0: add support for VOTE tgsi opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	f64c36e2d7	st/mesa: expose GL_ARB_shader_group_vote when supported by backend Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	edfa7a4b25	gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:29 -04:00
Ilia Mirkin	30684b50d7	gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:49:28 -04:00
Ilia Mirkin	5189f0243a	mesa: hook up core bits of GL_ARB_shader_group_vote Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-06-06 20:48:46 -04:00
Kenneth Graunke	13b859de04	glsl: Make opt_copy_propagation_elements actually propagate into loops. We've had a FINISHME here since Eric originally wrote the code in 2011. This patch implements his suggested approach, which makes us actually able to copy propagate into the loops, at the unfortunate cost of making this pass even more expensive. The shader-db statistics are basically a wash: No change in instruction counts. total cycles in shared programs: 78685980 -> 78680730 (-0.01%) cycles in affected programs: 2102646 -> 2097396 (-0.25%) helped: 48 HURT: 83 I figured if we're going to do this for one copy propagation pass, we may as well do it in both. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-06 14:14:31 -07:00
Kenneth Graunke	0756e3a25c	glsl: Make opt_copy_propagation actually propagate into loops. We've had a FINISHME here since Eric originally wrote the code in 2010. This patch implements his suggested approach, which makes us actually able to copy propagate into the loops, at the unfortunate cost of making this pass even more expensive. The shader-db statistics are not terribly impressive: total instructions in shared programs: 9008589 -> 9008613 (0.00%) instructions in affected programs: 4293 -> 4317 (0.56%) helped: 0 HURT: 10 total cycles in shared programs: 78550978 -> 78575760 (0.03%) cycles in affected programs: 655426 -> 680208 (3.78%) helped: 75 HURT: 88 GAINED: 2 Most of the "regressions" appear to be us successfully copy propagating uniforms, which i965 is loading as pull constants instead of push, so we occasionally have two pulls instead of one. That doesn't seem like this pass's job - it's propagating correctly, and we should be smarter about pull loads in the backend. This patch is also useful for a couple of reasons: 1. It can clean up copies created by varying packing (previously, we couldn't if the uses were inside a loop). This fixes a bug when interpolateAt*() is used on a packed varying inside a loop: glsl_to_nir struggles to see through the extra copy and mistakenly believed the variable was not an input. 2. It will help propagate uniform array access created by lower_const_array_to_uniforms(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-06 14:14:31 -07:00
Samuel Pitoiset	08ddfe7b2f	nv50/ir: use round toward 0 when converting doubles to integers Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers. This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 22:56:04 +02:00
Marek Olšák	00e6899ae5	gallium/radeon: don't re-set BO metadata after CMASK deallocation CMASK has no effect on metadata, because it's not sharable. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	589d6b58c3	st/mesa: change SQRT lowering to fix the game Risen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94627 (against nouveau) Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-06 22:50:55 +02:00
Marek Olšák	991cbfcb14	radeonsi: add a performance tweak for 4 SE parts Ported from Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	2802310c25	radeonsi: simplify PRIMGROUP_SIZE computation for tessellation Ported from Vulkan. v2: keep the comment Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	014c8ec770	r600g: use hw MSAA resolve for non-trivial resolves This improves MSAA resolve performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Marek Olšák	6b449783f6	radeonsi: use hw MSAA resolve for non-trivial resolves This improves MSAA resolve performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-06 22:50:55 +02:00
Dave Airlie	07403014c3	mesa/program_resource: return -1 for index if no location. The GL4.5 spec quote seems clear on this: "The value -1 will be returned by either command if an error occurs, if name does not identify an active variable on programInterface, or if name identifies an active variable that does not have a valid location assigned, as described above." This fixes: GL45-CTS.program_interface_query.output-built-in [airlied: use _mesa_program_resource_location_index as suggested by Eduardo] Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-07 06:10:19 +10:00
Nicolai Hähnle	ec2b52e2d9	radeonsi: set descriptor dirty mask on shader buffer unbind Found randomly while skimming the code. This might have caused VM faults in robustness tests. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 21:43:18 +02:00
Nicolai Hähnle	0f916d4ca7	st/mesa: fix resource leak in try_pbo_readpixels Found by inspection after seeing https://bugs.freedesktop.org/show_bug.cgi?id=96343 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 21:42:27 +02:00
Charmaine Lee	627e975896	tgsi: fix mixed data type comparison in tgsi_point_sprite.c Cast the unsigned semantic index to integer datatype before comparing to max_generic, otherwise, max_generic which is initialized to -1 will be converted to unsigned int before the comparison, causing a wrong semantic index to be assigned to a shader output. Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265) Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord. v2: use the original max_generic variable but add the (int) cast to the semantic index, as suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-06 10:20:45 -06:00
Charmaine Lee	304b5a1446	svga: print shader linkage info when tgsi debug bit is on When TGSI debug flag is enabled, print the shader linkage info as well. Tested with mesa demos with SVGA_DEBUG=tgsi Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-06 10:20:45 -06:00
Ilia Mirkin	4f1cccf570	st/mesa: check shader image format support before using PBO download ARB_shader_image_load_store only requires a very fixed list of formats to be supported, while textures may be in all kinds of formats, like BGRA which are presently not supported on at least Kepler. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-06 12:05:59 -04:00
Lars Hamre	4163c71010	tgsi: use truncf in micro_trunc Switches to using truncf in micro_trunc. Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-trunc-float fs-trunc-vec2 fs-trunc-vec3 fs-trunc-vec4 vs-trunc-float vs-trunc-vec2 vs-trunc-vec3 vs-trunc-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-trunc-float gs-trunc-vec2 gs-trunc-vec3 gs-trunc-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-06-06 15:56:28 +02:00
Samuel Iglesias Gonsálvez	2b648ec17c	i965/gs/scalar: Fix load input for doubles Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
Samuel Iglesias Gonsálvez	2d6f82a294	i965/fs: fix offset when loading double vector input varyings When we are not packing a double input varying, we might need to read its data in a non-aligned to 64-bit offset, so we read the wrong data. This is happening when using explicit locations in varyings because Mesa disables packing varying for that case. const_index is in 32-bit size units but offset() is multiplying it by destination type size units. When operating with double input varyings, const_index value could be not aligned to 64 bits. To fix it, we load the double vector as if it was a float based vector with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
Samuel Iglesias Gonsálvez	cb30727648	i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings Data starts at suboffset 3 in 32-bit units (12 bytes), so it is not 64-bit aligned and the current implementation fails to read the data properly. Instead, when there is a double input varying, read it as vector of floats with twice the number of components. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-06 12:37:16 +02:00
Dave Airlie	4c86399378	glsl: geom shader max_vertices layout must match. From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs". "all geometry shader output vertex count declarations in a program must declare the same count." Fixes: GL45-CTS.geometry_shader.output.conflicted_output_vertices_max Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 18:02:19 +10:00
Jason Ekstrand	ffcef720b7	anv/pipeline: Add support for caching the push constant map Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-06-06 00:44:32 -07:00
Dave Airlie	78659ade40	glsl: use enum glsl_interface_packing in more places. (v2) Although the glsl_types.h stores this in a bitfield, we should hide that from everyone else. Hide the cast in an accessor method and use the enum everywhere. This makes things a bit nicer in gdb, and improves type safety. v2: fix a few pieces of interface I missed that caused some piglit regressions. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-06-06 15:58:37 +10:00
Dave Airlie	ff2e569153	i965: don't use NumLayers for 3D textures. For 3D textures we shouldn't be using NumLayers, we need to get it from the depth. This fixes: GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 13:07:07 +10:00
Dave Airlie	1f66a4b689	glsl: for anonymous struct matching use without_array() (v3) With tessellation shaders we can have cases where we have arrays of anon structs, so make sure we match using without_array(). Fixes: GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in v2: test lengths match as well (Ilia) v3: descend array lengths to check for matches as well (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:41 +10:00
Dave Airlie	6702c15810	glsl/ast: don't crash when func_name is NULL This fixes a crash in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types If we can't find the func_name in one of these paths, we have emitted an earlier error so just return here. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:30 +10:00
Dave Airlie	4336196b7f	glsl: handle ast_aggregate in has_sequence_subexpression. (v2) GL43-CTS.compute_shader.work-group-size does uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 }; The initializer triggers the GLSL 4.30/GLES3 tests for constant sequence subexpressions, so it doesn't happen unless you are using those, so just return false as this path is now reachable. v2: update commit msg with diagnosis Acked-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-06 12:54:19 +10:00
Kenneth Graunke	f657a59d98	mesa: Try to unbreak the MSVC build. PATH_MAX is apparently not a thing on Windows. Borrow the hack from pipe_loader.c to try and make this work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-05 16:32:08 -07:00
Kenneth Graunke	c417c0c9c3	mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files. This writes linked shader programs to .shader_test files to $MESA_SHADER_CAPTURE_PATH in the format used by shader-db (http://cgit.freedesktop.org/mesa/shader-db). It supports both GLSL shaders and ARB programs. All stages that are linked together are written in a single .shader_test file. This eliminates the need for shader-db's split-to-files.py, as Mesa produces the desired format directly. It's much more reliable than parsing stdout/stderr, as those may contain extraneous messages, or simply be closed by the application and unavailable. We have many similar features already, but this is a bit different: - MESA_GLSL=dump writes to stdout, not files. - MESA_GLSL=log writes each stage to separate files (rather than all linked shaders in one file), at draw time (not link time), with uniform data and state flag info. - Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH) also uses separate files per shader stage, but allows reading in files to replace an app's shader code. v2: Dump ARB programs too, not just GLSL. v3: Don't dump bogus 0.shader_test file. v4: Add "GL_ARB_separate_shader_objects" to the [require] block. v5: Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block. v6: Don't hardcode /tmp/mesa. v7: Fix memoization of getenv(). v8: Also print "SSO ENABLED" (suggested by Timothy). v9: Also handle ES shaders (suggested by Ilia). v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add _mesa_warning calls on error handling (suggested by Ben). v11: Fix crash when variable is unset introduced in v10. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-06-05 13:48:57 -07:00
Ilia Mirkin	092ec3920f	nv50,nvc0: fix BGR10_A2UI vertex format This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 15:13:46 -04:00
Samuel Pitoiset	be365f34f0	nvc0: do not clear surfaces bins in the validate function We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 19:02:59 +02:00
Samuel Pitoiset	43d3ecfb33	nvc0: re-validate images after launching a grid on Fermi Images invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a compute shader to help in fixing 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later. This fixes the following piglits: - spec/arb_compute_shader/execution/basic-uniform-access - spec/arb_compute_shader/execution/mutiple-texture-reading - spec/arb_compute_shader/execution/multiple-workgroups - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-05 18:48:02 +02:00
Marek Olšák	3b44864ab7	radeonsi: fix images with level > 0 This should fix spec@arb_shader_image_load_store@level. Broken by: Commit: `95c5bbae66` radeonsi: set some image descriptor fields at bind time Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-05 17:00:14 +02:00
Ilia Mirkin	fd6bbc2ee2	nvc0: reduce overhead from always marking images dirty We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	0f673db6f0	nvc0: reduce overhead from always marking buffers dirty We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	e8ee161b16	nvc0: fix memory barrier flag handling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Ilia Mirkin	29abbeecd8	nvc0: mark bound buffer range valid Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-04 23:50:56 -04:00
Dave Airlie	f018456901	anv/entrypoints: don't go using wayland/xcb unless they are configured The fix in: anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards breaks things if wayland headers aren't installed. Separate things out properly to avoid that problem. [airlied: fixed up to put in pre-existing sections]. Reported-by: Arjan van de Ven Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-05 07:03:12 +10:00
Marek Olšák	d5491a81ff	gallium/radeon: don't use the DMA ring for pipelined buffer uploads Submitting a DMA IB flushes the GFX IB and all GPU caches. Vedran Miletić said: "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling from 1200p)." Some anonymous dude said: R9 390 results: Tomb Raider (normal settings): 80 -> 88 FPS Talos Principle (custom settings): 23 -> 56 FPS Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Vedran Miletić <vedran@miletic.net> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	9c35ec2042	r600g: don't flush caches when binding shader resources Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	eff94af794	r600g: only do necessary cache flushes in cp_dma_copy_buffer The main impact is that {upload, draw, upload, draw, ..} doesn't flush framebuffer caches before every upload. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	9e62012c30	r600g: only do necessary cache flushes in cp_dma_clear_buffer The main impact is that fast color clear doesn't flush TC, CONST, DB. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	c92a3ae7e9	r600g: remove a CP DMA workaround that's not needed anymore Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	5ea5ed6050	r600g: fix CP DMA hazard with index buffer fetches (v3) v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel, otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ade16e1f5d	r600g: properly sync CP with CP DMA on R6xx This will allow removing useless cache & IB flushes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	7746903d3a	r600g: write WAIT_UNTIL in the correct place This has been wrong all along. Fixing this will allow removing useless cache flushes. Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ee0c96c11e	gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Marek Olšák	ada3d8f31e	gallium/u_suballoc: allow different alignment for each allocation Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-06-04 15:42:33 +02:00
Jason Ekstrand	441194edd9	anv/blit: Use CLAMP_TO_EDGE for scaled blits When upscaling you can end up interpolating between the edge pixel and one past the edge. Using CLAMP_TO_EDGE seems like the most reasonable thing to do in this case. This fixes two of the new Vulkan CTS tests in dEQP-VK.api.copy_and_blit.blit_image.* Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	9313a56816	anv/copy: Account for the anv_surface.offset when creating a blit2d_surf This was causing problems if the user tried to copy to/from the stencil portion of a combined depth/stencil image. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	526a8de22d	nir/spirv: Make a decoration switch complete Getting rid of the default case makes the compiler warn if we are missing cases. While we're here, we also add the one missing case. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	62c6e94bd6	nir/spirv: Make unhandled decorations and capabilities non-fatal glslang frequently throws bogus decorations into shaders. While we are free to assert-fail, it's a bit nicer to the application to just warn. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	ed14d21d04	nir/spirv: Add a way to print non-fatal warnings Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	2e46a5d155	nir/spirv: Add string lookup tables for a couple of SPIR-V enums Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	5a1e56f344	nir/spirv: Complete the list of capabilities Previously we supported a subset of capabilities and just left a default case for the others. It's time to stop being lazy and actually audit the capabilities. This should bring them up-to-date with reality. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	9fa958e95b	anv/pipeline: Add support for early depth stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	66bd2e1133	mesa: Get rid of _mesa_active_fragment_shader_has_side_effects It is no longer used. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	35bf4d9dc2	i965/ps_state: Use wm_prog_data.has_side_effects Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	3fb289f957	i965/fs Add a wm_prog_data bit for has_side_effects This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	4d3b8318a7	nir/info: Get rid of uses_interp_var_at_offset We were using this briefly in the i965 driver to trigger recompiles but we haven't been using it since we switched to the NIR y-transform lowering pass. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	56a178922f	anv/pipeline: Silently pass tests if depth or stencil is missing Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	bc7f7e1953	anv/pipeline: Unify gen7/8 emit_ds_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	fdc3c5dd05	genxml/gen6,7,75: s/BackFace/Backface This is more consistent with gen8+ Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	1f7b54ed29	nir/spirv: Handle the WorkgroupSize builtin decoration This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests in the latest dev version of the Vulkan CTS. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	b26cdd65e8	nir/spirv: Use breaks instead of returns in constant handling Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	a19ae36ce5	anv/pipeline: Refactor specialization constant handling a bit Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	45542f554c	nir/lower_indirect_derefs: Use the direct array deref for recursion This fixes about 100 of the new Vulkan CTS tests. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Jason Ekstrand	59f06ac389	anv/clear: Handle ClearImage on 3-D images Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-03 19:29:28 -07:00
Francisco Jerez	7244dc1e06	Revert "i965/fs: Allow scalar source regions on SNB math instructions." This reverts commit `c1107cec44`. Apparently the hardware spec text I quoted in the commit message was outright lying about scalar source math being supported on SNB, the hardware seems to load 32 contiguous bits of data for each channel regardless of the regioning mode. Fixes regressions in the following CTS tests (which we didn't catch early due to CTS being temporarily disabled in our CI system): es2-cts.gtf.gl.atan.atan_vec3_frag_xvary es2-cts.gtf.gl.cos.cos_vec2_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvary es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_float_frag_xvary es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_vec3_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346 Reported-by: Mark Janes <mark.a.janes@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-06-03 18:47:29 -07:00
Francisco Jerez	a2135c6fd9	i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N). The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I originally noticed this while working on SIMD32 in the scalar back-end, but the same scenario is likely to be possible in vec4 programs so this commit ports the bugfix with the same name from the scalar back-end to the vec4 cmod propagation pass. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-03 18:38:51 -07:00
Emil Velikov	7a3a0d9212	anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS Otherwise we will fail to find the headers in some scenarios. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-06-04 00:52:00 +01:00
Emil Velikov	a1256c0ea7	nir: automake: add nir_search_helpers.h to the sources list(s) Fixes: `dfbae7d64f` ("nir/algebraic: support for power-of-two optimizations") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-04 00:18:40 +01:00
Rob Clark	1535519e51	freedreno/ir3: do idiv lowering after main opt loop Give algebraic-opt pass a chance to catch udiv by const power-of-two, before running lower-idiv pass. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-03 16:05:03 -04:00
Rob Clark	dfbae7d64f	nir/algebraic: support for power-of-two optimizations Some optimizations, like converting integer multiply/divide into left/ right shifts, have additional constraints on the search expression. Like requiring that a variable is a constant power of two. Support these cases by allowing a fxn name to be appended to the search var expression (ie. "a#32(is_power_of_two)"). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-03 16:05:03 -04:00
Nicolai Hähnle	a64c7cd2ba	radeonsi: mark buffer texture range valid for shader images When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization. This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer. Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-03 14:11:05 +02:00
Marek Olšák	8c361e84ad	Revert "egl: Check if API is supported when using eglBindAPI." This reverts commit `e8b38ca202`. It broke Glamor for Gallium at least.	2016-06-03 11:33:45 +02:00
Alejandro Piñeiro	9bdbb9c0e0	mesa/formatquery: expand NUM_SAMPLE_COUNTS OpenGL ES comment For ES 3.0, the spec states that NUM_SAMPLE_COUNTS will always be zero for some formats, but on ES 3.1 it can be nonzero. The current code correctly checks exactly against version 3.0, but the comment only mentions the 3.0 spec. It is clearer to mention both. v2: better wording on the comment (Ian Romanick) Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-03 07:38:25 +02:00
Dave Airlie	d10ae20b96	mesa/get: return correct value for layer provoking vertex. This fixes: GL45-CTS.geometry_shader.layered_rendering.layered_rendering on Skylake. Reviewed-by: Chris Forbes <chrisforbes@google.com> Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 12:33:34 +10:00
Plamena Manolova	0b67efaed2	egl: Account for default values of texture target and format When validating attributes during surface creation we should account for the default values of texture target and format (EGL_NO_TEXTURE) since the user is not obligated to explicitly set both via the attribute list passed to eglCreatePbufferSurface. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-06-02 16:07:31 -07:00
Samuel Pitoiset	28590eb949	nvc0: mark buffer texture range valid for shader images Loosely based on radeonsi (Thanks to Nicolai). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-06-03 00:12:23 +02:00
Mauro Rossi	278c2212ac	isl: add support for Android libmesa_isl static library isl library is needed to build i965, libmesa_isl static library is added to fix related Android building errors. Any attempt to build libmesa_genxml as phony package module failed to deliver gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9} Due to constraints in Android Build System, libmesa_genxml is built as static, at least one source is needed, so dummy.c is autogenerated for this scope, libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES, to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-02 22:31:44 +01:00
Mauro Rossi	4143245c23	android: libmesa_glsl: add a dependency on libmesa_nir static Fixes the following building error: target C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0, from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28: external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory compilation terminated. build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1 make: *** Waiting for unfinished jobs.... Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-06-02 22:31:00 +01:00
Emil Velikov	af1a0ae8ce	isl: automake: don't include isl_format_layout.c in two lists. Including the file in both ISL_FILES and ISL_GENERATED_FILES makes the actual dependency list less obvious. v2: Drop unrelated vulkan hunk (Jason). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 22:26:04 +01:00
Emil Velikov	af2637aa32	automake: bring back the .PHONY git_sha1.h.tmp rule With earlier commit `3689ef32af` ("automake: rework the git_sha1.h rule, include in tarball") we/I erroneously removed the PHONY rule and the temporary file. The former is used to ensure that the header is regenerated on each make invocation, while the latter helps us avoid the unneeded rebuild(s) when the SHA1 hasn't changed. Reported-by: Grazvydas Ignotas <notasas@gmail.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-06-02 22:23:12 +01:00
Kenneth Graunke	f74a29188c	i965: Add _NEW_POINT to a couple of comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-06-02 14:11:55 -07:00
Charmaine Lee	0cf0d7c02e	svga: allow copy box in svga_transfer_dma_band() Instead of only allowing a rectangle to be copied in svga_transfer_dma_band(), this patch allows it to copy a box, and hence to copy a 3d texture in one transfer. Fixes black screen when running Heaven after commit `fb9fe35`. (Bug 1663282) Tested with Heaven, glretrace, piglit. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 15:03:41 -06:00
Rob Clark	94d8fbd217	freedreno: fix bad bitshift warnings Coverity doesn't realize idx will never be negative. Throw in some assert()s to help it out. (Hopefully assert() isn't getting compiled out for coverity build.. but there seems to be just one way to find out. We might have to change these to assume()) Fixes CID 1362442, 1362443 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 16:29:32 -04:00
Rob Clark	676c77a923	freedreno: assume builtin shaders do compile Maybe we should switch to ureg to build the builtin shaders. But at any rate, if they fail to compile it is because someone messed them up (or changed TGSI syntax?). CID 1362444 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 16:29:32 -04:00
Francisco Jerez	060c8d245d	i965/fs: Reindent emit_zip(). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:24:48 -07:00
Francisco Jerez	7aa76d66a1	i965/fs: Skip SIMD lowering destination zipping if possible. Skipping the temporary allocation and copy instructions is easy (just return dst), but the conditions used to find out whether the copy can be optimized out safely without breaking the program are rather complex: The destination must be exactly one component of at most the execution width of the lowered instruction, and all source regions of the instruction must be either fully disjoint from the destination or be aligned with it group by group. v2: Don't handle partial source-destination overlap for simplicity (Jason). No instruction count regressions with respect to v1 in either shader-db or the few FP64 shader_runner test-cases with partial overlap I've checked manually. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:24:48 -07:00
Anuj Phogat	75da9c9933	blorp: Fix 16x multisample scaled blits Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled (with added 16x sample support) now passes with this patch. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-02 13:21:26 -07:00
Anuj Phogat	59c19b7687	meta: Fix indentation in shader code Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-06-02 13:21:26 -07:00
Dave Airlie	af7bf610cf	mesa/copyimage: report INVALID_VALUE for missing cube face The spec says INVALID_VALUE for exceeding dimensions, which is really what is happening here. This fixes: GL45-CTS.copy_image.non_existent_mipmap Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 06:08:44 +10:00
Dave Airlie	c0856eacf1	mesa/copyimage: fix num samples check to handle renderbuffers. This test was only happening for textures, but there is nothing in the spec to say this, so test it for all cases. This fixes: GL45-CTS.copy_image.invalid_target Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-03 06:08:22 +10:00
Rob Clark	80c2886033	freedreno/a4xx: silence coverity warning CID 1362451 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	9b854ce53c	freedreno/a3xx+a4xx: fix potential null ptr deref Coverity spotted the a3xx case (not sure why not the a4xx). CID 1362452 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	27a97097e1	freedreno/ir3: fix coverity warning CID 1362453 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	374ad2e2bd	freedreno/ir3: use nir_shader_get_entrypoint() helper Should also fix coverity warning: CID 1362454 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	df64cd6814	freedreno/a4xx: fix incorrect enum type a4xx has its own enum, different from a2xx/a3xx. Spotted by coverity: CID 1362458, 1362459 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	1632b0eac0	freedreno: fix coverity negative array index warning Never can happen, since query would not have been created in the first place if pidx(query_type) returned a negative value. Let's let coverity realize this. CID 1362460 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	ba452d43e0	freedreno: fix dereference before null check ptr can actually never be null so just drop the check. CID 1362464 (#1 of 1): Dereference before null check (REVERSE_INULL) check_after_deref: Null-checking ptr suggests that it may be null, but it has already been dereferenced on all paths leading to the check. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	228b2b36f4	gallium/util: remove u_staging Unused, and fixes a couple of coverity warnings: CID `1362171`, `1362170` Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-06-02 15:44:07 -04:00
Rob Clark	18fb922faa	freedreno/a3xx: only update/emit bordercolor state when needed Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Rob Clark	11f0652404	freedreno/a4xx: only update/emit bordercolor state when needed I noticed in stk that it was contributing to a lot of overhead. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-06-02 15:44:07 -04:00
Matt Turner	0d81a684c1	i965: Add missing types to type_sz(). Coverity warns in multiple places about the potential for division by zero, caused by this function's default case. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-06-02 11:34:09 -07:00
Nanley Chery	c06cef7f9b	mesa/extensions: Fix ES1 extension reporting Commit `eda15abd84` unintentionally advertised these extensions in ES1 contexts. Undo this error. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-02 10:46:59 -07:00
Plamena Manolova	e8b38ca202	egl: Check if API is supported when using eglBindAPI. According to the EGL specifications, before binding an API we must check whether it's supported first. If not, eglBindAPI should return EGL_FALSE and generate an EGL_BAD_PARAMETER error. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 07:45:19 -07:00
Eric Engestrom	17f4c723eb	st/osmesa: remove double-write (overwriting) These two lines have been here since the file was created. I'm guessing the second one was just for testing during dev, so it's the one that's going away. CoverityID: 1296205 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-06-02 07:05:05 -06:00
Nayan Deshmukh	6c9a352d79	st/vdpau: check for null pointer in get/put bits. Check for null pointer before accessing arrays in get/put bits native/YCbCr/Indexed in VdpOutputSurface and VdpVideoSurface. Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-06-02 09:28:48 +02:00
Christian König	b3e75c3997	radeon/uvd: fix the H264 level for Tonga v2 We have supported 5.2 for a while now. v2: we even support 5.2 for H264, 5.1 is for HEVC. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-06-02 09:27:57 +02:00
Alejandro Piñeiro	b48c42cd1f	mesa/formatquery: add a comment to clarify INTERNALFORMAT_PREFERRED The comment clarifies that the driver is called only to try to get a preferred internalformat, and that it was already checked if the format is supported or not. Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-02 08:54:17 +02:00
Alejandro Piñeiro	c1ceee6cc9	i965/formatquery: remove INTERNALFORMAT_PREFERRED implementation Right now the implementation only checks if the internalformat is supported or not. But that implementation is wrong, returning unsupported for some internalformats. Additionally, checking if the internalformat is supported or not is already done at mesa/main before calling the driver hook, so this new check is not needed. Acked-by: Eduardo Lima <elima@igalia.com> Acked-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-06-02 08:54:10 +02:00
Alejandro Piñeiro	58617bcebe	i965/eu: use simd8 when exec_size != EXECUTE_16 Among other things, fix a GPU hang when using INTEL_DEBUG=shader_time for any shader. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-06-02 08:08:10 +02:00
Jordan Justen	0a3acff5b5	i965: Remove old CS local ID handling The old method pushed data for each channel's uvec3 data of gl_LocalInvocationID. The new method pushes 1 dword of data that is a 'thread local ID' value. Based on that value, we can generate gl_LocalInvocationIndex and gl_LocalInvocationID with some calculations. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	b1f22c6317	i965: Enable cross-thread constants and compact local IDs for hsw+ The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. One complication is that cross-thread constants are loaded into registers before per-thread constants. Previously, our local IDs were loaded before the uniform data and treated as 'payload' data, even though they were actually pushed into the registers like the other uniform data. Therefore, in this patch we simultaneously enable a newer layout where each thread now uses a single uniform slot for a unique local ID for the thread. This uniform is handled specially to make sure it is added last into the uniform push constant registers. This minimizes our usage of push constant registers, and maximizes our ability to use cross-thread constants for registers. To swap from the old to the new layout, we also need to flip some lowering pass switches to let our driver handle the lowering instead. We also no longer force thread_local_id_index to -1. v4: * Minimize size of patch that switches from the old local ID layout to the new layout (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	3ba9594f32	anv: Support new local ID generation & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	30685392e0	i965: Support new local ID push constant & cross-thread constants The cross thread constant support appears on Haswell. It allows us to upload a set of uniform data for all threads without duplicating it per thread. We also support per-thread data which allows us to store a per-thread ID in one of the uniforms that can be used to calculate the gl_LocalInvocationIndex and gl_LocalInvocationID variables. v4: * Support the old local ID push constant layout as well (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	d437798ace	i965: Add CS push constant info to brw_cs_prog_data We need information about push constants in a few places for the GL driver, and another couple places for the vulkan driver. When we add support for uploading both a common (cross-thread) set of push constants, combined with the previous per-thread push constant data, things are going to get even more complicated. To simplify things, we add push constant info into the cs prog_data struct. The cross-thread constant support is added as of Haswell. To support it we need to make sure all push constants with uniform values are added to earlier registers. The register that varies per thread and holds the thread invocation's unique local ID needs to be added last. For now we add the code that would calculate cross-thread constant information for hsw+, but we force it (cross_thread_supported) off until the other parts of the driver support it. v4: * Support older local ID push constant layout as well. (Jason) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	1b79e7ebbd	i965: Store number of threads in brw_cs_prog_data Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	3ef0957dac	i965: Add nir based intrinsic lowering and thread ID uniform We add a lowering pass for nir intrinsics. This pass can replace nir intrinsics with driver specific nir lower code. We lower the gl_LocalInvocationIndex intrinsic based on a uniform which is loaded with a thread specific ID. We also lower the gl_LocalInvocationID based on gl_LocalInvocationIndex. v2: * Create variable during lowering pass. (Ken) v3: * Don't create a variable, but instead just insert an intrinsic call to load a uniform from the allocated location. (Jason) v4: * Don't run this pass if thread_local_id_index < 0 Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	04fc72501a	i965: Put CS local thread ID uniform in last push register This thread ID uniform will be used to compute the gl_LocalInvocationIndex and gl_LocalInvocationID values. It is important for this uniform to be added in the last push constant register. fs_visitor::assign_constant_locations is updated to make sure this happens. The reason this is important is that the cross-thread push constant registers are loaded first, and the per-thread push constant registers are loaded after that. (Broadwell adds another push constant upload mechanism which reverses this order, but we are ignoring this for now.) v2: * Add variable in intrinsics lowering pass * Make sure the ID is pushed last in assign_constant_locations, and that we save a spot for the ID in the push constants v3: * Simplify code based on Jason's suggestions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	fa279dfbf0	i965: Add uniform for a CS thread local base ID v4: * Force thread_local_id_index to -1 for now, and have fs_visitor::setup_cs_payload look at thread_local_id_index. This enables us to more easily cut over from the old local ID layout to the new layout, as suggested by Jason. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	8f48d23e0f	i965: Add nir channel_num system value v2: * simd16/32 fixes (curro) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	6f316c9d86	nir: Make lowering gl_LocalInvocationIndex optional Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jordan Justen	7b9def3583	glsl: Add glsl LowerCsDerivedVariables option v2: * Move lower flag to context constants. (Ken) Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 19:29:02 -07:00
Jason Ekstrand	1205999c22	i965/fs: Copy the offset when lowering logical pull constant sends This fixes 64 Vulkan CTS tests per gen Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 16:00:44 -07:00
Dave Airlie	8d4f4adfbd	glsl/distance: make sure we use clip dist varying slot for lowered var. When lowering, we always want to use the clip dist varying. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-02 07:09:21 +10:00
Nicolai Hähnle	c7877b9dab	winsys/amdgpu: decay max_ib_size over time So that memory use will eventually decrease again after a temporary peak. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	6aff6377b1	winsys/amdgpu: implement IB chaining on the gfx ring As a consequence, CE IB size never triggers a flush anymore. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	45be461f55	winsys/amdgpu: consolidate IB size management in amdgpu_ib_finalize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	89ba076de4	radeon/winsys: introduce radeon_winsys_cs_chunk We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:20 +02:00
Nicolai Hähnle	a7c26bfc0c	radeonsi/sid: add packet definitions for IB chaining While we're at it, add packet printing in si_debug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	83a01cb498	winsys/amdgpu: start with smaller IBs, growing as necessary This avoids allocating giant IBs from the outset, especially for CE and DMA. Since we now limit max_dw only by the size that the buffer happens to be (which, due to the buffer cache, can be even larger than the rounded-up size we request), the new function amdgpu_ib_max_submit_dwords controls when we submit an IB. With this change, we effectively never flush prematurely due to the CE IB, after an initial warm-up phase. v2: - clean up buffer_size calculation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	f80c6abb9e	winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functions The latter function allows getting the containing amdgpu_cs from any IB (including non-main ones). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	9e5ed559ba	winsys/amdgpu: extract IB big buffer allocation for re-use Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	9db851b5ee	winsys/amdgpu: add IB buffer in amdgpu_get_new_ib Adding the buffer when we start using it for the IB makes the logic for chaining a bit simpler. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	d6211a61b0	gallium/radeon: use cs_check_space throughout Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	46ad3561be	radeon/winsys: add cs_check_space Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	92d5d97b10	winsys/amdgpu: simplify interface of amdgpu_get_new_ib We'll want to have an amdgpu_cs pointer for future changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Nicolai Hähnle	8396ab4241	winsys/amdgpu: add amdgpu_cs_has_user_fence v2: style change Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Kenneth Graunke	25e1b8d366	i965: Fix isoline reads in scalar TES. Isolines aren't reversed. commit `5b2d8c2273` fixed this for the vec4 TES backend, but not the scalar one. Found while debugging GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2016-06-01 13:46:09 -07:00
Nicolai Hähnle	ed0e9862c5	st/mesa: implement PBO downloads for ReadPixels v2: require PIPE_CAP_SAMPLER_VIEW_TARGET; technically only needed for some of the texture targets, but all hardware that has shader images should also have this cap. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:51 +02:00
Nicolai Hähnle	f3b62d4c74	st/mesa: hook up a no-op try_pbo_readpixels For better bisectability given that the order of some of the fallback tests in the blit path are rearranged. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:48 +02:00
Nicolai Hähnle	1cb4be94ae	st/mesa: add layer_offset to PBO fragment shader This will be used to select a slice of a 3D texture. v2: fix a comment (Marek) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:43 +02:00
Nicolai Hähnle	2bf6dfac8a	st/mesa: create PBO download fragment shaders Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:40 +02:00
Nicolai Hähnle	852d3fcd3b	st/mesa: add PBO download enable bit and fragment shaders For downloads, the fragment shader must know the source texture target, hence we may cache multiple fragment shaders. v2: break long line (Marek) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:34 +02:00
Nicolai Hähnle	581c001532	st/mesa: move shareable parts of PBO upload state and draw to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:31 +02:00
Nicolai Hähnle	e16800226e	st/mesa: move PBO buffer address calculation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:28 +02:00
Nicolai Hähnle	21e069f7d4	st/mesa: move PBO upload fs creation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:26 +02:00
Nicolai Hähnle	979688a027	st/mesa: rename pbo_upload to pbo At the same time, rename members that are upload-specific to say so. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:23 +02:00
Nicolai Hähnle	be82065fbe	st/mesa: move PBO vertex and geometry shader creation to st_pbo.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:20 +02:00
Nicolai Hähnle	4ecc32b0e1	st/mesa: begin moving PBO functions into their own file Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:18 +02:00
Nicolai Hähnle	d9893feb2c	gallium/cso: allow saving the first fragment shader image slot Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:15 +02:00
Nicolai Hähnle	fc0352ff9c	gallium/u_inlines: allow NULL src in util_copy_image_view Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:37:12 +02:00
Nicolai Hähnle	57f576f1fb	gallium: add PIPE_BARRIER_ALL define Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:36:48 +02:00
Ian Romanick	a428c955ce	glsl: Use Geom.VerticesOut == -1 to specify unset Because apparently layout(max_vertices=0) is a thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:39 -07:00
Ian Romanick	b27dfa5403	i965: If control_data_header_size_bits is zero, don't do EndPrimitive This can occur when max_vertices=0 is explicitly specified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:39 -07:00
Ian Romanick	049bb94d2e	mesa: Fix bogus strncmp The string "[0]\0" is the same as "[0]" as far as the C string datatype is concerned. That string has length 3. strncmp(s, length_3_string, 4) is the same as strcmp(s, length_3_string), so make it be strcmp. v2: Not the same as strncmp(..., 3). Noticed by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 11:11:25 -07:00
Marek Olšák	12740efd29	radeonsi: set correct stencil tile mode for texturing Sadly, this doesn't affect SI and VI in any way. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	ea68215c54	winsys/amdgpu: set flags correctly when allocating depth-stencil buffers This mimics Vulkan. It also documents how to fix stencil texturing. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	532a5af47f	gallium/radeon: lower memory usage during texture transfers This improves throughput by keeping TTM overhead down. Some piglit tests such as texelFetch and streaming-texture-leak will use less memory now. v2: use gart_size / 4 as the threshold Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-01 17:35:30 +02:00
Marek Olšák	614e3c6272	gallium/radeon: invalidate busy linear textures for whole-texture uploads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	fc1479a954	gallium/radeon: degrade tiled textures mapped often to linear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	9927c8138a	gallium/radeon: clean up and better comment use_staging_texture Next commits will add other things around this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	b033584299	radeonsi: set some colorbuffer register fields at emit time to allow reallocating the texture storage with different parameters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	30b2b860b0	radeonsi: implement global resetting of texture descriptors it will be used by texture reallocation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	28de7aec0c	radeonsi: move code for setting one shader image into separate function v2: fix set_shader_images(..., NULL). Found by Christoph Haag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	95c5bbae66	radeonsi: set some image descriptor fields at bind time mainly the fields that can change by reallocating a texture and changing the tile mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	ef765d0789	gallium/radeon: strenghten some checking for DMA preparation Just for consistency. This doesn't fix anything, because DCC is not supported with non-mipmapped textures. v1.1: fix the comment about DCC Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Marek Olšák	9d881cc0ac	gallium/util: add util_texrange_covers_whole_level from radeon Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-01 17:35:30 +02:00
Ilia Mirkin	ca135a2612	nir: allow sat on all float destination types With the introduction of fp64 and fp16 to nir, there are now a bunch of float types running around. A F1 2015 shader ends up with an i2f.sat operation, which has a nir_type_float32 destination. Allow sat on all the float destination types. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-06-01 10:44:40 -04:00
Alex Deucher	bd85e4a041	radeonsi: fix the raster config setup for 1 RB iceland chips I didn't realize there were 1 and 2 RB variants when this code was originally added. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>	2016-06-01 09:59:57 -04:00
Dave Airlie	6400144041	mesa/sampler: fix error codes for sampler parameters. The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it, however version 8 of it fixed this, and the GL specs also have the fixed value in them. Fixes: GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 17:01:19 +10:00
Dave Airlie	0ebf4257a3	glsl: define some GLES3 constants in GLSL 4.1 The GLSL 4.1 spec adds: gl_MaxVertexUniformVectors gl_MaxFragmentUniformVectors gl_MaxVaryingVectors This fixes: GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 17:01:13 +10:00
Topi Pohjolainen	6ca118d2f4	i965: Add norbc debug option This INTEL_DEBUG option disables lossless compression (also known as render buffer compression). v2: (Matt) Use likely(!lossless_compression_disabled) instead of !likely(lossless_compression_disabled) (Grazvydas) Update docs/envvars.html Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-01 09:16:36 +03:00
Topi Pohjolainen	30e9e6bd07	i965/gen9: Configure rbc buffers as plain for non-rbc tex views Fixes rendering in Shadow of Mordor with rbc. Application writes RGBA_UNORM texture filling it with values the application wants to later on treat as SRGB_ALPHA. Intel driver enables lossless compression for the buffer by the time of writing. However, the driver fails to make sure the buffer can be sampled as something else later on and unfortunately there is restriction in the hardware for using lossless compression for srgb formats which looks to extend itself to the sampling engine also. Requesting srgb to linear conversion on top of compressed buffer results the color values to be pretty much garbage. Fortunately none of tracked benchmarks showed a regression with this. v2 (Matt): Add missing space Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-06-01 09:16:36 +03:00
Kenneth Graunke	a3dc99f3d4	i965: Fix the passthrough TCS for isolines. We weren't setting up several of the uniform values for the patch header, so we'd crash when uploading push constants. We at least need to initialize them to zero. We also had the isoline parameters reversed, so it would also render incorrectly (if it didn't crash). Fixes a new Piglit test (*) (isoline-no-tcs), as well as crashes in GL44-CTS.tessellation_shader.single.max_patch_vertices. (*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-31 23:09:13 -07:00
Dave Airlie	ebb81cd683	i965/xfb: skip components in correct buffer. The driver was adding the skip components but always for buffer 0. This fixes: GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 15:53:00 +10:00
Dave Airlie	1fe7bbb911	glsl/linker: fix multiple streams transform feedback. `e2791b38b4` mesa/program_interface_query: fix transform feedback varyings. caused a regression in GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams on radeonsi. The problem was it was using the skip components varying to set the stream id, when it should wait until a varying was written, this just adds the varying checks in the right place. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 13:30:41 +10:00
Dave Airlie	e891f7cf55	mesa/bufferobj: use mapping range in BufferSubData. According to GL4.5 spec: An INVALID_OPERATION error is generated if any part of the specified buffer range is mapped with MapBufferRange or MapBuffer (see section 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the MapBufferRange access flags. So we should use the if range is mapped path. This fixes: GL45-CTS.buffer_storage.map_persistent_buffer_sub_data Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-06-01 13:30:40 +10:00
Ilia Mirkin	18d11c9989	nv50/ir: fix error finding free element in bitset in some situations This really only hits for bitsets with a size of a multiple of 32. We can end up with pos = -1 as a result of the ffs, which we in turn decide is a valid position (since we fall through the loop and i == 1, we end up adding 32 to it, so end up returning 31 again). Up until recently this was largely unreachable, as the register file sizes were all 63 or 255. However with the advent of compute shaders which can restrict the number of registers, this can now happen. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-05-31 23:25:51 -04:00
Ilia Mirkin	d873608bcf	nv50/ir: print relevant file's bitset when showing RA info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-31 23:25:50 -04:00
Timothy Arceri	98d40b4d11	Revert "glsl: fix xfb_offset unsized array validation" This reverts commit `aac90ba292`. The commit caused a regression in: piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom Also the CTS test it was meant to fix seems like it may be bogus. Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-06-01 10:33:57 +10:00
Francisco Jerez	c1107cec44	i965/fs: Allow scalar source regions on SNB math instructions. I haven't found any evidence that this isn't supported by the hardware, in fact according to the SNB hardware spec: "The supported regioning modes for math instructions are align16, align1 with the following restrictions: - Scalar source is supported. [...] - Source and destination offset must be the same, except the case of scalar source." Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-31 15:57:41 -07:00
Francisco Jerez	06d8765bc0	i965/fs: Fix constant combining for instructions that cannot accept source mods. This is the case for SNB math instructions so we need to be careful and insert the literal value of the immediate into the table (rather than its absolute value) if the instruction is unable to invert the sign of the constant on the fly. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	303ec22ed6	i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	4fe4f6e8a7	i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes. Which requires using a bitset instead of a boolean flag to keep track of the GRFs we've seen a generating instruction for already. The search loop continues until all instructions initializing the value of the source VGRF have been found, or it is determined that coalescing is not possible. Fixes a few piglit test cases on Gen4-6 which were regressed by `6956015aa5` due to the different (yet perfectly valid) ordering in which copy instructions are emitted now by the simd lowering pass, which had the side effect of causing this optimization pass to start corrupting the program in cases where a VGRF-to-MRF copy instruction would be eliminated but only the last instruction writing to the source VGRF region would be rewritten to point to the target MRF. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:41 -07:00
Francisco Jerez	1898673f58	i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation. This will be required to correctly transform the destination of 8-wide instructions that write a single GRF of a VGRF to MRF copy marked COMPR4. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	485fbaff03	i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops. This will allow compute_to_mrf to handle cases where the source of the VGRF-to-MRF copy is initialized by more than one instruction. In such cases we cannot rewrite the destination of any of the generating instructions until it's known whether the whole VGRF source region can be coalesced into the destination MRF, which will imply continuing the search until all generating instructions have been found or it has been determined that the VGRF and MRF registers cannot be coalesced. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	4b0ec9f475	i965/fs: Fix compute-to-mrf VGRF region coverage condition. Compute-to-mrf was checking whether the destination of scan_inst is more than one component (making assumptions about the instruction data type) in order to find out whether the result is being fully copied into the MRF destination, which is rather inaccurate in cases where a single-component instruction is only partially contained in the source region, or when the execution size of the copy and scan_inst instructions differ. Instead check whether the destination region of the instruction is really contained within the bounds of the source region of the copy. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:57:40 -07:00
Francisco Jerez	bb61e24787	i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap(). Compute-to-mrf was being rather heavy-handed about checking whether instruction source or destination regions interfere with the copy instruction, which could conceivably lead to program miscompilation. Fix it by using regions_overlap() instead of the open-coded and dubiously correct overlap checks. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:56:54 -07:00
Francisco Jerez	88f380a2dd	i965/fs: Teach regions_overlap() about COMPR4 MRF regions. Cc: "12.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-31 15:22:04 -07:00
Dylan Baker	604010a7ed	Don't use python 3 Now there are not files that require python 3, so for now just remove the python 3 dependency and use python 2. I think the right plan is to just get all of the python ready for python 3, and then use whatever python is available. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	ab31817fed	genxml: change chbang to python 2 Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	12c1a01c72	genxml: use the isalpha method rather than str.isalpha. This fixes gen_pack_header to work on python 2, where name[0] is unicode not str. Signed-off-by: Dylan Bake <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	a45a25418b	genxml: require future imports for python2 compatibility. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	e5681e4d70	genxml: mark re strings as raw This is a correctness issue. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	de2e9da2e9	genxml: Make classes descendants of object This is the default in python3, but in python2 you get old style classes. No one likes old-style classes. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Dylan Baker	9f50e3572c	genxml: mark gen_pack_header.py as encoded in utf-8 There is unicode in this file, and I'm actually surprised that the python interpreter hasn't gotten grumpy. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> cc: 12.0 <mesa-stable@lists.freedesktop.org>	2016-05-31 15:09:06 -07:00
Bas Nieuwenhuizen	35818129a6	radeonsi: Decompress DCC textures in a render feedback loop. By using a counter to quickly reject textures that are not bound to a framebuffer, the performance impact when binding sampler_views/images is not too large. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 21:43:04 +02:00
Bas Nieuwenhuizen	cbe3421f05	radeonsi: Add counter to check if a texture is bound to a framebuffer. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 21:43:00 +02:00
Rhys Kidd	8cb74dd4e6	vc4: Fix compiler warnings in fail_instr path of QIR validate pass Introduced in `8e2d0843c0`. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-31 10:56:02 -07:00
Emil Velikov	b8e1f59d62	anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards The generated sources should follow the example set by the vulkan headers and our non-generated code. Namely: the code for all supported platforms should be available, each one guarded by its respective VK_USE_PLATFORM_*_KHR macro. v2: Reword commit message. Cc: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)	2016-05-31 18:41:28 +01:00
Brian Paul	6bea33008e	svga: change enum pipe_resource_usage back to unsigned This parameter is actually a bitmask of PIPE_TRANSFER_x flags. Change it back to a simple unsigned type. IIRC, some compilers complain about masks of enum values. Also, this make the function signature match u_resource_vtbl::transfer_map() again. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-31 10:20:36 -06:00
Marek Olšák	7ca55d2da8	radeonsi: fix CP DMA hazard with index buffer fetches Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-31 16:59:32 +02:00
Marek Olšák	d427110882	r600g: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. u_blitter does the right thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:55 +02:00
Marek Olšák	d5882bb0df	radeonsi: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. u_blitter does the right thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:54 +02:00
Marek Olšák	921ab0028e	gallium/u_blitter: do GL-compliant integer resolves The GL spec has been clarified and the new rule says we should just copy 1 sample. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:48:53 +02:00
Marek Olšák	8a10192b4b	mesa: fix crash in driver_RenderTexture_is_safe This just fixed the crash with the apitrace in bug report. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95246 Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-31 16:43:34 +02:00
Marek Olšák	fc4896e686	radeonsi: don't flush TC at the end of IBs on DRM >= 3.2.0 It's not needed since it was fixed in the kernel. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-31 16:41:22 +02:00
Jakob Sinclair	877c00c653	gallium/radeon: fixed division by zero Coverity is getting a false positive that a division by zero can occur here. This change will silence the Coverity warnings as a division by zero cannot occur in this case. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 12:51:20 +02:00
Eric Engestrom	35fd5282ea	st/glsl_to_tgsi: prevent infinite loop `unsigned j` would never fail `j >= 0`, leading to an infinite loop as `j--` wraps around. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-31 11:46:30 +02:00
Dave Airlie	f87352d769	glsl/images: bounds check image unit assignment The CTS test: GL45-CTS.multi_bind.dispatch_bind_image_textures binds 192 image uniforms, we reject this later, but not until after we trash the contents of the struct gl_shader. Error now reads: Too many compute shader image uniforms (192 > 16) instead of Too many compute shader image uniforms (2745344416 > 16) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "12.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-31 10:41:44 +10:00
Ilia Mirkin	4b1a167a2b	nvc0/ir: fix spilling predicates to registers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2 12.0" <mesa-stable@lists.freedesktop.org>	2016-05-30 18:15:14 -04:00
Ilia Mirkin	1f895caba0	nvc0/ir: limit max number of regs based on availability in SM This effectively limits registers to 32 and 64 for fermi and kepler when 1024 threads are used, but allows the full amount to be used with smaller thread sizes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 18:15:10 -04:00
Ilia Mirkin	27a51ff9b4	nv50/ir: record number of threads in a compute shader Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 18:14:55 -04:00
Pierre Moreau	ae70879530	nv50/ir: Add missing handling of U64/S64 in inlines Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 16:12:12 -04:00
Emil Velikov	9074470d7b	docs: rename release notes to 12.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `7ad2cb6f08`)	2016-05-30 20:33:30 +01:00
Ilia Mirkin	68d135011b	docs: move nvc0 out of individual lines of GL 4.2, 4.3, ES 3.1 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 15:18:32 -04:00
Emil Velikov	888cf6eea2	docs: add 12.1.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 20:03:19 +01:00
Marek Olšák	4291229488	docs/GL3: mark radeonsi as all done up to GL 4.3 and GLES 3.1	2016-05-30 20:48:51 +02:00
Emil Velikov	922b471777	nir: add the SConscript.nir to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 19:19:01 +01:00
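
Commit 049bb94d2e in the list above argues that `strncmp(s, "[0]", 4)` behaves like `strcmp(s, "[0]")` and, per its v2 note, not like `strncmp(s, "[0]", 3)`. The following is a minimal sketch of that distinction, not the code from the commit; the helper names are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: exact match against "[0]". */
static bool is_zero_subscript(const char *s)
{
   return strcmp(s, "[0]") == 0;
}

/* Hypothetical helper: with n = 4 the comparison also covers the
 * terminating NUL of "[0]", so this is an exact match as well. */
static bool is_zero_subscript_n4(const char *s)
{
   return strncmp(s, "[0]", 4) == 0;
}

/* Hypothetical helper: with n = 3 only the first three characters are
 * compared, so this is a prefix match and accepts "[0].x" too. */
static bool starts_with_zero_subscript(const char *s)
{
   return strncmp(s, "[0]", 3) == 0;
}
```

For an input such as `"[0].x"`, the first two helpers reject the string while the third accepts it, which is why replacing the length-4 call with `strcmp` preserves behavior and replacing it with a length-3 call would not.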

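
Commit 35fd5282ea in the list above describes a loop whose `unsigned j` can never fail `j >= 0`, so `j--` wraps around instead of terminating. One common way to count down safely with an unsigned index is sketched below. This is an assumption about how such a loop can be written, not the fix applied in that commit, and `sum_reverse` is a hypothetical function.

```c
/* Hypothetical example: visit indices n-1 .. 0 with an unsigned index.
 * The test "j-- > 0" checks the old value before decrementing, so the
 * loop exits when j reaches 0 and also handles n == 0. */
static unsigned sum_reverse(const unsigned *v, unsigned n)
{
   unsigned total = 0;

   for (unsigned j = n; j-- > 0; )
      total += v[j];

   return total;
}
```

Writing the condition as `j >= 0` instead would always be true for an unsigned `j`, which is the failure mode the commit message describes.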
1833 changed files with 148720 additions and 65542 deletions

									
										34

.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,34 @@

				# To use this config on you editor, follow the instructions at:

				# http://editorconfig.org

				root = true

				[*]

				charset = utf-8

				insert_final_newline = true

				[*.{c,h,cpp,hpp,cc,hh}]

				indent_style = space

				indent_size = 3

				[{Makefile*,*.mk}]

				indent_style = tab

				[{*.py,SCons*}]

				indent_style = space

				indent_size = 4

				[*.pl]

				indent_style = space

				indent_size = 4

				[*.m4]

				indent_style = space

				indent_size = 2

				[*.yml]

				indent_style = space

				indent_size = 2

				[*.patch]

				trim_trailing_whitespace = false

1

.gitignore vendored


@@ -49,3 +49,4 @@ Makefile.in
 .install-mesa-links
 .install-gallium-links
 /src/git_sha1.h
 TAGS

12

.mailmap


@@ -88,9 +88,11 @@ Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mai
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
 Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
 Chad Versace <chadversary@chromium.org> <chad@kiwitree.net>
 Chad Versace <chadversary@chromium.org> <chad@chad-versace.us>
 Chad Versace <chadversary@chromium.org> <Chad Versace chad@chad-versace.us>
 Chad Versace <chadversary@chromium.org> <chad.versace@intel.com>
 Chad Versace <chadversary@chromium.org> <chad.versace@linux.intel.com>
 Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
 Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
@@ -138,6 +140,8 @@ Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.co
 Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
 Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
@@ -274,7 +278,7 @@ Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
 Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
 Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
 Marek Olšák <maraeo@gmail.com> <marek.olsak@amd.com>
 Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
 Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>

									
										29

.travis.yml
									
												
				@@ -1,6 +1,7 @@

				language: c

				sudo: false

				sudo: true

				dist: trusty

				cache:

				  directories:

				@@ -10,12 +11,15 @@ addons:

				  apt:

				    packages:

				      - libdrm-dev

				      - libudev-dev

				      - x11proto-xf86vidmode-dev

				      - libexpat1-dev

				      - libxcb-dri2-0-dev

				      - libx11-xcb-dev

				      - llvm-3.4-dev

				      - llvm-3.5-dev

				      # llvm-config is not in the dev package?

				      - llvm-3.5

				      # LLVM packaging is broken and misses this dep.

				      - libedit-dev

				      - scons

				env:

				@@ -41,6 +45,16 @@ install:

				  - export PATH="/usr/lib/ccache:$PATH"

				  - pip install --user mako

				  # Since libdrm gets updated in configure.ac regularly, try to pick up the

				  # latest version from there.

				  - for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do

				      old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;

				      new_ver=`echo $line | sed 's/.*REQUIRED=//'`;

				      if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then

				        export LIBDRM_VERSION="libdrm-$new_ver";

				      fi;

				    done

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				@@ -78,22 +92,19 @@ install:

				  - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				  - tar -jxvf $LIBDRM_VERSION.tar.bz2

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 && make install)

				  - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				  - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				  - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				# Disabled LLVM (and therefore r300 and r600) because the build fails

				# with "undefined reference to `clock_gettime'" and "undefined

				# reference to `setupterm'" in llvmpipe.

				script:

				  - if test "x$BUILD" = xmake; then

				      ./autogen.sh --enable-debug

				        --disable-gallium-llvm

				        --with-egl-platforms=x11,drm

				        --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau

				        --with-gallium-drivers=svga,swrast,vc4,virgl

				        --with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600

				        --disable-llvm-shared-libs

				        ;

				      make && make check;

				    elif test x$BUILD = xscons; then

									
										10

Android.common.mk
									
												
				@@ -34,6 +34,10 @@ MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				LOCAL_CFLAGS += \

					-Wno-unused-parameter \

					-Wno-date-time \

					-Wno-pointer-arith \

					-Wno-missing-field-initializers \

					-Wno-initializer-overrides \

					-Wno-mismatched-tags \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

				@@ -78,6 +82,12 @@ LOCAL_CFLAGS += \

					-D__STDC_LIMIT_MACROS

				endif

				# add libdrm if there are hardware drivers

				ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)

				LOCAL_CFLAGS += -DHAVE_LIBDRM

				LOCAL_SHARED_LIBRARIES += libdrm

				endif

				LOCAL_CPPFLAGS += \

					$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \

					-Wno-error=non-virtual-dtor \

									
										4

Android.mk
									
												
				@@ -95,8 +95,8 @@ SUBDIRS := \

					src/mesa \

					src/util \

					src/egl \

					src/intel/genxml \

					src/intel/isl \

					src/amd \

					src/intel \

					src/mesa/drivers/dri

				INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

									
										2

Makefile.am
									
												
				@@ -44,7 +44,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--with-egl-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr \

					--with-vulkan-drivers=intel

					--with-vulkan-drivers=intel,radeon

				ACLOCAL_AMFLAGS = -I m4

4

REVIEWERS


@@ -104,3 +104,7 @@ F: src/egl/drivers/dri2/platform_wayland.c
 FREEDRENO
 R:	Rob Clark <robclark@freedesktop.org>
 F:	src/gallium/drivers/freedreno/
 GLX
 R: Adam Jackson <ajax@redhat.com>
 F: src/glx/

2

VERSION


@@ -1 +1 @@
 .0.1
 .1.0-devel

									
										6

appveyor.yml
									
												
				@@ -37,6 +37,8 @@ cache:

				- win_flex_bison-2.4.5.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				os: Visual Studio 2013

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				@@ -47,11 +49,13 @@ install:

				- python -m pip --version

				# Install Mako

				- python -m pip install --egg Mako

				# Install pywin32 extensions, needed by SCons

				- python -m pip install pypiwin32

				# Install SCons

				- python -m pip install --egg scons==2.4.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

2

bin/.cherry-ignore


@@ -1,2 +0,0 @@
 # The offending commit that this patch (part) reverts isn't in 12.0
 be32a2132785fbc119f17e62070e007ee7d17af7 i965/compiler: Bring back the INTEL_PRECISE_TRIG environment variable

									
										3

bin/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,3 @@

				[*.sh]

				indent_style = space

				indent_size = 2

									
										2

common.py
									
												
				@@ -86,7 +86,7 @@ def AddOptions(opts):

				        from SCons.Options.EnumOption import EnumOption

				    opts.Add(EnumOption('build', 'build type', 'debug',

				                        allowed_values=('debug', 'checked', 'profile',

				                                        'release')))

				                                        'release', 'opt')))

				    opts.Add(BoolOption('verbose', 'verbose output', 'no'))

				    opts.Add(EnumOption('machine', 'use machine-specific assembly code',

				                        default_machine,

368

configure.ac


@@ -74,11 +74,11 @@ LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.67
 LIBDRM_FREEDRENO_REQUIRED=2.4.68
 LIBDRM_VC4_REQUIRED=2.4.69
 DRI2PROTO_REQUIRED=2.6
 DRI3PROTO_REQUIRED=1.0
 PRESENTPROTO_REQUIRED=1.0
 LIBUDEV_REQUIRED=151
 GLPROTO_REQUIRED=1.4.14
 LIBOMXIL_BELLAGIO_REQUIRED=0.0
 LIBVA_REQUIRED=0.38.0
@@ -89,7 +89,8 @@ XCBDRI2_REQUIRED=1.8
 XCBGLX_REQUIRED=1.8.1
 XSHMFENCE_REQUIRED=1.1
 XVMC_REQUIRED=1.0.6
 PYTHON_MAKO_REQUIRED=0.3.4
 PYTHON_MAKO_REQUIRED=0.8.0
 LIBSENSORS_REQUIRED=4.0.0
 dnl Check for progs
 AC_PROG_CPP
@@ -108,6 +109,7 @@ LT_PREREQ([2.2])
 LT_INIT([disable-static])
 AC_CHECK_PROG(RM, rm, [rm -f])
 AC_CHECK_PROG(XXD, xxd, [xxd])
 AX_PROG_BISON([],
               AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
@@ -225,6 +227,7 @@ AX_GCC_FUNC_ATTRIBUTE([packed])
 AX_GCC_FUNC_ATTRIBUTE([pure])
 AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
 AX_GCC_FUNC_ATTRIBUTE([unused])
 AX_GCC_FUNC_ATTRIBUTE([visibility])
 AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
 AX_GCC_FUNC_ATTRIBUTE([weak])
@@ -254,15 +257,12 @@ case "$host_os" in
 *-android)
     android=yes
     ;;
 linux*|*-gnu*|gnu*)
 linux*|*-gnu*|gnu*|cygwin*)
     DEFINES="$DEFINES -D_GNU_SOURCE"
     ;;
 solaris*)
     DEFINES="$DEFINES -DSVR4"
     ;;
 cygwin*)
     DEFINES="$DEFINES -D_XOPEN_SOURCE=700"
     ;;
 esac
 AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
@@ -301,16 +301,9 @@ if test "x$GCC" = xyes; then
     # Restore CFLAGS; VISIBILITY_CFLAGS are added to it where needed.
     CFLAGS=$save_CFLAGS
     # Work around aliasing bugs - developers should comment this out
     CFLAGS="$CFLAGS -fno-strict-aliasing"
     # We don't want floating-point math functions to set errno or trap
     CFLAGS="$CFLAGS -fno-math-errno -fno-trapping-math"
     # gcc's builtin memcmp is slower than glibc's
     # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
     CFLAGS="$CFLAGS -fno-builtin-memcmp"
     # Flags to help ensure that certain portions of the code -- and only those
     # portions -- can be built with MSVC:
     # - src/util, src/gallium/auxiliary, rc/gallium/drivers/llvmpipe, and
@@ -347,12 +340,8 @@ if test "x$GXX" = xyes; then
     # Restore CXXFLAGS; VISIBILITY_CXXFLAGS are added to it where needed.
     CXXFLAGS=$save_CXXFLAGS
     # Work around aliasing bugs - developers should comment this out
     CXXFLAGS="$CXXFLAGS -fno-strict-aliasing"
     # gcc's builtin memcmp is slower than glibc's
     # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
     CXXFLAGS="$CXXFLAGS -fno-builtin-memcmp"
     # We don't want floating-point math functions to set errno or trap
     CXXFLAGS="$CXXFLAGS -fno-math-errno -fno-trapping-math"
 fi
 AC_SUBST([MSVC2013_COMPAT_CFLAGS])
@@ -398,6 +387,17 @@ fi
 AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
 AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
 dnl Check for new-style atomic builtins
 AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
 int main() {
     int n;
     return __atomic_load_n(&n, __ATOMIC_ACQUIRE);
 }]])], GCC_ATOMIC_BUILTINS_SUPPORTED=1)
 if test "x$GCC_ATOMIC_BUILTINS_SUPPORTED" = x1; then
     DEFINES="$DEFINES -DUSE_GCC_ATOMIC_BUILTINS"
 fi
 AM_CONDITIONAL([GCC_ATOMIC_BUILTINS_SUPPORTED], [test x$GCC_ATOMIC_BUILTINS_SUPPORTED = x1])
 dnl Check for Endianness
 AC_C_BIGENDIAN(
    little_endian=no,
@@ -783,6 +783,7 @@ if test "x$enable_asm" = xyes; then
     esac
 fi
 AC_HEADER_MAJOR
 AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
 AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
 AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
@@ -825,9 +826,21 @@ dnl to -pthread, which causes problems if we need -lpthread to appear in
 dnl pkgconfig files.
 test -z "$PTHREAD_LIBS" && PTHREAD_LIBS="-lpthread"
 PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
 AC_SUBST(PTHREADSTUBS_CFLAGS)
 AC_SUBST(PTHREADSTUBS_LIBS)
 dnl pthread-stubs is mandatory on targets where it exists
 case "$host_os" in
 cygwin* )
     pthread_stubs_possible="no"
     ;;
 * )
     pthread_stubs_possible="yes"
     ;;
 esac
 if test "x$pthread_stubs_possible" = xyes; then
     PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
     AC_SUBST(PTHREADSTUBS_CFLAGS)
     AC_SUBST(PTHREADSTUBS_LIBS)
 fi
 dnl SELinux awareness.
 AC_ARG_ENABLE([selinux],
@@ -870,6 +883,32 @@ AC_ARG_ENABLE([dri],
     [enable_dri="$enableval"],
     [enable_dri=yes])
 AC_ARG_ENABLE([gallium-extra-hud],
     [AS_HELP_STRING([--enable-gallium-extra-hud],
         [enable HUD block/NIC I/O HUD stats support @<:@default=disabled@:>@])],
     [enable_gallium_extra_hud="$enableval"],
     [enable_gallium_extra_hud=no])
 AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes)
 if test "x$enable_gallium_extra_hud" = xyes ; then
     DEFINES="${DEFINES} -DHAVE_GALLIUM_EXTRA_HUD=1"
 fi
 #TODO: no pkgconfig .pc available for libsensors.
 #PKG_CHECK_MODULES([LIBSENSORS], [libsensors >= $LIBSENSORS_REQUIRED], [enable_lmsensors=yes], [enable_lmsensors=no])
 AC_ARG_ENABLE([lmsensors],
     [AS_HELP_STRING([--enable-lmsensors],
         [enable HUD lmsensor support @<:@default=disabled@:>@])],
     [enable_lmsensors="$enableval"],
     [enable_lmsensors=no])
 AM_CONDITIONAL(HAVE_LIBSENSORS, test "x$enable_lmsensors" = xyes)
 if test "x$enable_lmsensors" = xyes ; then
     DEFINES="${DEFINES} -DHAVE_LIBSENSORS=1"
     LIBSENSORS_LDFLAGS="-lsensors"
 else
     LIBSENSORS_LDFLAGS=""
 fi
 AC_SUBST(LIBSENSORS_LDFLAGS)
 case "$host_os" in
 linux*)
     dri3_default=yes
@@ -1101,11 +1140,20 @@ if test "x$have_libdrm" = xyes; then
 	DEFINES="$DEFINES -DHAVE_LIBDRM"
 fi
 require_libdrm() {
     if test "x$have_libdrm" != xyes; then
        AC_MSG_ERROR([$1 requires libdrm >= $LIBDRM_REQUIRED])
     fi
 }
 # Select which platform-dependent DRI code gets built
 case "$host_os" in
 darwin*)
     dri_platform='apple' ;;
 gnu*|cygwin*)
 cygwin*)
     dri_platform='windows' ;;
 gnu*)
     dri_platform='none' ;;
 *)
     dri_platform='drm' ;;
@@ -1121,6 +1169,9 @@ AM_CONDITIONAL(HAVE_DRISW_KMS, test "x$have_drisw_kms" = xyes )
 AM_CONDITIONAL(HAVE_DRI2, test "x$enable_dri" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_DRI3, test "x$enable_dri3" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xapple )
 AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
 AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = xyes )
 AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows )
 AC_ARG_ENABLE([shared-glapi],
     [AS_HELP_STRING([--enable-shared-glapi],
@@ -1301,23 +1352,9 @@ if test "x$with_sha1" = "x"; then
     fi
 fi
 AM_CONDITIONAL([ENABLE_SHADER_CACHE], [test x$enable_shader_cache = xyes])
 case "$host_os" in
 linux*)
     need_pci_id=yes ;;
 *)
     need_pci_id=no ;;
 esac
 PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
                   have_libudev=yes, have_libudev=no)
 AC_ARG_ENABLE([sysfs],
     [AS_HELP_STRING([--enable-sysfs],
         [enable /sys PCI identification @<:@default=disabled@:>@])],
     [have_sysfs="$enableval"],
     [have_sysfs=no]
 )
 if test "x$enable_shader_cache" = "xyes"; then
    AC_DEFINE([ENABLE_SHADER_CACHE], [1], [Enable shader cache])
 fi
 if test "x$enable_dri" = xyes; then
     if test "$enable_static" = yes; then
@@ -1361,9 +1398,7 @@ xdri)
     if test x"$driglx_direct" = xyes; then
         if test x"$dri_platform" = xdrm ; then
             DEFINES="$DEFINES -DGLX_USE_DRM"
             if test "x$have_libdrm" != xyes; then
                AC_MSG_ERROR([Direct rendering requires libdrm >= $LIBDRM_REQUIRED])
             fi
             require_libdrm "Direct rendering"
             PKG_CHECK_MODULES([DRI2PROTO], [dri2proto >= $DRI2PROTO_REQUIRED])
             GL_PC_REQ_PRIV="$GL_PC_REQ_PRIV libdrm >= $LIBDRM_REQUIRED"
@@ -1385,6 +1420,9 @@ xdri)
         if test x"$dri_platform" = xapple ; then
             DEFINES="$DEFINES -DGLX_USE_APPLEGL"
         fi
         if test x"$dri_platform" = xwindows ; then
             DEFINES="$DEFINES -DGLX_USE_WINDOWSGL"
         fi
     fi
     # add xf86vidmode if available
@@ -1404,17 +1442,6 @@ xdri)
     ;;
 esac
 have_pci_id=no
 if test "$have_libudev" = yes; then
     DEFINES="$DEFINES -DHAVE_LIBUDEV"
     have_pci_id=yes
 fi
 if test "$have_sysfs" = yes; then
     DEFINES="$DEFINES -DHAVE_SYSFS"
     have_pci_id=yes
 fi
 # This is outside the case (above) so that it is invoked even for non-GLX
 # builds.
 AM_CONDITIONAL(HAVE_XF86VIDMODE, test "x$HAVE_XF86VIDMODE" = xyes)
@@ -1523,10 +1550,6 @@ if test "x$enable_dri" = xyes; then
             DEFINES="$DEFINES -DHAVE_DRI3"
         fi
         if test "x$have_pci_id" != xyes; then
             AC_MSG_ERROR([libudev-dev or sysfs required for building DRI])
         fi
         case "$host_cpu" in
         powerpc* | sparc*)
             # Build only the drivers for cards that exist on PowerPC/sparc
@@ -1639,11 +1662,18 @@ esac
 AC_ARG_WITH([vulkan-icddir],
     [AS_HELP_STRING([--with-vulkan-icddir=DIR],
         [directory for the Vulkan driver icd files @<:@${sysconfdir}/vulkan/icd.d@:>@])],
         [directory for the Vulkan driver icd files @<:@${datarootdir}/vulkan/icd.d@:>@])],
     [VULKAN_ICD_INSTALL_DIR="$withval"],
     [VULKAN_ICD_INSTALL_DIR='${datarootdir}/vulkan/icd.d'])
 AC_SUBST([VULKAN_ICD_INSTALL_DIR])
 AC_ARG_ENABLE([vulkan-icd-full-driver-path],
    [AS_HELP_STRING([--disable-vulkan-icd-full-driver-path],
                    [create Vulkan ICD files with just a .so name and no path])],
    [vulkan_icd_driver_path="$enableval"],
    [vulkan_icd_driver_path="yes"])
 AM_CONDITIONAL(VULKAN_ICD_DRIVER_PATH, test "x$vulkan_icd_driver_path" = xyes)
 if test -n "$with_vulkan_drivers"; then
     VULKAN_DRIVERS=`IFS=', '; echo $with_vulkan_drivers`
     for driver in $VULKAN_DRIVERS; do
@@ -1658,6 +1688,13 @@ if test -n "$with_vulkan_drivers"; then
             HAVE_INTEL_VULKAN=yes;
             ;;
         xradeon)
             PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
             HAVE_RADEON_VULKAN=yes;
             if test "x$with_sha1" == "x"; then
                 AC_MSG_ERROR([radv vulkan driver requires SHA1])
             fi
 	    ;;
         *)
             AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
             ;;
@@ -1727,10 +1764,6 @@ if test "x$enable_gbm" = xauto; then
     esac
 fi
 if test "x$enable_gbm" = xyes; then
     if test "x$need_pci_id$have_pci_id" = xyesno; then
         AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED or sysfs])
     fi
     if test "x$enable_dri" = xyes; then
         if test "x$enable_shared_glapi" = xno; then
             AC_MSG_ERROR([gbm_dri requires --enable-shared-glapi])
@@ -1745,11 +1778,8 @@ if test "x$enable_gbm" = xyes; then
     fi
 fi
 AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes)
 if test "x$need_pci_id$have_libudev" = xyesyes; then
     GBM_PC_REQ_PRIV="libudev >= $LIBUDEV_REQUIRED"
 else
     GBM_PC_REQ_PRIV=""
 fi
 # FINISHME: GBM has a number of dependencies which we should add below
 GBM_PC_REQ_PRIV=""
 GBM_PC_LIB_PRIV="$DLOPEN_LIBS"
 AC_SUBST([GBM_PC_REQ_PRIV])
 AC_SUBST([GBM_PC_LIB_PRIV])
@@ -1997,8 +2027,8 @@ if test "x$with_egl_platforms" != "x" -a "x$enable_egl" != xyes; then
     AC_MSG_ERROR([cannot build egl state tracker without EGL library])
 fi
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland_scanner],
         WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland_scanner`,
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
         WAYLAND_SCANNER=`$PKG_CONFIG --variable=wayland_scanner wayland-scanner`,
         WAYLAND_SCANNER='')
 if test "x$WAYLAND_SCANNER" = x; then
     AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner])
@@ -2009,8 +2039,6 @@ egl_platforms=`IFS=', '; echo $with_egl_platforms`
 for plat in $egl_platforms; do
 	case "$plat" in
 	wayland)
 		test "x$have_libdrm" != xyes &&
 			AC_MSG_ERROR([EGL platform wayland requires libdrm >= $LIBDRM_REQUIRED])
 		PKG_CHECK_MODULES([WAYLAND], [wayland-client >= $WAYLAND_REQUIRED wayland-server >= $WAYLAND_REQUIRED])
@@ -2026,13 +2054,9 @@ for plat in $egl_platforms; do
 	drm)
 		test "x$enable_gbm" = "xno" &&
 			AC_MSG_ERROR([EGL platform drm needs gbm])
 		test "x$have_libdrm" != xyes &&
 			AC_MSG_ERROR([EGL platform drm requires libdrm >= $LIBDRM_REQUIRED])
 		;;
 	surfaceless)
 		test "x$have_libdrm" != xyes &&
 			AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
 		;;
 	android)
@@ -2043,10 +2067,11 @@ for plat in $egl_platforms; do
 		;;
 	esac
         case "$plat$need_pci_id$have_pci_id" in
                 waylandyesno|drmyesno)
                     AC_MSG_ERROR([cannot build $plat platform without udev >= $LIBUDEV_REQUIRED or sysfs]) ;;
         esac
 	case "$plat" in
 	wayland|drm|surfaceless)
 		require_libdrm "Platform $plat"
 		;;
 	esac
 done
 # libEGL wants to default to the first platform specified in
@@ -2141,7 +2166,7 @@ if test "x$enable_gallium_llvm" = xauto; then
     i*86|x86_64|amd64) enable_gallium_llvm=yes;;
     esac
 fi
 if test "x$enable_gallium_llvm" = xyes; then
 if test "x$enable_gallium_llvm" = xyes || test "x$HAVE_RADEON_VULKAN" = xyes; then
     if test -n "$llvm_prefix"; then
         AC_PATH_TOOL([LLVM_CONFIG], [llvm-config], [no], ["$llvm_prefix/bin"])
     else
@@ -2182,8 +2207,12 @@ if test "x$enable_gallium_llvm" = xyes; then
         LLVM_COMPONENTS="engine bitwriter mcjit mcdisassembler"
         if $LLVM_CONFIG --components | grep -q inteljitevents ; then
             LLVM_COMPONENTS="${LLVM_COMPONENTS} inteljitevents"
         fi
         if test "x$enable_opencl" = xyes; then
             llvm_check_version_for "3" "5" "0" "opencl"
             llvm_check_version_for "3" "6" "0" "opencl"
             LLVM_COMPONENTS="${LLVM_COMPONENTS} all-targets ipo linker instrumentation"
             LLVM_COMPONENTS="${LLVM_COMPONENTS} irreader option objcarcopts profiledata"
@@ -2262,12 +2291,6 @@ AC_SUBST([D3D_DRIVER_INSTALL_DIR])
 dnl
 dnl Gallium helper functions
 dnl
 gallium_require_drm() {
     if test "x$have_libdrm" != xyes; then
        AC_MSG_ERROR([$1 requires libdrm >= $LIBDRM_REQUIRED])
     fi
 }
 gallium_require_llvm() {
     if test "x$MESA_LLVM" = x0; then
         case "$host" in *gnux32) return;; esac
@@ -2277,12 +2300,6 @@ gallium_require_llvm() {
     fi
 }
 gallium_require_drm_loader() {
     if test "x$need_pci_id$have_pci_id" = xyesno; then
         AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED or sysfs])
     fi
 }
 dnl This is for Glamor. Skip this if OpenGL is disabled.
 require_egl_drm() {
     if test "x$enable_opengl" = xno; then
@@ -2307,10 +2324,7 @@ radeon_llvm_check() {
     else
         amdgpu_llvm_target_name='amdgpu'
     fi
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
     llvm_check_version_for "3" "6" "0" $1
     llvm_check_version_for $2 $3 $4 $1
     if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
         AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
     fi
@@ -2321,6 +2335,13 @@ radeon_llvm_check() {
     fi
 }
 radeon_gallium_llvm_check() {
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
     radeon_llvm_check $*
 }
 swr_llvm_check() {
     gallium_require_llvm $1
     if test ${LLVM_VERSION_INT} -lt 306; then
@@ -2331,6 +2352,45 @@ swr_llvm_check() {
     fi
 }
 swr_require_cxx_feature_flags() {
     feature_name="$1"
     preprocessor_test="$2"
     option_list="$3"
     output_var="$4"
     AC_MSG_CHECKING([whether $CXX supports $feature_name])
     AC_LANG_PUSH([C++])
     save_CXXFLAGS="$CXXFLAGS"
     save_IFS="$IFS"
     IFS=","
     found=0
     for opts in $option_list
     do
         unset IFS
         CXXFLAGS="$opts $save_CXXFLAGS"
         AC_COMPILE_IFELSE(
             [AC_LANG_PROGRAM(
                 [   #if !($preprocessor_test)
                     #error
                     #endif
                 ])],
             [found=1; break],
             [])
         IFS=","
     done
     IFS="$save_IFS"
     CXXFLAGS="$save_CXXFLAGS"
     AC_LANG_POP([C++])
     if test $found -eq 1; then
         AC_MSG_RESULT([$opts])
         eval "$output_var=\$opts"
         return 0
     fi
     AC_MSG_RESULT([no])
     AC_MSG_ERROR([swr requires $feature_name support])
     return 1
 }
 dnl Duplicates in GALLIUM_DRIVERS_DIRS are removed by sorting it after this block
 if test -n "$with_gallium_drivers"; then
     gallium_drivers=`IFS=', '; echo $with_gallium_drivers`
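
Reading aid for the `swr_require_cxx_feature_flags` helper in the hunk above: it walks a comma-separated list of candidate flag sets (an empty first entry means "no extra flags needed"), compiles a preprocessor check with each, and keeps the first set that passes. Below is a minimal Python sketch of that search. The `probe` callable is a hypothetical stand-in for `AC_COMPILE_IFELSE`, and none of these names exist in Mesa. Where the configure helper aborts with `AC_MSG_ERROR`, the sketch returns `None`.

```python
def first_working_flags(option_list, probe):
    """Return the first flag set accepted by probe, or None if none works.

    option_list: comma-separated candidates, e.g. ",-mavx,-march=core-avx";
                 an empty entry means "try with no extra flags".
    probe:       hypothetical stand-in for AC_COMPILE_IFELSE; takes a flag
                 string and returns True if the feature test compiles.
    """
    for opts in option_list.split(","):
        if probe(opts):
            return opts
    return None


# Fake compiler for illustration only: pretends __AVX__ is defined
# solely when -mavx is passed explicitly.
def fake_avx_probe(flags):
    return "-mavx" in flags.split()


chosen = first_working_flags(",-mavx,-march=core-avx", fake_avx_probe)
print(repr(chosen))
```

Putting the empty candidate first means a compiler that already enables the feature by default gets no extra flags.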
@@ -2338,35 +2398,30 @@ if test -n "$with_gallium_drivers"; then
         case "x$driver" in
         xsvga)
             HAVE_GALLIUM_SVGA=yes
             gallium_require_drm "svga"
             gallium_require_drm_loader
             require_libdrm "svga"
             ;;
         xi915)
             HAVE_GALLIUM_I915=yes
             PKG_CHECK_MODULES([INTEL], [libdrm_intel >= $LIBDRM_INTEL_REQUIRED])
             gallium_require_drm "Gallium i915"
             gallium_require_drm_loader
             require_libdrm "Gallium i915"
             ;;
         xilo)
             HAVE_GALLIUM_ILO=yes
             PKG_CHECK_MODULES([INTEL], [libdrm_intel >= $LIBDRM_INTEL_REQUIRED])
             gallium_require_drm "Gallium i965/ilo"
             gallium_require_drm_loader
             require_libdrm "Gallium i965/ilo"
             ;;
         xr300)
             HAVE_GALLIUM_R300=yes
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             gallium_require_drm "Gallium R300"
             gallium_require_drm_loader
             require_libdrm "Gallium R300"
             gallium_require_llvm "Gallium R300"
             ;;
         xr600)
             HAVE_GALLIUM_R600=yes
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             gallium_require_drm "Gallium R600"
             gallium_require_drm_loader
             require_libdrm "Gallium R600"
             if test "x$enable_opencl" = xyes; then
                 radeon_llvm_check "r600g"
                 radeon_gallium_llvm_check "r600g" "3" "6" "0"
                 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
             fi
             ;;
@@ -2374,22 +2429,19 @@ if test -n "$with_gallium_drivers"; then
             HAVE_GALLIUM_RADEONSI=yes
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
             gallium_require_drm "radeonsi"
             gallium_require_drm_loader
             radeon_llvm_check "radeonsi"
             require_libdrm "radeonsi"
             radeon_gallium_llvm_check "radeonsi" "3" "6" "0"
             require_egl_drm "radeonsi"
             ;;
         xnouveau)
             HAVE_GALLIUM_NOUVEAU=yes
             PKG_CHECK_MODULES([NOUVEAU], [libdrm_nouveau >= $LIBDRM_NOUVEAU_REQUIRED])
             gallium_require_drm "nouveau"
             gallium_require_drm_loader
             require_libdrm "nouveau"
             ;;
         xfreedreno)
             HAVE_GALLIUM_FREEDRENO=yes
             PKG_CHECK_MODULES([FREEDRENO], [libdrm_freedreno >= $LIBDRM_FREEDRENO_REQUIRED])
             gallium_require_drm "freedreno"
             gallium_require_drm_loader
             require_libdrm "freedreno"
             ;;
         xswrast)
             HAVE_GALLIUM_SOFTPIPE=yes
@@ -2400,36 +2452,27 @@ if test -n "$with_gallium_drivers"; then
         xswr)
             swr_llvm_check "swr"
             AC_MSG_CHECKING([whether $CXX supports c++11/AVX/AVX2])
             AVX_CXXFLAGS="-march=core-avx-i"
             AVX2_CXXFLAGS="-march=core-avx2"
             swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
                 ",-std=c++11" \
                 SWR_CXX11_CXXFLAGS
             AC_SUBST([SWR_CXX11_CXXFLAGS])
             AC_LANG_PUSH([C++])
             save_CXXFLAGS="$CXXFLAGS"
             CXXFLAGS="-std=c++11 $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([c++11 compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
                 ",-mavx,-march=core-avx" \
                 SWR_AVX_CXXFLAGS
             AC_SUBST([SWR_AVX_CXXFLAGS])
             save_CXXFLAGS="$CXXFLAGS"
             CXXFLAGS="$AVX_CXXFLAGS $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([AVX compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             save_CFLAGS="$CXXFLAGS"
             CXXFLAGS="$AVX2_CXXFLAGS $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([AVX2 compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             AC_LANG_POP([C++])
             swr_require_cxx_feature_flags "AVX2" "defined(__AVX2__)" \
                 ",-mavx2 -mfma -mbmi2 -mf16c,-march=core-avx2" \
                 SWR_AVX2_CXXFLAGS
             AC_SUBST([SWR_AVX2_CXXFLAGS])
             HAVE_GALLIUM_SWR=yes
             ;;
         xvc4)
             HAVE_GALLIUM_VC4=yes
             gallium_require_drm "vc4"
             gallium_require_drm_loader
             PKG_CHECK_MODULES([VC4], [libdrm_vc4 >= $LIBDRM_VC4_REQUIRED])
             require_libdrm "vc4"
             PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
                               [USE_VC4_SIMULATOR=yes;
@@ -2438,8 +2481,7 @@ if test -n "$with_gallium_drivers"; then
             ;;
         xvirgl)
             HAVE_GALLIUM_VIRGL=yes
             gallium_require_drm "virgl"
             gallium_require_drm_loader
             require_libdrm "virgl"
             require_egl_drm "virgl"
             ;;
         *)
@@ -2449,6 +2491,10 @@ if test -n "$with_gallium_drivers"; then
     done
 fi
 if test "x$HAVE_RADEON_VULKAN" = "xyes"; then
     radeon_llvm_check "radv" "3" "9" "0"
 fi
 dnl Set LLVM_LIBS - This is done after the driver configuration so
 dnl that drivers can add additional components to LLVM_COMPONENTS.
 dnl Previously, gallium drivers were updating LLVM_LIBS directly
@@ -2497,7 +2543,7 @@ if test "x$MESA_LLVM" != x0; then
         AC_MSG_WARN([Building mesa with statically linked LLVM may cause compilation issues])
         dnl We need to link to llvm system libs when using static libs
         dnl However, only llvm 3.5+ provides --system-libs
         if test $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; then
         if test $LLVM_VERSION_MAJOR -ge 4 -o $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; then
             LLVM_LIBS="$LLVM_LIBS `$LLVM_CONFIG --system-libs`"
         fi
     fi
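
Note on the changed `test` line in the hunk above: in common `test` implementations `-a` binds tighter than `-o`, so the new condition reads as "major >= 4, or major == 3 and minor >= 5". The Python sketch below makes that grouping explicit. The function name is hypothetical and only illustrates the predicate.

```python
def llvm_has_system_libs(major, minor):
    """True for LLVM 3.5 or newer (hypothetical helper, not Mesa code)."""
    return major >= 4 or (major == 3 and minor >= 5)


# Walk a few versions on either side of the 3.5 threshold.
for version in [(3, 4), (3, 5), (3, 9), (4, 0)]:
    print(version, llvm_has_system_libs(*version))
```

The old condition accepted only major version 3, so an LLVM 4.x build would have skipped `--system-libs`.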
@@ -2540,8 +2586,13 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
 AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
 AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
 AM_CONDITIONAL(HAVE_RADEON_VULKAN, test "x$HAVE_RADEON_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_AMD_DRIVERS, test "x$HAVE_GALLIUM_R600" = xyes -o \
                                       "x$HAVE_GALLIUM_RADEONSI" = xyes -o \
                                       "x$HAVE_RADEON_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
                                         "x$HAVE_I965_DRI" = xyes)
@@ -2560,6 +2611,8 @@ fi
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_COMMON_OSMESA, test "x$enable_osmesa" = xyes -o \
                                         "x$enable_gallium_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64)
 AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
@@ -2578,6 +2631,8 @@ VA_MINOR=`$PKG_CONFIG --modversion libva | $SED -n 's/.*\.\(.*\)\..*$/\1/p'`
 AC_SUBST([VA_MAJOR], $VA_MAJOR)
 AC_SUBST([VA_MINOR], $VA_MINOR)
 AM_CONDITIONAL(HAVE_VULKAN_COMMON, test "x$VULKAN_DRIVERS" != "x")
 AC_SUBST([XVMC_MAJOR], 1)
 AC_SUBST([XVMC_MINOR], 0)
@@ -2631,6 +2686,9 @@ CXXFLAGS="$CXXFLAGS $USER_CXXFLAGS"
 dnl Substitute the config
 AC_CONFIG_FILES([Makefile
 		src/Makefile
 		src/amd/Makefile
 		src/amd/common/Makefile
 		src/amd/vulkan/Makefile
 		src/compiler/Makefile
 		src/egl/Makefile
 		src/egl/main/egl.pc
@@ -2705,10 +2763,11 @@ AC_CONFIG_FILES([Makefile
 		src/glx/Makefile
 		src/glx/apple/Makefile
 		src/glx/tests/Makefile
 		src/glx/windows/Makefile
 		src/glx/windows/windowsdriproto.pc
 		src/gtest/Makefile
 		src/intel/Makefile
 		src/intel/genxml/Makefile
 		src/intel/isl/Makefile
 		src/intel/tools/Makefile
 		src/intel/vulkan/Makefile
 		src/loader/Makefile
 		src/mapi/Makefile
@@ -2732,16 +2791,14 @@ AC_CONFIG_FILES([Makefile
 		src/mesa/drivers/x11/Makefile
 		src/mesa/main/tests/Makefile
 		src/util/Makefile
 		src/util/tests/hash_table/Makefile])
 		src/util/tests/hash_table/Makefile
 		src/vulkan/wsi/Makefile])
 AC_OUTPUT
 # Fix up dependencies in *.Plo files, where we changed the extension of a
 # source file
 $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_blorp.Plo
 $SED -i -e 's/gen6_blorp.cpp/gen6_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen6_blorp.Plo
 $SED -i -e 's/gen7_blorp.cpp/gen7_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen7_blorp.Plo
 $SED -i -e 's/gen8_blorp.cpp/gen8_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen8_blorp.Plo
 dnl
@@ -2840,6 +2897,19 @@ else
     echo "        Gallium:         no"
 fi
 echo ""
 if test "x$enable_gallium_extra_hud" != xyes; then
     echo "        HUD extra stats: no"
 else
     echo "        HUD extra stats: yes"
 fi
 if test "x$enable_lmsensors" != xyes; then
     echo "        HUD lmsensors:   no"
 else
     echo "        HUD lmsensors:   yes"
 fi
 dnl Shader cache
 echo ""
 echo "        Shader cache:    $enable_shader_cache"

									
										2

docs/developers.html
									
												
				@@ -38,7 +38,7 @@ including:

				<p>

				Other companies including

				<a href="http://www.intellinuxgraphics.org/index.html">Intel</a>

				<a href="https://01.org/linuxgraphics">Intel</a>

				and RedHat also actively contribute to the project.

				Intel has recently contributed the new GLSL compiler in Mesa 7.9.

				</p>

									
										22

docs/devinfo.html
									
												
				@@ -252,10 +252,14 @@ check for regressions.

				<h3>Mailing Patches</h3>

				<p>

				Patches should be sent to the Mesa mailing list for review.

				When submitting a patch make sure to use git send-email rather than attaching

				patches to emails. Sending patches as attachments prevents people from being

				able to provide in-line review comments.

				Patches should be sent to the mesa-dev mailing list for review:

				<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">

				mesa-dev@lists.freedesktop.org</a>.

				When submitting a patch make sure to use

				<a href="https://git-scm.com/docs/git-send-email">git send-email</a>

				rather than attaching patches to emails. Sending patches as

				attachments prevents people from being able to provide in-line review

				comments.

				</p>

				<p>

				@@ -684,9 +688,11 @@ To add a new GL extension to Mesa you have to do at least the following.

				</li>

				<li>

				   Add a new entry to the <code>gl_extensions</code> struct in mtypes.h

				   if the extension requires driver capabilities not already exposed by

				   another extension.

				</li>

				<li>

				   Update the <code>extensions.c</code> file.

				   Add a new entry to the src/mesa/main/extensions_table.h file.

				</li>

				<li>

				   From this point, the best way to proceed is to find another extension,

				@@ -697,12 +703,18 @@ To add a new GL extension to Mesa you have to do at least the following.

				   If the new extension adds new GL state, the functions in get.c, enable.c

				   and attrib.c will most likely require new code.

				</li>

				<li>

				   To determine if the new extension is active in the current context,

				   use the auto-generated _mesa_has_##name_str() function defined in

				   src/mesa/main/extensions.h.

				</li>

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extensions functions. These

				   tests are run using 'make check'

				</li>

				</ul>

				</p>

									
										29

docs/envvars.html
									
												
				@@ -50,8 +50,17 @@ sometimes be useful for debugging end-user issues.

				   if the application generates a GL_INVALID_ENUM error, a corresponding error

				   message indicating where the error occurred, and possibly why, will be

				   printed to stderr.<br>

				   If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will

				   generate exceptions.

				   For release builds, MESA_DEBUG defaults to off (no debug output).

				   MESA_DEBUG accepts the following comma-separated list of named

				   flags, which adds extra behaviour to just set MESA_DEBUG=1:

				   <ul>

				     <li>silent - turn off debug messages. Only useful for debug builds.</li>

				     <li>flush - flush after each drawing command</li>

				     <li>incomplete_tex - extra debug messages when a texture is incomplete</li>

				     <li>incomplete_fbo - extra debug messages when a fbo is incomplete</li>

				   </ul>

				<li>MESA_LOG_FILE - specifies a file name for logging all errors, warnings,

				etc., rather than stderr

				<li>MESA_TEX_PROG - if set, implement conventional texture env modes with
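
Illustration of the `MESA_DEBUG` value format documented in the hunk above (a comma-separated list of named flags). This Python sketch only shows how such a value splits into the four documented names. It is an assumption-laden example, not how Mesa itself parses the variable, and `parse_debug_flags` is a hypothetical name.

```python
import os

# Flag names taken from the documentation text in the hunk above.
KNOWN_FLAGS = {"silent", "flush", "incomplete_tex", "incomplete_fbo"}


def parse_debug_flags(value):
    """Split a comma-separated flag string into (known, unknown) name sets."""
    names = {name.strip() for name in value.split(",") if name.strip()}
    return names & KNOWN_FLAGS, names - KNOWN_FLAGS


# Fall back to a sample value when MESA_DEBUG is not set in the environment.
known, unknown = parse_debug_flags(
    os.environ.get("MESA_DEBUG", "flush,incomplete_fbo"))
print(sorted(known), sorted(unknown))
```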

				@@ -144,11 +153,10 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   <li>bat - emit batch information</li>

				   <li>pix - emit messages about pixel operations</li>

				   <li>buf - emit messages about buffer objects</li>

				   <li>reg - emit messages about regions</li>

				   <li>fbo - emit messages about framebuffers</li>

				   <li>fs - dump shader assembly for fragment shaders</li>

				   <li>gs - dump shader assembly for geometry shaders</li>

				   <li>sync - emit messages about synchronization</li>

				   <li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>

				   <li>prim - emit messages about drawing primitives</li>

				   <li>vert - emit messages about vertex assembly</li>

				   <li>dri - emit messages about the DRI interface</li>

				@@ -163,9 +171,18 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				   <li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>

				   <li>ann - annotate IR in assembly dumps</li>

				   <li>no8 - don't generate SIMD8 fragment shader</li>

				   <li>vec4 - force vec4 mode in vertex shader</li>

				   <li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>

				   <li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>

				   <li>cs - dump shader assembly for compute shaders</li>

				   <li>hex - print instruction hex dump with the disassembly</li>

				   <li>nocompact - disable instruction compaction</li>

				   <li>tcs - dump shader assembly for tessellation control shaders</li>

				   <li>tes - dump shader assembly for tessellation evaluation shaders</li>

				   <li>l3 - emit messages about the new L3 state during transitions</li>

				   <li>do32 - generate compute shader SIMD32 programs even if workgroup size doesn't exceed the SIMD16 limit</li>

				   <li>norbc - disable single sampled render buffer compression</li>

				</ul>

				</ul>

				@@ -198,8 +215,10 @@ Mesa EGL supports different sets of environment variables.  See the

				<li>GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal.

				    Especially useful to toggle hud at specific points of application and

				    disable for unencumbered viewing the rest of the time. For example, set

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_SIGNAL_TOGGLE to 10 (SIGUSR1).

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).

				    Use kill -10 <pid> to toggle the hud as desired.
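The toggle workflow described above can be sketched in Python as follows. The helper names are hypothetical, and `GALLIUM_HUD=fps` is assumed here only so that there is something to display. The sketch uses `signal.SIGUSR1` instead of a hard-coded 10 because the numeric value of SIGUSR1 differs between platforms; it only runs on Unix-like systems.

```python
import os
import signal


def hud_toggle_env(base=None, hud="fps"):
    """Environment for starting an application with the HUD hidden but toggleable."""
    env = dict(os.environ if base is None else base)
    env["GALLIUM_HUD"] = hud  # assumption: a simple frames-per-second graph
    env["GALLIUM_HUD_VISIBLE"] = "false"
    # Use the platform's own number for SIGUSR1 (10 on common Linux architectures).
    env["GALLIUM_HUD_TOGGLE_SIGNAL"] = str(int(signal.SIGUSR1))
    return env


def toggle_hud(pid):
    """Equivalent of `kill -USR1 <pid>`: ask the process to show or hide the HUD."""
    os.kill(pid, signal.SIGUSR1)
```

A launcher would start the application with `hud_toggle_env()` as its environment and later call `toggle_hud(process.pid)` at the points of interest.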

				<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for

				    choosing one of the software renderers "softpipe", "llvmpipe" or "swr".

				<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.

				    rather than stderr.

				<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment

									
										2

docs/faq.html
									
												
				@@ -57,7 +57,7 @@ drivers for X.org.

				<ul>

				  <li>See the <a href="http://dri.freedesktop.org/">DRI website</a>

				  for more information.</li>

				  <li>See <a href="http://intellinuxgraphics.org">intellinuxgraphics.org</a>

				  <li>See <a href="https://01.org/linuxgraphics">01.org</a>

				  for more information about Intel drivers.</li>

				  <li>See <a href="http://nouveau.freedesktop.org">nouveau.freedesktop.org</a>

				  for more information about Nouveau drivers.</li>

204

docs/GL3.txt → docs/features.txt


@@ -107,11 +107,11 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (swr)
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
   GL_ARB_draw_buffers_blend                             DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965)
   GL_ARB_draw_buffers_blend                             DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
@@ -124,154 +124,214 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965, nv50)
   GL_ARB_shader_subroutine                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader_fp64                                DONE (llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965/gen6+, nv50)
   GL_ARB_shader_subroutine                              DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965/gen6+, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback2                            DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen8+, nvc0, r600, radeonsi
   GL_ARB_ES2_compatibility                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_64bit                            DONE (llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20 -- all DONE: radeonsi
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_texture_compression_bptc                       DONE (i965, r600)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (i965, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_internalformat_query                           DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30:
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_compute_shader                                 DONE (i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, r600, softpipe, llvmpipe)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, r600, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_multi_draw_indirect                            DONE (i965, r600, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965)
   GL_ARB_shader_image_size                              DONE (i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, i965, r600, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_view                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40:
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                               in progress (Timothy)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, r600)
   GL_ARB_clear_texture                                  DONE (i965, nv50, r600)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, llvmpipe, softpipe)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
   - specified vec4-slot component numbers               in progress
   - specified vec4-slot component numbers               DONE (i965, nv50, llvmpipe, softpipe)
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
 GL 4.5, GLSL 4.50:
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_ES3_1_compatibility                            DONE (nvc0, radeonsi)
   GL_ARB_clip_control                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_derivative_control                             DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+)
   GL_ARB_clip_control                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600)
   GL_ARB_texture_barrier                                DONE (i965, nv50, r600)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_compute_shader                                 DONE (i965/gen7+, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, r600, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965/gen7+, r600, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965/gen7+, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_stencil_texturing                              DONE (nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions             DONE (i965, nvc0, r600, radeonsi)
   GS5 Enhanced textureGather                            DONE (i965/gen7+, r600)
   GS5 Packing/bitfield/conversion functions             DONE (i965/gen6+, r600)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
       glMemoryBarrierByRegion                           DONE
       glGetTexLevelParameter[fi]v - needs updates       DONE
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                       DONE (i965, nvc0, r600, radeonsi)
       gl_HelperInvocation support                       DONE (i965, r600)
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
 GLES3.2, GLSL ES 3.2
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        not started
   GL_KHR_blend_equation_advanced                        DONE (i965)
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965)
   GL_KHR_robustness                                     DONE (i965, nvc0, radeonsi)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
   GL_OES_copy_image                                     DONE (i965)
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                started (idr)
   GL_OES_geometry_shader                                DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         not started
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0, radeonsi)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_tessellation_shader                            started (Ken)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_cube_map_array                         DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
   GL_ARB_bindless_texture                               started (airlied)
   GL_ARB_cl_event                                       not started
   GL_ARB_compute_variable_group_size                    DONE (nvc0, radeonsi)
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+)
   GL_ARB_fragment_shader_interlock                      not started
   GL_ARB_gl_spirv                                       not started
   GL_ARB_gpu_shader_int64                               started (airlied for core and Gallium, idr for i965)
   GL_ARB_indirect_parameters                            DONE (nvc0, radeonsi)
   GL_ARB_parallel_shader_compile                        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, radeonsi, softpipe, swr)
   GL_ARB_post_depth_coverage                            not started
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               not started
   GL_ARB_seamless_cubemap_per_texture                   DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
   GL_ARB_shader_atomic_counter_ops                      DONE (nvc0, radeonsi, softpipe)
   GL_ARB_shader_ballot                                  not started
   GL_ARB_shader_clock                                   DONE (i965/gen7+)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (nvc0)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
   GL_ARB_shader_viewport_layer_array                    DONE (i965/gen6+)
   GL_ARB_sparse_buffer                                  not started
   GL_ARB_sparse_texture                                 not started
   GL_ARB_sparse_texture2                                not started
   GL_ARB_sparse_texture_clamp                           not started
   GL_ARB_texture_filter_minmax                          not started
   GL_ARB_transform_feedback_overflow_query              not started
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_no_error                                       not started
   GL_KHR_texture_compression_astc_hdr                   DONE (core only)
   GL_KHR_texture_compression_astc_sliced_3d             not started
   GL_OES_depth_texture_cube_map                         DONE (all drivers that support GLSL 1.30+)
   GL_OES_EGL_image                                      DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       not started
   GL_OES_required_internalformat                        not started - GLES2 extension based on OpenGL ES 3.0 feature
   GL_OES_surfaceless_context                            DONE (all drivers)
   GL_OES_texture_compression_astc                       DONE (core only)
   GL_OES_texture_float                                  DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_float_linear                           DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float                             DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   not started - based on GL_ARB_texture_view
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi)
   GLX_ARB_context_flush_control                         not started
   GLX_ARB_robustness_application_isolation              not started
   GLX_ARB_robustness_share_group_isolation              not started
 The following extensions are not part of any OpenGL or OpenGL ES version, and
 we DO NOT WANT implementations of these extensions for Mesa.
   GL_ARB_geometry_shader4                               Superseded by GL 3.2 geometry shaders
   GL_ARB_matrix_palette                                 Superseded by GL_ARB_vertex_program
   GL_ARB_shading_language_include                       Not interesting
   GL_ARB_shadow_ambient                                 Superseded by GL_ARB_fragment_program
   GL_ARB_vertex_blend                                   Superseded by GL_ARB_vertex_program
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality

									
										4

docs/helpwanted.html
									
												
				@@ -56,8 +56,8 @@ You can find some further To-do lists here:

				<b>Common To-Do lists:</b>

				</p>

				<ul>

				  <li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/GL3.txt">

				    <b>GL3.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				  <li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">

				    <b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				  <li><a href="http://dri.freedesktop.org/wiki/MissingFunctionality">

				    <b>MissingFunctionality</b></a> - Detailed information about missing OpenGL features.</li>

				</ul>

									
										25

docs/index.html
									
												
				@@ -16,6 +16,31 @@

				<h1>News</h1>

				<h2>September 15, 2016</h2>

				<p>

				<a href="relnotes/12.0.3.html">Mesa 12.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 2, 2016</h2>

				<p>

				<a href="relnotes/12.0.2.html">Mesa 12.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 8, 2016</h2>

				<p>

				<a href="relnotes/12.0.1.html">Mesa 12.0.1</a> is released.

				This is a bug-fix release, resolving build issues in the r600 and

				radeonsi drivers.

				</p>

				<p>

				<a href="relnotes/12.0.0.html">Mesa 12.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>May 9, 2016</h2>

				<p>

				<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and

									
										25

docs/intro.html
									
												
				@@ -173,6 +173,27 @@ of the OpenGL specification is implemented.

				</p>

				<h2>Version 12.x features</h2>

				<p>

				Version 12.x of Mesa implements the OpenGL 4.3 API, but not all drivers

				support OpenGL 4.3.

				</p>

				<h2>Version 11.x features</h2>

				<p>

				Version 11.x of Mesa implements the OpenGL 4.1 API, but not all drivers

				support OpenGL 4.1.

				</p>

				<h2>Version 10.x features</h2>

				<p>

				Version 10.x of Mesa implements the OpenGL 3.3 API, but not all drivers

				support OpenGL 3.3.

				</p>

				<h2>Version 9.x features</h2>

				<p>

				Version 9.x of Mesa implements the OpenGL 3.1 API.

				@@ -182,6 +203,10 @@ community contributed features required for OpenGL 3.1.  The primary

				features added since the Mesa 8.0 release are

				GL_ARB_texture_buffer_object and GL_ARB_uniform_buffer_object.

				</p>

				<p>

				Version 9.0 of Mesa also included the first release of the Clover state

				tracker for OpenCL.

				</p>

				<h2>Version 8.x features</h2>

									
										4

docs/relnotes.html
									
												
				@@ -21,6 +21,10 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/12.0.3.html">12.0.3 release notes</a>

				<li><a href="relnotes/12.0.2.html">12.0.2 release notes</a>

				<li><a href="relnotes/12.0.1.html">12.0.1 release notes</a>

				<li><a href="relnotes/12.0.0.html">12.0.0 release notes</a>

				<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>

				<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>

				<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>

									
										5

docs/relnotes/12.0.1.html
									
												
				@@ -16,8 +16,6 @@

				<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>

				<h1>Mesa 12.0.1 Release Notes / July 8, 2016</h1>

				<p>

				Mesa 12.0.1 is a bug fix release which fixes bugs found since the 12.0.0 release.

				</p>

				@@ -33,7 +31,8 @@ because compatibility contexts are not supported.

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				28dff9c045f4305c96a875a487b9f06c7e88d910511cd6016dbddcd1f53ade0d  mesa-12.0.1.tar.gz

				bab24fb79f78c876073527f515ed871fc9c81d816f66c8a0b051d8d653896389  mesa-12.0.1.tar.xz

				</pre>

									
										403

docs/relnotes/12.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,403 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.2 Release Notes / September 2, 2016</h1>

				<p>

				Mesa 12.0.2 is a bug fix release which fixes bugs found since the 12.0.1 release.

				</p>

				<p>

				Mesa 12.0.2 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
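Because the reported version depends on the driver, an application usually checks the string returned by glGetString(GL_VERSION) before relying on 4.3 features. The sketch below only parses such a string; obtaining it requires a live GL context, which is not shown. The function names and the sample string in the usage note are illustrative assumptions.

```python
import re

# "<major>.<minor>[.<release>]" optionally followed by vendor-specific text.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:\s+(.*))?$")


def parse_gl_version(version_string):
    """Split a desktop GL_VERSION string into (major, minor, vendor_info)."""
    match = _VERSION_RE.match(version_string.strip())
    if match is None:
        raise ValueError("unrecognised GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2)), match.group(4) or ""


def supports(version_string, major, minor):
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version_string)[:2] >= (major, minor)
```

For example, `supports("3.0 Mesa 12.0.2", 4, 3)` is false, which is the case where the driver does not expose everything OpenGL 4.3 requires.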

				<h2>SHA256 checksums</h2>

				<pre>

				a08565ab1273751ebe2ffa928cbf785056594c803077c9719d0763da780f2918  mesa-12.0.2.tar.gz

				d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb  mesa-12.0.2.tar.xz

				</pre>
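A downloaded tarball can be checked against the checksums listed above with a few lines of Python. This is a minimal sketch using the standard `hashlib` module; the function names are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA256 digest matches the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For instance, `verify("mesa-12.0.2.tar.xz", "d957a5cc371dcd7ff2aa0d87492f263aece46f79352f4520039b58b1f32552cb")` should return true for an intact copy of that archive.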

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69622">Bug 69622</a> - eglTerminate then eglMakeCurrent crashes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89599">Bug 89599</a> - symbol 'x86_64_entry_start' is already defined when building with LLVM/clang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92306">Bug 92306</a> - GL Excess demo renders incorrectly on nv43</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94148">Bug 94148</a> - Framebuffer considered invalid when a draw call is done before glCheckFramebufferStatus</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96274">Bug 96274</a> - [NVC0] Failure when compiling compute shader: Assertion `bb-&gt;getFirst()-&gt;serial &lt;= bb-&gt;getExit()-&gt;serial' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96358">Bug 96358</a> - SSO: wrong interface validation between GS and VS (regression due to latest gles 3.1)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96381">Bug 96381</a> - Texture artifacts with immutable texture storage and mipmaps</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96762">Bug 96762</a> - [radeonsi,apitrace] Firewatch: nothing rendered in scrollable (text) areas</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96835">Bug 96835</a> - &quot;gallium: Force blend color to 16-byte alignment&quot; crash with &quot;-march=native -O3&quot; causes some 32bit games to crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96850">Bug 96850</a> - Crucible tests fail for 32bit mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96908">Bug 96908</a> - [radeonsi] MSAA causes graphical artifacts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96911">Bug 96911</a> - webgl2 conformance2/textures/misc/tex-mipmap-levels.html crashes 12.1 Intel driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96971">Bug 96971</a> - invariant qualifier is not valid for shader inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97039">Bug 97039</a> - The Talos Principle and Serious Sam 3 GPU faults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97207">Bug 97207</a> - [IVY BRIDGE] Fragment shader discard writing to depth</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97214">Bug 97214</a> - X not running with error &quot;Failed to make EGL context current&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97225">Bug 97225</a> - [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97231">Bug 97231</a> - GL_DEPTH_CLAMP doesn't clamp to the far plane</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97307">Bug 97307</a> - glsl/glcpp/tests/glcpp-test regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97331">Bug 97331</a> - glDrawElementsBaseVertex doesn't work in display list on i915</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97351">Bug 97351</a> - DrawElementsBaseVertex with VBO ignores base vertex on Intel GMA 9xx in some cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97426">Bug 97426</a> - glScissor gives vertically inverted result</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97476">Bug 97476</a> - Shader binaries should not be stored in the PipelineCache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97567">Bug 97567</a> - [SNB, ILK] ctl, piglit regressions in mesa 12.0.2rc1</li>

				</ul>

				<h2>Changes</h2>

				<p>Andreas Boll (1):</p>

				<ul>

				  <li>configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too</li>

				</ul>

				<p>Bernard Kilarski (1):</p>

				<ul>

				  <li>glx: fix error code when there is no context bound</li>

				</ul>

				<p>Brian Paul (4):</p>

				<ul>

				  <li>svga: handle mismatched number of samplers, sampler views</li>

				  <li>mesa: use _mesa_clear_texture_image() in clear_texture_fields()</li>

				  <li>swrast: fix incorrectly positioned putImage() in swrast driver</li>

				  <li>mesa: fix format conversion bug in get_tex_rgba_uncompressed()</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965: Fix miptree layout for EGLImage-based renderbuffers</li>

				  <li>i965: Respect miptree offsets in intel_readpixels_tiled_memcpy()</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>st/mesa: fix reference counting bug in st_vdpau</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>swr: Refactor checks for compiler feature flags</li>

				</ul>

				<p>Daniel Scharrer (1):</p>

				<ul>

				  <li>mesa: Fix fixed function spot lighting on newer hardware (again)</li>

				</ul>

				<p>Dave Airlie (2):</p>

				<ul>

				  <li>anv: fix writemask on blit fragment shader.</li>

				  <li>st/glsl_to_tgsi: fix st_src_reg_for_double constant.</li>

				</ul>

				<p>Emil Velikov (15):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.1</li>

				  <li>mesa: automake: list builddir before srcdir</li>

				  <li>mesa: scons: list builddir before srcdir</li>

				  <li>i965: store reference to the context within struct brw_fence (v2)</li>

				  <li>anv: remove internal 'validate' layer</li>

				  <li>anv: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>

				  <li>anv: automake: build with -Bsymbolic</li>

				  <li>anv: do not export the Vulkan API</li>

				  <li>anv: remove dummy VK_DEBUG_MARKER_EXT entry points</li>

				  <li>isl: automake: use VISIBILITY_CFLAGS to restrict symbol visibility</li>

				  <li>cherry-ignore: temporary(?) drop "a4xx: make sure to actually clamp depth"</li>

				  <li>i915: Check return value of screen-&gt;image.loader-&gt;getBuffers</li>

				  <li>Revert "i965/miptree: Set logical_depth0 == 6 for cube maps"</li>

				  <li>glx/glvnd: list the strcmp arguments in correct order</li>

				  <li>Update version to 12.0.2</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Close our screen's fd on screen close.</li>

				  <li>vc4: Disable early Z with computed depth.</li>

				  <li>vc4: Fix a leak of the src[] array of VPM reads in optimization.</li>

				  <li>vc4: Fix leak of the bo_handles table.</li>

				</ul>

				<p>Francisco Jerez (3):</p>

				<ul>

				  <li>i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.</li>

				  <li>i965: Make room in the batch epilogue for three more pipe controls.</li>

				  <li>i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.</li>

				</ul>

				<p>Haixia Shi (1):</p>

				<ul>

				  <li>platform_android: prevent deadlock in droid_swap_buffers</li>

				</ul>

				<p>Ian Romanick (5):</p>

				<ul>

				  <li>mesa: Strip arrayness from interface block names in some IO validation</li>

				  <li>glsl: Pack integer and double varyings as flat even if interpolation mode is none</li>

				  <li>glcpp: Track the actual version instead of just the version_resolved flag</li>

				  <li>glcpp: Only disallow #undef of pre-defined macros on GLSL ES &gt;= 3.00 shaders</li>

				  <li>glsl: Mark cube map array sampler types as reserved in GLSL ES 3.10</li>

				</ul>

				<p>Ilia Mirkin (16):</p>

				<ul>

				  <li>mesa: etc2 online compression is unsupported, don't attempt it</li>

				  <li>st/mesa: return appropriate mesa format for ETC texture formats</li>

				  <li>mesa: set _NEW_BUFFERS when updating texture bound to current buffers</li>

				  <li>nv50,nvc0: srgb rendering is only available for rgba/bgra</li>

				  <li>vbo: allow DrawElementsBaseVertex in display lists</li>

				  <li>gallium/util: add helper to compute zmin/zmax for a viewport state</li>

				  <li>nv50,nvc0: fix depth range when halfz is enabled</li>

				  <li>nv50/ir: fix bb positions after exit instructions</li>

				  <li>vbo: add basevertex when looking up elements for vbo splitting</li>

				  <li>a4xx: only disable depth clipping, not all clipping, when requested</li>

				  <li>nv50/ir: make sure cfg iterator always hits all blocks</li>

				  <li>main: add missing EXTRA_END in OES_sample_variables get check</li>

				  <li>nouveau: always enable at least one RC</li>

				  <li>nv30: only bail on color/depth bpp mismatch when surfaces are swizzled</li>

				  <li>a4xx: make sure to actually clamp depth as requested</li>

				  <li>gk110/ir: fix quadop dall emission</li>

				</ul>

				<p>Jan Ziak (2):</p>

				<ul>

				  <li>egl/x11: avoid using freed memory if dri2 init fails</li>

				  <li>loader: fix memory leak in loader_dri3_open</li>

				</ul>

				<p>Jason Ekstrand (31):</p>

				<ul>

				  <li>nir/spirv: Don't multiply the push constant block size by 4</li>

				  <li>anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge</li>

				  <li>glsl/types: Fix function type comparison function</li>

				  <li>glsl/types: Use _mesa_hash_data for hashing function types</li>

				  <li>genxml: Make gen6-7 blending look more like gen8</li>

				  <li>anv/pipeline: Unify blend state setup between gen7 and gen8</li>

				  <li>anv: Enable independentBlend on gen7</li>

				  <li>anv: Add an align_down_npot_u32 helper</li>

				  <li>anv: Handle VK_WHOLE_SIZE properly for buffer views</li>

				  <li>i965/miptree: Enforce that height == 1 for 1-D array textures</li>

				  <li>i965/miptree: Set logical_depth0 == 6 for cube maps</li>

				  <li>nir: Add a nir_deref_foreach_leaf helper</li>

				  <li>nir/inline: Constant-initialize local variables in the callee if needed</li>

				  <li>anv/pipeline: Set up point coord enables</li>

				  <li>i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations</li>

				  <li>i965/vec4: Make opt_vector_float reset at the top of each block</li>

				  <li>anv/blit2d: Add a format parameter to bind_dst and create_iview</li>

				  <li>anv/blit2d: Add support for RGB destinations</li>

				  <li>anv/clear: Make cmd_clear_image take an actual VkClearValue</li>

				  <li>anv/clear: Clear E5B9G9R9 images as R32_UINT</li>

				  <li>anv: Include the pipeline layout in the shader hash</li>

				  <li>isl: Allow multisampled array textures</li>

				  <li>anv/descriptor_set: memset anv_descriptor_set_layout</li>

				  <li>anv/pipeline: Fix bind maps for fragment output arrays</li>

				  <li>anv/allocator: Correctly set the number of buckets</li>

				  <li>anv/pipeline: Properly handle OOM during shader compilation</li>

				  <li>anv: Remove unused fields from anv_pipeline_bind_map</li>

				  <li>anv: Add pipeline_has_stage guards a few places</li>

				  <li>anv: Add a struct for storing a compiled shader</li>

				  <li>anv/pipeline: Add support for caching the push constant map</li>

				  <li>anv: Rework pipeline caching</li>

				</ul>

				<p>José Fonseca (2):</p>

				<ul>

				  <li>appveyor: Install pywin32 extensions.</li>

				  <li>appveyor: Force Visual Studio 2013 image.</li>

				</ul>

				<p>Kenneth Graunke (21):</p>

				<ul>

				  <li>genxml: Add CLIPMODE_* prefix to 3DSTATE_CLIP's "Clip Mode" enum values.</li>

				  <li>genxml: Add APIMODE_D3D missing enum values and improve consistency.</li>

				  <li>anv: Fix near plane clipping on Gen7/7.5.</li>

				  <li>anv: Enable early culling on Gen7.</li>

				  <li>anv: Unify 3DSTATE_CLIP code across generations.</li>

				  <li>genxml: Rename "API Rendering Disable" to "Rendering Disable".</li>

				  <li>anv: Properly call gen75_emit_state_base_address on Haswell.</li>

				  <li>i965: Include VUE handles for GS with invocations &gt; 1.</li>

				  <li>nir: Add a base const_index to shared atomic intrinsics.</li>

				  <li>i965: Fix shared atomic intrinsics to pay attention to base.</li>

				  <li>mesa: Add GL_BGRA_EXT to the list of GenerateMipmap internal formats.</li>

				  <li>mesa: Don't call GenerateMipmap if Width or Height == 0.</li>

				  <li>glsl: Delete bogus ir_set_program_inouts assert.</li>

				  <li>glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].</li>

				  <li>glsl: Fix location bias for patch variables.</li>

				  <li>glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.</li>

				  <li>mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.</li>

				  <li>nir/builder: Add bany_inequal and bany helpers.</li>

				  <li>i965: Implement the WaPreventHSTessLevelsInterference workaround.</li>

				  <li>i965: Fix execution size of scalar TCS barrier setup code.</li>

				  <li>i965: Fix barrier count shift in scalar TCS backend.</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>st/omx/enc: check uninitialized list from task release</li>

				  <li>vl/dri3: fix a memory leak from front buffer</li>

				</ul>

				<p>Marek Olšák (7):</p>

				<ul>

				  <li>glsl_to_tgsi: don't use the negate modifier in integer ops after bitcast</li>

				  <li>radeonsi: add a workaround for a compute VGPR-usage LLVM bug</li>

				  <li>winsys/amdgpu: disallow DCC with mipmaps</li>

				  <li>gallium/util: fix align64</li>

				  <li>radeonsi: only set dual source blending for MRT0</li>

				  <li>radeonsi: fix VM faults due NULL internal const buffers on CIK</li>

				  <li>radeonsi: disable SDMA texture copying on Carrizo</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>mapi: Massage code to allow clang to compile.</li>

				  <li>i965/vec4: Ignore swizzle of VGRF for use by var_range_end().</li>

				  <li>mesa: Use AC_HEADER_MAJOR to include correct header for major().</li>

				  <li>nir: Walk blocks in source code order in lower_vars_to_ssa.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>glx: Don't use current context in __glXSendError</li>

				</ul>

				<p>Miklós Máté (1):</p>

				<ul>

				  <li>vbo: set draw_id</li>

				</ul>

				<p>Nanley Chery (5):</p>

				<ul>

				  <li>anv/descriptor_set: Fix binding partly undefined descriptor sets</li>

				  <li>isl: Fix assert on raw buffer surface state size</li>

				  <li>anv/device: Fix max buffer range limits</li>

				  <li>isl: Fix isl_tiling_is_any_y()</li>

				  <li>anv/gen7_pipeline: Set PixelShaderKillPixel for discards</li>

				</ul>

				<p>Nicolai Hähnle (7):</p>

				<ul>

				  <li>radeonsi: explicitly choose center locations for 1xAA on Polaris</li>

				  <li>radeonsi: fix Polaris MSAA regression</li>

				  <li>radeonsi: ensure sample locations are set for line and polygon smoothing</li>

				  <li>st_glsl_to_tgsi: only skip over slots of an input array that are present</li>

				  <li>glsl: fix optimization of discard nested multiple levels</li>

				  <li>radeonsi: flush TC L2 cache for indirect draw data</li>

				  <li>radeonsi: add si_set_rw_buffer to be used for internal descriptors</li>

				</ul>

				<p>Nicolas Boichat (6):</p>

				<ul>

				  <li>egl/dri2: dri2_make_current: Set EGL error if bindContext fails</li>

				  <li>egl/wayland: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/surfaceless: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/drm: Set disp-&gt;DriverData to NULL on error</li>

				  <li>egl/android: Set dpy-&gt;DriverData to NULL on error</li>

				  <li>egl/dri2: Add reference count for dri2_egl_display</li>

				</ul>

				<p>Rob Herring (3):</p>

				<ul>

				  <li>Android: add missing u_math.h include path for libmesa_isl</li>

				  <li>vc4: fix vc4_resource_from_handle() stride calculation</li>

				  <li>vc4: add hash table look-up for exported dmabufs</li>

				</ul>

				<p>Samuel Pitoiset (7):</p>

				<ul>

				  <li>nvc0/ir: fix images indirect access on Fermi</li>

				  <li>nvc0: fix the driver cb size when draw parameters are used</li>

				  <li>gm107/ir: add missing NEG modifier for IADD32I</li>

				  <li>gm107/ir: make use of ADD32I for all immediates</li>

				  <li>nvc0: upload sample locations on GM20x</li>

				  <li>nvc0: invalidate textures/samplers on GK104+</li>

				  <li>nv50/ir: always emit the NDV bit for OP_QUADOP</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>Avoid overflow in 'last' variable of FindGLXFunction(...)</li>

				</ul>

				<p>Stencel, Joanna (1):</p>

				<ul>

				  <li>egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.</li>

				</ul>

				<p>Tim Rowley (2):</p>

				<ul>

				  <li>Revert "gallium: Force blend color to 16-byte alignment"</li>

				  <li>swr: switch from overriding -march to selecting features</li>

				</ul>

				<p>Tomasz Figa (8):</p>

				<ul>

				  <li>gallium/dri: Add shared glapi to LIBADD on Android</li>

				  <li>egl/android: Remove unused variables</li>

				  <li>egl/android: Check return value of dri2_get_dri_config()</li>

				  <li>egl/android: Stop leaking DRI images</li>

				  <li>gallium/winsys/kms: Fix double refcount when importing from prime FD (v2)</li>

				  <li>gallium/winsys/kms: Fully initialize kms_sw_dt at prime import time (v2)</li>

				  <li>gallium/winsys/kms: Move display target handle lookup to separate function</li>

				  <li>gallium/winsys/kms: Look up the GEM handle after importing a prime FD</li>

				</ul>

				</div>

				</body>

				</html>

									
										71

docs/relnotes/12.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,71 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 12.0.3 Release Notes / September 15, 2016</h1>

				<p>

				Mesa 12.0.3 is a bug fix release which fixes bugs found since the 12.0.2 release.

				</p>

				<p>

				Mesa 12.0.3 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz

				1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97781">Bug 97781</a> - [HSW, BYT, IVB] es2-cts.gtf.gl2extensiontests.depth_texture_cube_map.depth_texture_cube_map</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 12.0.2</li>

				  <li>Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"</li>

				  <li>Update version to 12.0.3</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>appveyor: Update winflexbison download URL.</li>

				</ul>

				</div>

				</body>

				</html>

									
										85

docs/relnotes/13.0.0.html
									
										Normal file
									
												
				@@ -0,0 +1,85 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 13.0.0 Release Notes / TBD</h1>

				<p>

				Mesa 13.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 13.0.1.

				</p>

				<p>

				Mesa 13.0.0 implements the OpenGL 4.4 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.4.  OpenGL

				4.4 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL ES 3.1 on i965/hsw</li>

				<li>OpenGL ES 3.2 on i965/gen9+ (Skylake and later)</li>

				<li>GL_ARB_ES3_1_compatibility on i965</li>

				<li>GL_ARB_ES3_2_compatibility on i965/gen8+</li>

				<li>GL_ARB_clear_texture on r600, radeonsi</li>

				<li>GL_ARB_compute_variable_group_size on nvc0, radeonsi</li>

				<li>GL_ARB_cull_distance on radeonsi</li>

				<li>GL_ARB_enhanced_layouts on i965, nv50, nvc0, radeonsi, llvmpipe, softpipe</li>

				<li>GL_ARB_indirect_parameters on radeonsi</li>

				<li>GL_ARB_query_buffer_object on radeonsi</li>

				<li>GL_ARB_shader_draw_parameters on radeonsi</li>

				<li>GL_ARB_shader_group_vote on nvc0</li>

				<li>GL_ARB_shader_viewport_layer_array on i965/gen6+</li>

				<li>GL_ARB_stencil_texturing on i965/hsw</li>

				<li>GL_ARB_texture_stencil8 on i965/hsw</li>

				<li>GL_EXT_window_rectangles on nv50, nvc0</li>

				<li>GL_KHR_blend_equation_advanced on i965</li>

				<li>GL_KHR_robustness on nvc0, radeonsi</li>

				<li>GL_KHR_texture_compression_astc_sliced_3d on i965</li>

				<li>GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe</li>

				<li>GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi</li>

				<li>GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi</li>

				<li>GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi</li>

				<li>GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi</li>

				<li>GL_OES_viewport_array on nvc0, radeonsi</li>

				<li>GL_ANDROID_extension_pack_es31a on i965/gen9+</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				TBD.

				</div>

				</body>

				</html>

120

docs/specs/EGL_MESA_platform_surfaceless.txt Normal file


@@ -0,0 +1,120 @@
 Name
     MESA_platform_surfaceless
 Name Strings
     EGL_MESA_platform_surfaceless
 Contributors
     Chad Versace <chadversary@google.com>
     Haixia Shi <hshi@google.com>
     Stéphane Marchesin <marcheu@google.com>
     Zach Reizner <zachr@chromium.org>
     Gurchetan Singh <gurchetansingh@google.com>
 Contacts
     Chad Versace <chadversary@google.com>
 Status
     DRAFT
 Version
     Version 2, 2016-10-13
 Number
     EGL Extension #TODO
 Extension Type
     EGL client extension
 Dependencies
     Requires EGL 1.5 or later; or EGL 1.4 with EGL_EXT_platform_base.
     This extension is written against the EGL 1.5 Specification (draft
     20140122).
     This extension interacts with EGL_EXT_platform_base as follows. If the
     implementation supports EGL_EXT_platform_base, then text regarding
     eglGetPlatformDisplay applies also to eglGetPlatformDisplayEXT;
     eglCreatePlatformWindowSurface to eglCreatePlatformWindowSurfaceEXT; and
     eglCreatePlatformPixmapSurface to eglCreatePlatformPixmapSurfaceEXT.
 Overview
     This extension defines a new EGL platform, the "surfaceless" platform. This
     platform's defining property is that it has no native surfaces, and hence
     neither eglCreatePlatformWindowSurface nor eglCreatePlatformPixmapSurface
     can be used. The platform is independent of any native window system.
     The platform's intended use case is for enabling OpenGL and OpenGL ES
     applications on systems where no window system exists. However, the
     platform's permitted usage is not restricted to this case.  Since the
     platform is independent of any native window system, it may also be used on
     systems where a window system is present.
 New Types
     None
 New Procedures and Functions
     None
 New Tokens
     Accepted as the <platform> argument of eglGetPlatformDisplay:
         EGL_PLATFORM_SURFACELESS_MESA           0x31DD
 Additions to the EGL Specification
     None.
 New Behavior
     To determine if the EGL implementation supports this extension, clients
     should query the EGL_EXTENSIONS string of EGL_NO_DISPLAY.
     To obtain an EGLDisplay on the surfaceless platform, call
     eglGetPlatformDisplay with <platform> set to EGL_PLATFORM_SURFACELESS_MESA.
     The <native_display> parameter must be EGL_DEFAULT_DISPLAY.
     eglCreatePlatformWindowSurface fails when called with a <display> that
     belongs to the surfaceless platform. It returns EGL_NO_SURFACE and
     generates EGL_BAD_NATIVE_WINDOW. The justification for this unconditional
     failure is that the surfaceless platform has no native windows, and
     therefore the <native_window> parameter is always invalid.
     Likewise, eglCreatePlatformPixmapSurface also fails when called with a
     <display> that belongs to the surfaceless platform.  It returns
     EGL_NO_SURFACE and generates EGL_BAD_NATIVE_PIXMAP.
     The surfaceless platform imposes no platform-specific restrictions on the
     creation of pbuffers, as eglCreatePbufferSurface has no native surface
     parameter.  Specifically, if the EGLDisplay advertises an EGLConfig whose
     EGL_SURFACE_TYPE attribute contains EGL_PBUFFER_BIT, then the EGLDisplay
     permits the creation of pbuffers with that config.
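
The extension check described at the start of this section can be sketched in C as follows. This is a minimal illustration, not part of the specification: the helper name has_extension is hypothetical, and it assumes the extension string is a space-separated list of names. It matches whole tokens only, since a plain substring search would also accept a longer name that merely begins with the one being looked for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return true if `name` appears as a complete,
 * space-delimited token in `extensions`. */
bool has_extension(const char *extensions, const char *name)
{
    if (extensions == NULL || name == NULL || *name == '\0')
        return false;

    size_t len = strlen(name);
    const char *p = extensions;

    /* Examine every occurrence of `name` and accept only those that are
     * bounded by the string start/end or by a space on both sides. */
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++;
    }
    return false;
}
```

In a real client the string would come from eglQueryString(EGL_NO_DISPLAY, EGL_EXTENSIONS). If the check for "EGL_MESA_platform_surfaceless" succeeds, the display would then be obtained with eglGetPlatformDisplay(EGL_PLATFORM_SURFACELESS_MESA, EGL_DEFAULT_DISPLAY, NULL).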
 Issues
     None.
 Revision History
     Version 2, 2016-10-13 (Chad Versace)
         - Assign enum values
         - Define interactions with EGL 1.4 and EGL_EXT_platform_base.
         - Add Gurchetan as contributor, as he implemented the pbuffer support.
     Version 1, 2016-09-23 (Chad Versace)
         - Initial version
         - Posted for review at
           https://lists.freedesktop.org/archives/mesa-dev/2016-September/129549.html

									
										8

docs/specs/MESA_configless_context.spec
									
												
				@@ -12,11 +12,12 @@ Contact

				Status

				    Proposal

				    Superseded by the functionally identical EGL_KHR_no_config_context

				    extension.

				Version

				    Version 1, February 28, 2014

				    Version 2, September 9, 2016

				Number

				@@ -121,5 +122,8 @@ Issues

				Revision History

				    Version 2, September 9, 2016

				        Defer to EGL_KHR_no_config_context (Adam Jackson)

				    Version 1, February 28, 2014

				        Initial draft (Neil Roberts)

520

docs/specs/MESA_shader_integer_functions.txt Normal file


@@ -0,0 +1,520 @@
 Name
     MESA_shader_integer_functions
 Name Strings
     GL_MESA_shader_integer_functions
 Contact
     Ian Romanick <ian.d.romanick@intel.com>
 Contributors
     All the contributors of GL_ARB_gpu_shader5
 Status
     Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
 Version
     Version 2, July 7, 2016
 Number
     TBD
 Dependencies
     This extension is written against the OpenGL 3.2 (Compatibility Profile)
     Specification.
     This extension is written against Version 1.50 (Revision 09) of the OpenGL
     Shading Language Specification.
     GLSL 1.30 is required.
     This extension interacts with ARB_gpu_shader5.
     This extension interacts with ARB_gpu_shader_fp64.
     This extension interacts with NV_gpu_shader5.
 Overview
     GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
     added functionality requires significant hardware support.  There are many
     aspects, however, that can be easily implemented on any GPU with "real"
     integer support (as opposed to simulating integers using floating point
     calculations).
     This extension provides a set of new features to the OpenGL Shading
     Language to support capabilities of these GPUs, extending the capabilities
     of version 1.30 of the OpenGL Shading Language.  Shaders
     using the new functionality provided by this extension should enable this
     functionality via the construct
       #extension GL_MESA_shader_integer_functions : require   (or enable)
     This extension provides a variety of new features for all shader types,
     including:
       * support for implicitly converting signed integer types to unsigned
         types, as well as more general implicit conversion and function
         overloading infrastructure to support new data types introduced by
         other extensions;
       * new built-in functions supporting:
         * splitting a floating-point number into a significand and exponent
           (frexp), or building a floating-point number from a significand and
           exponent (ldexp);
         * integer bitfield manipulation, including functions to find the
           position of the most or least significant set bit, count the number
           of one bits, and bitfield insertion, extraction, and reversal;
         * extended integer precision math, including add with carry, subtract
           with borrow, and extended multiplication.
     The resulting extension is a strict subset of GL_ARB_gpu_shader5.
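
As an illustration of the extended integer precision math listed above, the following C sketch models the intended arithmetic for 32-bit unsigned scalars. It is an assumption-laden model, not the extension's definition: the function names add_with_carry, sub_with_borrow and mul_extended are hypothetical stand-ins for the corresponding GLSL built-ins (GL_ARB_gpu_shader5 calls them uaddCarry, usubBorrow and umulExtended), and only the scalar unsigned case is shown.

```c
#include <stdint.h>

/* Add with carry: returns (x + y) mod 2^32.
 * *carry is set to 1 if the true sum does not fit in 32 bits, else 0. */
uint32_t add_with_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;              /* unsigned overflow wraps in C */
    *carry = (sum < x) ? 1u : 0u;      /* wrapped result is smaller than x */
    return sum;
}

/* Subtract with borrow: returns (x - y) mod 2^32.
 * *borrow is set to 1 if y > x (the true difference is negative), else 0. */
uint32_t sub_with_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (y > x) ? 1u : 0u;
    return x - y;
}

/* Extended multiplication: computes the full 64-bit product and returns
 * its most and least significant 32 bits separately. */
void mul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)(product & 0xFFFFFFFFu);
}
```

Chaining add_with_carry over the words of two multi-word integers, feeding each carry into the next word, is the usual way such a primitive is used to build wider additions.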
 IP Status
     No known IP claims.
 New Procedures and Functions
     None
 New Tokens
     None
 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
 (OpenGL Operation)
     None.
 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Rasterization)
     None.
 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Per-Fragment Operations and the Frame Buffer)
     None.
 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
 (Special Functions)
     None.
 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
 (State and State Requests)
     None.
 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
 Specification (Invariance)
     None.
 Additions to the AGL/GLX/WGL Specifications
     None.
 Modifications to The OpenGL Shading Language Specification, Version 1.50
 (Revision 09)
     Including the following line in a shader can be used to control the
     language features described in this extension:
       #extension GL_MESA_shader_integer_functions : <behavior>
     where <behavior> is as specified in section 3.3.
     New preprocessor #defines are added to the OpenGL Shading Language:
       #define GL_MESA_shader_integer_functions        1
     Modify Section 4.1.10, Implicit Conversions, p. 27
     (modify table of implicit conversions)
                                 Can be implicitly
         Type of expression        converted to
         ---------------------   -----------------
         int                     uint, float
         ivec2                   uvec2, vec2
         ivec3                   uvec3, vec3
         ivec4                   uvec4, vec4
         uint                    float
         uvec2                   vec2
         uvec3                   vec3
         uvec4                   vec4
     (modify second paragraph of the section) No implicit conversions are
     provided to convert from unsigned to signed integer types or from
     floating-point to integer types.  There are no implicit array or structure
     conversions.
     (insert before the final paragraph of the section) When performing
     implicit conversion for binary operators, there may be multiple data types
     to which the two operands can be converted.  For example, when adding an
     int value to a uint value, both values can be implicitly converted to uint
     and float.  In such cases, a floating-point type is chosen if either
     operand has a floating-point type.  Otherwise, an unsigned integer type is
     chosen if either operand has an unsigned integer type.  Otherwise, a
     signed integer type is chosen.
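
     The selection rule above can be sketched in C as follows.  This is an
     illustration only: the enum tags and the function name
     sketch_common_type are hypothetical and are not part of this extension
     or of any compiler.

```c
#include <assert.h>

/* Hypothetical tags for the three scalar base types discussed above. */
enum sketch_base_type { SKETCH_INT, SKETCH_UINT, SKETCH_FLOAT };

/* Choose the type both operands of a binary operator are converted to:
 * floating-point wins over unsigned, unsigned wins over signed. */
enum sketch_base_type
sketch_common_type(enum sketch_base_type a, enum sketch_base_type b)
{
    assert(a <= SKETCH_FLOAT && b <= SKETCH_FLOAT);
    if (a == SKETCH_FLOAT || b == SKETCH_FLOAT)
        return SKETCH_FLOAT;
    if (a == SKETCH_UINT || b == SKETCH_UINT)
        return SKETCH_UINT;
    return SKETCH_INT;
}
```

     For the int + uint example in the text, the sketch selects the unsigned
     type, since neither operand is floating-point.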
     Modify Section 5.9, Expressions, p. 57
     (modify bulleted list as follows, adding support for implicit conversion
     between signed and unsigned types)
     Expressions in the shading language are built from the following:
     * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
       types, and all matrix types.
     ...
     * The operator modulus (%) operates on signed or unsigned integer scalars
       or vectors.  If the fundamental types of the operands do not match, the
       conversions from Section 4.1.10 "Implicit Conversions" are applied to
       produce matching types.  ...
     Modify Section 6.1, Function Definitions, p. 63
     (modify description of overloading, beginning at the top of p. 64)
      Function names can be overloaded.  The same function name can be used for
      multiple functions, as long as the parameter types differ.  If a function
      name is declared twice with the same parameter types, then the return
      types and all qualifiers must also match, and it is the same function
      being declared.  For example,
        vec4 f(in vec4 x, out vec4  y);   // (A)
        vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
        vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type
        int  f(in vec4 x, out ivec4 y);  // error, only return type differs
        vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
        vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs
      When function calls are resolved, an exact type match for all the
      arguments is sought.  If an exact match is found, all other functions are
      ignored, and the exact match is used.  If no exact match is found, then
      the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
      applied to find a match.  Mismatched types on input parameters (in or
      inout or default) must have a conversion from the calling argument type
      to the formal parameter type.  Mismatched types on output parameters (out
      or inout) must have a conversion from the formal parameter type to the
      calling argument type.
      If implicit conversions can be used to find more than one matching
      function, a single best-matching function is sought.  To determine a best
      match, the conversions between calling argument and formal parameter
      types are compared for each function argument and pair of matching
      functions.  After these comparisons are performed, each pair of matching
      functions are compared.  A function definition A is considered a better
      match than function definition B if:
        * for at least one function argument, the conversion for that argument
          in A is better than the corresponding conversion in B; and
        * there is no function argument for which the conversion in B is better
          than the corresponding conversion in A.
      If a single function definition is considered a better match than every
      other matching function definition, it will be used.  Otherwise, a
      semantic error occurs and the shader will fail to compile.
      To determine whether the conversion for a single argument in one match is
      better than that for another match, the following rules are applied, in
      order:
        1. An exact match is better than a match involving any implicit
           conversion.
        2. A match involving an implicit conversion from float to double is
           better than a match involving any other implicit conversion.
        3. A match involving an implicit conversion from either int or uint to
           float is better than a match involving an implicit conversion from
           either int or uint to double.
      If none of the rules above apply to a particular pair of conversions,
      neither conversion is considered better than the other.
      For the function prototypes (A), (B), and (C) above, the following
      examples show how the rules apply to different sets of calling argument
      types:
        f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
        f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
        f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                              //   (C) not relevant, can't convert vec4 to
                              //   ivec4.  (A) better than (B) for 2nd
                              //   argument (rule 2), same on first argument.
        f(ivec4, vec4);       // NOT matched.  All three match by implicit
                              //   conversion.  (C) is better than (A) and (B)
                              //   on the first argument.  (A) is better than
                              //   (B) and (C) on the second argument.
     Modify Section 8.3, Common Functions, p. 84
     (add support for single-precision frexp and ldexp functions)
     Syntax:
       genType frexp(genType x, out genIType exp);
       genType ldexp(genType x, in genIType exp);
     The function frexp() splits each single-precision floating-point number in
     <x> into a binary significand, a floating-point number in the range [0.5,
     1.0), and an integral exponent of two, such that:
       x = significand * 2 ^ exponent
     The significand is returned by the function; the exponent is returned in
     the parameter <exp>.  For a floating-point value of zero, the significand
     and exponent are both zero.  For a floating-point value that is an
     infinity or is not a number, the results of frexp() are undefined.
     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value returned by the function and the value
     written to <exp> are vectors with the same number of components as <x>.
     The function ldexp() builds a single-precision floating-point number from
     each significand component in <x> and the corresponding integral exponent
     of two in <exp>, returning:
       significand * 2 ^ exponent
     If this product is too large to be represented as a single-precision
     floating-point value, the result is considered undefined.
     If the input <x> is a vector, this operation is performed in a
     component-wise manner; the value passed in <exp> and returned by the
     function are vectors with the same number of components as <x>.
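
     As a rough scalar illustration of the definitions above, the following C
     sketch implements both functions with integer bit operations.  It
     assumes IEEE 754 binary32 floats and handles only zero and normalized
     finite values (denormals, infinities and NaNs are rejected by assert).
     All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit-cast helpers; memcpy avoids strict-aliasing problems. */
uint32_t sketch_float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

float sketch_bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Split x into a significand with magnitude in [0.5, 1.0) and an exponent. */
float sketch_frexp(float x, int *exponent)
{
    uint32_t u = sketch_float_bits(x);
    int biased = (int)((u >> 23) & 0xFFu);

    if ((u & 0x7FFFFFFFu) == 0) {   /* zero: significand and exponent are 0 */
        *exponent = 0;
        return x;
    }
    assert(biased != 0 && biased != 0xFF);  /* no denormals, Inf or NaN */
    *exponent = biased - 126;
    /* Keep sign and mantissa, force the exponent field to that of 0.5. */
    return sketch_bits_float((u & 0x807FFFFFu) | (126u << 23));
}

/* Compute x * 2^exponent by adjusting the exponent field. */
float sketch_ldexp(float x, int exponent)
{
    uint32_t u = sketch_float_bits(x);
    int biased = (int)((u >> 23) & 0xFFu);

    if ((u & 0x7FFFFFFFu) == 0)
        return x;
    assert(biased != 0 && biased != 0xFF);
    biased += exponent;
    assert(biased > 0 && biased < 0xFF);    /* result must stay normalized */
    return sketch_bits_float((u & 0x807FFFFFu) | ((uint32_t)biased << 23));
}
```

     With this sketch, splitting 8.0 gives a significand of 0.5 and an
     exponent of 4, and passing those two values back to sketch_ldexp()
     rebuilds 8.0.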
     (add support for new integer built-in functions)
     Syntax:
       genIType bitfieldExtract(genIType value, int offset, int bits);
       genUType bitfieldExtract(genUType value, int offset, int bits);
       genIType bitfieldInsert(genIType base, genIType insert, int offset,
                               int bits);
       genUType bitfieldInsert(genUType base, genUType insert, int offset,
                               int bits);
       genIType bitfieldReverse(genIType value);
       genUType bitfieldReverse(genUType value);
       genIType bitCount(genIType value);
       genIType bitCount(genUType value);
       genIType findLSB(genIType value);
       genIType findLSB(genUType value);
       genIType findMSB(genIType value);
       genIType findMSB(genUType value);
     The function bitfieldExtract() extracts bits <offset> through
     <offset>+<bits>-1 from each component in <value>, returning them in the
     least significant bits of the corresponding component of the result.  For
     unsigned data types, the most significant bits of the result will be set
     to zero.  For signed data types, the most significant bits will be set to
     the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
     zero.  The result will be undefined if <offset> or <bits> is negative, or
     if the sum of <offset> and <bits> is greater than the number of bits used
     to store the operand.  Note that for vector versions of bitfieldExtract(),
     a single pair of <offset> and <bits> values is shared for all components.
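
     A scalar C sketch of the extraction rule, given only as an illustration.
     The function names are hypothetical, 32-bit operands are assumed, and
     the cases the text leaves undefined are rejected with assert.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned variant: the upper bits of the result are zero. */
uint32_t sketch_bitfield_extract_u(uint32_t value, int offset, int bits)
{
    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return 0;
    /* Build the mask without shifting by 32, which C leaves undefined. */
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed variant: the upper bits replicate the top bit of the field. */
int32_t sketch_bitfield_extract_i(int32_t value, int offset, int bits)
{
    uint32_t field = sketch_bitfield_extract_u((uint32_t)value, offset, bits);
    if (bits == 0)
        return 0;
    uint32_t sign = 1u << (bits - 1);
    /* (field ^ sign) - sign sign-extends from bit bits-1; the final cast
     * assumes the usual two's complement conversion. */
    return (int32_t)((field ^ sign) - sign);
}
```

     For example, extracting 4 bits at offset 4 from 0xF0 yields 15 in the
     unsigned variant and -1 in the signed variant.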
     The function bitfieldInsert() inserts the <bits> least significant bits of
     each component of <insert> into the corresponding component of <base>.
     The result will have bits numbered <offset> through <offset>+<bits>-1
     taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
     directly from the corresponding bits of <base>.  If <bits> is zero, the
     result will simply be <base>.  The result will be undefined if <offset> or
     <bits> is negative, or if the sum of <offset> and <bits> is greater than
     the number of bits used to store the operand.  Note that for vector
     versions of bitfieldInsert(), a single pair of <offset> and <bits> values
     is shared for all components.
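
     The insertion rule can be sketched for a single 32-bit component as
     follows.  The name sketch_bitfield_insert is hypothetical, and the
     undefined cases are rejected with assert.

```c
#include <assert.h>
#include <stdint.h>

/* Replace bits offset..offset+bits-1 of base with the low bits of insert. */
uint32_t sketch_bitfield_insert(uint32_t base, uint32_t insert,
                                int offset, int bits)
{
    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return base;
    uint32_t low  = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    uint32_t mask = low << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

     For example, inserting the 8 low bits of 0xAB at offset 4 into
     0xFFFF0000 gives 0xFFFF0AB0.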
     The function bitfieldReverse() reverses the bits of <value>.  The bit
     numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
     <value>, where <bits> is the total number of bits used to represent
     <value>.
     The function bitCount() returns the number of one bits in the binary
     representation of <value>.
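
     Straightforward (unoptimized) C sketches of the two functions above for
     a 32-bit component.  The names are hypothetical; real implementations
     would normally map to a hardware instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n of the result is taken from bit 31-n of value. */
uint32_t sketch_bitfield_reverse(uint32_t value)
{
    uint32_t result = 0;
    for (int n = 0; n < 32; n++)
        result |= ((value >> (31 - n)) & 1u) << n;
    return result;
}

/* Count the one bits by clearing the lowest set bit until none remain. */
int sketch_bit_count(uint32_t value)
{
    int count = 0;
    while (value != 0) {
        value &= value - 1u;
        count++;
    }
    assert(count <= 32);
    return count;
}
```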
     The function findLSB() returns the bit number of the least significant one
     bit in the binary representation of <value>.  If <value> is zero, -1 will
     be returned.
     The function findMSB() returns the bit number of the most significant bit
     in the binary representation of <value>.  For positive integers, the
     result will be the bit number of the most significant one bit.  For
     negative integers, the result will be the bit number of the most
     significant zero bit.  For a <value> of zero or negative one, -1 will be
     returned.
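
     The following C sketch mirrors the findLSB() and findMSB() rules for a
     32-bit component, including the signed case where the most significant
     zero bit is reported for negative inputs.  The function names are
     hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit number of the least significant one bit, or -1 for zero. */
int sketch_find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    int n = 0;
    while ((value & 1u) == 0) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Unsigned: bit number of the most significant one bit, or -1 for zero. */
int sketch_find_msb_u(uint32_t value)
{
    int n = -1;
    while (value != 0) {
        value >>= 1;
        n++;
    }
    assert(n < 32);
    return n;
}

/* Signed: for negative inputs, look for the most significant zero bit by
 * inverting the bits first.  Both 0 and -1 therefore return -1. */
int sketch_find_msb_i(int32_t value)
{
    uint32_t u = (uint32_t)value;
    if (value < 0)
        u = ~u;
    return sketch_find_msb_u(u);
}
```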
     (support for unsigned integer add/subtract with carry-out)
     Syntax:
       genUType uaddCarry(genUType x, genUType y, out genUType carry);
       genUType usubBorrow(genUType x, genUType y, out genUType borrow);
     The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
     <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
     the sum was less than 2^32, or one otherwise.
     The function usubBorrow() subtracts the 32-bit unsigned integer or vector
     <y> from <x>, returning the difference if non-negative or 2^32 plus the
     difference, otherwise.  The value <borrow> is set to zero if x >= y, or
     one otherwise.
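
     A scalar C sketch of both functions, relying on the fact that unsigned
     arithmetic in C wraps modulo 2^32 for uint32_t.  The names are
     hypothetical and the out parameters are modeled as pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Returns (x + y) mod 2^32; *carry is 1 if the true sum reached 2^32. */
uint32_t sketch_uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    assert(carry != 0);
    uint32_t sum = x + y;               /* wraps modulo 2^32 */
    *carry = (sum < x) ? 1u : 0u;       /* wrapped iff result is below x */
    return sum;
}

/* Returns x - y, or 2^32 + (x - y) when x < y; *borrow is 1 in that case. */
uint32_t sketch_usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    assert(borrow != 0);
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;                       /* wraps modulo 2^32 */
}
```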
     (support for signed and unsigned multiplies, with 32-bit inputs and a
     64-bit result spanning two 32-bit outputs)
     Syntax:
       void umulExtended(genUType x, genUType y, out genUType msb,
                         out genUType lsb);
       void imulExtended(genIType x, genIType y, out genIType msb,
                         out genIType lsb);
     The functions umulExtended() and imulExtended() multiply 32-bit unsigned
     or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
     32 least significant bits are returned in <lsb>; the 32 most significant
     bits are returned in <msb>.
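
     On a host with 64-bit integers, the behavior can be sketched in C by
     widening the operands before multiplying.  The names are hypothetical,
     the out parameters are modeled as pointers, and the signed casts assume
     the usual two's complement conversion.

```c
#include <assert.h>
#include <stdint.h>

void sketch_umul_extended(uint32_t x, uint32_t y,
                          uint32_t *msb, uint32_t *lsb)
{
    assert(msb != 0 && lsb != 0);
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

void sketch_imul_extended(int32_t x, int32_t y,
                          int32_t *msb, int32_t *lsb)
{
    assert(msb != 0 && lsb != 0);
    int64_t product = (int64_t)x * (int64_t)y;
    uint64_t bits = (uint64_t)product;  /* split the two's complement bits */
    *msb = (int32_t)(uint32_t)(bits >> 32);
    *lsb = (int32_t)(uint32_t)bits;
}
```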
 GLX Protocol
     None.
 Dependencies on ARB_gpu_shader_fp64
     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
     of implicit conversions supported in the OpenGL Shading Language.  If more
     than one of these extensions is supported, an expression of one type may
     be converted to another type if that conversion is allowed by any of these
     specifications.
     If ARB_gpu_shader_fp64 or a similar extension introducing new data types
     is not supported, the function overloading rule in the GLSL specification
     preferring promotion of an input parameter to a smaller type over a
     larger type is never applicable, as all data types are of the same size.
     That rule and the example referring to "double" should be removed.
 Dependencies on NV_gpu_shader5
     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
     of implicit conversions supported in the OpenGL Shading Language.  If more
     than one of these extensions is supported, an expression of one type may
     be converted to another type if that conversion is allowed by any of these
     specifications.
     If NV_gpu_shader5 is supported, integer data types are supported with four
     different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
     types are supported with three different precisions (16-, 32-, and
     64-bit).  The extension adds the following rule for output parameters,
     which is similar to the one present in this extension for input
     parameters:
 . If the formal parameters in both matches are output parameters, a
           conversion from a type with a larger number of bits per component is
           better than a conversion from a type with a smaller number of bits
           per component.  For example, a conversion from an "int16_t" formal
           parameter type to "int"  is better than one from an "int8_t" formal
           parameter type to "int".
     Such a rule is not provided in this extension because there is no
     combination of types in this extension and ARB_gpu_shader_fp64 where this
     rule has any effect.
 Errors
     None
 New State
     None
 New Implementation Dependent State
     None
 Issues
     (1) What should this extension be called?
       UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
       some sort of a play on that name would be viable.  However, nothing in
       this extension should require SM5 hardware, so such a name would be a
       little misleading and weird.
       Since the primary purpose is to add integer related functions from
       GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
       for now.
     (2) Why is some of the formatting in this extension weird?
       RESOLVED: This extension is formatted to minimize the differences (as
       reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
       specification.
     (3) Should ldexp and frexp be included?
       RESOLVED: Yes.  Few GPUs have native instructions to implement these
       functions.  These are generally implemented using existing GLSL built-in
       functions and the other functions provided by this extension.
     (4) Should umulExtended and imulExtended be included?
       RESOLVED: Yes.  These functions should be implementable on any GPU that
       can support the rest of this extension, but the implementation may be
       complex.  The implementation on a GPU that only supports 32bit x 32bit =
       32bit multiplication would be quite expensive.  However, many GPUs
       (including OpenGL 4.0 GPUs that already support this function) have a
       32bit x 16bit = 48bit multiplier.  The implementation there is only
       trivially more expensive than regular 32bit multiplication.
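
       To give a feel for the expensive case, here is one possible C sketch
       of umulExtended() that uses only multiplies whose result fits in 32
       bits, by splitting each operand into 16-bit halves.  It is an
       assumption about how such a lowering could look, not a description of
       any particular driver; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

void sketch_umul_extended_32(uint32_t x, uint32_t y,
                             uint32_t *msb, uint32_t *lsb)
{
    assert(msb != 0 && lsb != 0);
    uint32_t xl = x & 0xFFFFu, xh = x >> 16;
    uint32_t yl = y & 0xFFFFu, yh = y >> 16;

    /* Four 16x16 partial products; each fits in 32 bits. */
    uint32_t ll = xl * yl;
    uint32_t lh = xl * yh;
    uint32_t hl = xh * yl;
    uint32_t hh = xh * yh;

    /* Sum of everything landing in bits 16..31; at most 3 * 0xFFFF, so it
     * cannot overflow.  Its upper part is the carry into the high word. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

       The four multiplies plus the carry handling are what make this path
       noticeably more costly than a single native 32-bit multiply.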
     (5) Should the pack and unpack functions be included?
       RESOLVED: No.  These functions are already available via
       GL_ARB_shading_language_packing.
     (6) Should the "BitsTo" functions be included?
       RESOLVED: No.  These functions are already available via
       GL_ARB_shader_bit_encoding.
 Revision History
     Rev.      Date     Author    Changes
     ----  -----------  --------  -----------------------------------------
       2    7-Jul-2016  idr       Fix typo in #extension line
       1   20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.

0

src/egl/docs/EGL_MESA_screen_surface → docs/specs/OLD/EGL_MESA_screen_surface.txt

41

docs/specs/enums.txt

@@ -1,10 +1,18 @@
 The definitive source for enum values and reserved ranges are the XML files in
 the Khronos registry:
 See the OpenGL ARB enum registry at http://www.opengl.org/registry/api/enum.spec
     https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/egl.xml
     https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/gl.xml
     https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/glx.xml
     https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/wgl.xml
 Blocks allocated to Mesa:
 GL blocks allocated to Mesa:
 	0x8750-0x875F
 	0x8BB0-0x8BBF
 EGL blocks allocated to Mesa:
 	0x31D0-0x31DF
 	0x3290-0x329F
 GL_MESA_packed_depth_stencil
 	GL_DEPTH_STENCIL_MESA            0x8750
@@ -13,7 +21,7 @@ GL_MESA_packed_depth_stencil
 	GL_UNSIGNED_SHORT_15_1_MESA      0x8753
 	GL_UNSIGNED_SHORT_1_15_REV_MESA  0x8754
 GL_MESA_trace.spec:
 GL_MESA_trace:
 	GL_TRACE_ALL_BITS_MESA           0xFFFF
 	GL_TRACE_OPERATIONS_BIT_MESA     0x0001
 	GL_TRACE_PRIMITIVES_BIT_MESA     0x0002
@@ -24,12 +32,12 @@ GL_MESA_trace.spec:
 	GL_TRACE_MASK_MESA               0x8755
 	GL_TRACE_NAME_MESA               0x8756
 MESA_ycbcr_texture.spec:
 GL_MESA_ycbcr_texture:
 	GL_YCBCR_MESA                    0x8757
 	GL_UNSIGNED_SHORT_8_8_MESA       0x85BA /* same as Apple's */
 	GL_UNSIGNED_SHORT_8_8_REV_MESA   0x85BB /* same as Apple's */
 GL_MESA_pack_invert.spec
 GL_MESA_pack_invert:
 	GL_PACK_INVERT_MESA              0x8758
 GL_MESA_shader_debug.spec: (obsolete)
@@ -37,7 +45,7 @@ GL_MESA_shader_debug.spec: (obsolete)
         GL_DEBUG_PRINT_MESA              0x875A
         GL_DEBUG_ASSERT_MESA             0x875B
 GL_MESA_program_debug.spec: (obsolete)
 GL_MESA_program_debug: (obsolete)
 	GL_FRAGMENT_PROGRAM_CALLBACK_MESA      0x????
 	GL_VERTEX_PROGRAM_CALLBACK_MESA        0x????
 	GL_FRAGMENT_PROGRAM_POSITION_MESA      0x????
@@ -55,3 +63,24 @@ GL_MESAX_texture_stack:
 	GL_TEXTURE_1D_STACK_BINDING_MESAX    0x875D
 	GL_TEXTURE_2D_STACK_BINDING_MESAX    0x875E
 EGL_MESA_drm_image
         EGL_DRM_BUFFER_FORMAT_MESA		0x31D0
         EGL_DRM_BUFFER_USE_MESA			0x31D1
         EGL_DRM_BUFFER_FORMAT_ARGB32_MESA	0x31D2
         EGL_DRM_BUFFER_MESA			0x31D3
         EGL_DRM_BUFFER_STRIDE_MESA		0x31D4
 EGL_MESA_platform_gbm
         EGL_PLATFORM_GBM_MESA                   0x31D7
 EGL_MESA_platform_surfaceless
         EGL_PLATFORM_SURFACELESS_MESA           0x31DD
 EGL_WL_bind_wayland_display
         EGL_TEXTURE_FORMAT                      0x3080
         EGL_WAYLAND_BUFFER_WL                   0x31D5
         EGL_WAYLAND_PLANE_WL                    0x31D6
         EGL_TEXTURE_Y_U_V_WL                    0x31D7
         EGL_TEXTURE_Y_UV_WL                     0x31D8
         EGL_TEXTURE_Y_XUXV_WL                   0x31D9
         EGL_WAYLAND_Y_INVERTED_WL               0x31DB

									
										2

docs/xlibdriver.html
									
				@@ -199,7 +199,7 @@ This incurs a small performance penalty.

				<h2>Extensions</h2>

				<p>

				The following MESA-specific extensions are implemented in the Xlib driver.

				The following Mesa-specific extensions are implemented in the Xlib driver.

				</p>

				<h3>GLX_MESA_pixmap_colormap</h3>

									
										2

include/D3D9/.editorconfig
									
										Normal file
									
				@@ -0,0 +1,2 @@

				[*.h]

				indent_style = tab

									
										121

include/EGL/eglext.h
									
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -38,7 +38,7 @@ extern "C" {

				#include <EGL/eglplatform.h>

				#define EGL_EGLEXT_VERSION 20150508

				#define EGL_EGLEXT_VERSION 20160809

				/* Generated C header for:

				 * API: egl

				@@ -99,6 +99,33 @@ EGLAPI EGLSyncKHR EGLAPIENTRY eglCreateSync64KHR (EGLDisplay dpy, EGLenum type,

				#define EGL_CONTEXT_OPENGL_NO_ERROR_KHR   0x31B3

				#endif /* EGL_KHR_create_context_no_error */

				#ifndef EGL_KHR_debug

				#define EGL_KHR_debug 1

				typedef void *EGLLabelKHR;

				typedef void *EGLObjectKHR;

				typedef void (EGLAPIENTRY  *EGLDEBUGPROCKHR)(EGLenum error,const char *command,EGLint messageType,EGLLabelKHR threadLabel,EGLLabelKHR objectLabel,const char* message);

				#define EGL_OBJECT_THREAD_KHR             0x33B0

				#define EGL_OBJECT_DISPLAY_KHR            0x33B1

				#define EGL_OBJECT_CONTEXT_KHR            0x33B2

				#define EGL_OBJECT_SURFACE_KHR            0x33B3

				#define EGL_OBJECT_IMAGE_KHR              0x33B4

				#define EGL_OBJECT_SYNC_KHR               0x33B5

				#define EGL_OBJECT_STREAM_KHR             0x33B6

				#define EGL_DEBUG_MSG_CRITICAL_KHR        0x33B9

				#define EGL_DEBUG_MSG_ERROR_KHR           0x33BA

				#define EGL_DEBUG_MSG_WARN_KHR            0x33BB

				#define EGL_DEBUG_MSG_INFO_KHR            0x33BC

				#define EGL_DEBUG_CALLBACK_KHR            0x33B8

				typedef EGLint (EGLAPIENTRYP PFNEGLDEBUGMESSAGECONTROLKHRPROC) (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDEBUGKHRPROC) (EGLint attribute, EGLAttrib *value);

				typedef EGLint (EGLAPIENTRYP PFNEGLLABELOBJECTKHRPROC) (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLint EGLAPIENTRY eglDebugMessageControlKHR (EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDebugKHR (EGLint attribute, EGLAttrib *value);

				EGLAPI EGLint EGLAPIENTRY eglLabelObjectKHR (EGLDisplay display, EGLenum objectType, EGLObjectKHR object, EGLLabelKHR label);

				#endif

				#endif /* EGL_KHR_debug */

				#ifndef EGL_KHR_fence_sync

				#define EGL_KHR_fence_sync 1

				typedef khronos_utime_nanoseconds_t EGLTimeKHR;

				@@ -223,6 +250,16 @@ EGLAPI EGLBoolean EGLAPIENTRY eglQuerySurface64KHR (EGLDisplay dpy, EGLSurface s

				#endif

				#endif /* EGL_KHR_lock_surface3 */

				#ifndef EGL_KHR_mutable_render_buffer

				#define EGL_KHR_mutable_render_buffer 1

				#define EGL_MUTABLE_RENDER_BUFFER_BIT_KHR 0x1000

				#endif /* EGL_KHR_mutable_render_buffer */

				#ifndef EGL_KHR_no_config_context

				#define EGL_KHR_no_config_context 1

				#define EGL_NO_CONFIG_KHR                 ((EGLConfig)0)

				#endif /* EGL_KHR_no_config_context */

				#ifndef EGL_KHR_partial_update

				#define EGL_KHR_partial_update 1

				#define EGL_BUFFER_AGE_KHR                0x313D

				@@ -402,11 +439,28 @@ EGLAPI void EGLAPIENTRY eglSetBlobCacheFuncsANDROID (EGLDisplay dpy, EGLSetBlobF

				#endif

				#endif /* EGL_ANDROID_blob_cache */

				#ifndef EGL_ANDROID_create_native_client_buffer

				#define EGL_ANDROID_create_native_client_buffer 1

				#define EGL_NATIVE_BUFFER_USAGE_ANDROID   0x3143

				#define EGL_NATIVE_BUFFER_USAGE_PROTECTED_BIT_ANDROID 0x00000001

				#define EGL_NATIVE_BUFFER_USAGE_RENDERBUFFER_BIT_ANDROID 0x00000002

				#define EGL_NATIVE_BUFFER_USAGE_TEXTURE_BIT_ANDROID 0x00000004

				typedef EGLClientBuffer (EGLAPIENTRYP PFNEGLCREATENATIVECLIENTBUFFERANDROIDPROC) (const EGLint *attrib_list);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLClientBuffer EGLAPIENTRY eglCreateNativeClientBufferANDROID (const EGLint *attrib_list);

				#endif

				#endif /* EGL_ANDROID_create_native_client_buffer */

				#ifndef EGL_ANDROID_framebuffer_target

				#define EGL_ANDROID_framebuffer_target 1

				#define EGL_FRAMEBUFFER_TARGET_ANDROID    0x3147

				#endif /* EGL_ANDROID_framebuffer_target */

				#ifndef EGL_ANDROID_front_buffer_auto_refresh

				#define EGL_ANDROID_front_buffer_auto_refresh 1

				#define EGL_FRONT_BUFFER_AUTO_REFRESH_ANDROID 0x314C

				#endif /* EGL_ANDROID_front_buffer_auto_refresh */

				#ifndef EGL_ANDROID_image_native_buffer

				#define EGL_ANDROID_image_native_buffer 1

				#define EGL_NATIVE_BUFFER_ANDROID         0x3140

				@@ -424,6 +478,15 @@ EGLAPI EGLint EGLAPIENTRY eglDupNativeFenceFDANDROID (EGLDisplay dpy, EGLSyncKHR

				#endif

				#endif /* EGL_ANDROID_native_fence_sync */

				#ifndef EGL_ANDROID_presentation_time

				#define EGL_ANDROID_presentation_time 1

				typedef khronos_stime_nanoseconds_t EGLnsecsANDROID;

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLPRESENTATIONTIMEANDROIDPROC) (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglPresentationTimeANDROID (EGLDisplay dpy, EGLSurface surface, EGLnsecsANDROID time);

				#endif

				#endif /* EGL_ANDROID_presentation_time */

				#ifndef EGL_ANDROID_recordable

				#define EGL_ANDROID_recordable 1

				#define EGL_RECORDABLE_ANDROID            0x3142

				@@ -616,9 +679,13 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePlatformPixmapSurfaceEXT (EGLDisplay dpy,

				#define EGL_PLATFORM_X11_SCREEN_EXT       0x31D6

				#endif /* EGL_EXT_platform_x11 */

				#ifndef EGL_EXT_protected_content

				#define EGL_EXT_protected_content 1

				#define EGL_PROTECTED_CONTENT_EXT         0x32C0

				#endif /* EGL_EXT_protected_content */

				#ifndef EGL_EXT_protected_surface

				#define EGL_EXT_protected_surface 1

				#define EGL_PROTECTED_CONTENT_EXT         0x32C0

				#endif /* EGL_EXT_protected_surface */

				#ifndef EGL_EXT_stream_consumer_egloutput

				@@ -697,6 +764,12 @@ EGLAPI EGLSurface EGLAPIENTRY eglCreatePixmapSurfaceHI (EGLDisplay dpy, EGLConfi

				#define EGL_CONTEXT_PRIORITY_LOW_IMG      0x3103

				#endif /* EGL_IMG_context_priority */

				#ifndef EGL_IMG_image_plane_attribs

				#define EGL_IMG_image_plane_attribs 1

				#define EGL_NATIVE_BUFFER_MULTIPLANE_SEPARATE_IMG 0x3105

				#define EGL_NATIVE_BUFFER_PLANE_OFFSET_IMG 0x3106

				#endif /* EGL_IMG_image_plane_attribs */

				#ifndef EGL_MESA_drm_image

				#define EGL_MESA_drm_image 1

				#define EGL_DRM_BUFFER_FORMAT_MESA        0x31D0

				@@ -812,6 +885,48 @@ EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV (EGLDisplay dpy, EGLSurface sur

				#endif

				#endif /* EGL_NV_post_sub_buffer */

				#ifndef EGL_NV_robustness_video_memory_purge

				#define EGL_NV_robustness_video_memory_purge 1

				#define EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x334C

				#endif /* EGL_NV_robustness_video_memory_purge */

				#ifndef EGL_NV_stream_consumer_gltexture_yuv

				#define EGL_NV_stream_consumer_gltexture_yuv 1

				#define EGL_YUV_PLANE0_TEXTURE_UNIT_NV    0x332C

				#define EGL_YUV_PLANE1_TEXTURE_UNIT_NV    0x332D

				#define EGL_YUV_PLANE2_TEXTURE_UNIT_NV    0x332E

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSTREAMCONSUMERGLTEXTUREEXTERNALATTRIBSNVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglStreamConsumerGLTextureExternalAttribsNV (EGLDisplay dpy, EGLStreamKHR stream, EGLAttrib *attrib_list);

				#endif

				#endif /* EGL_NV_stream_consumer_gltexture_yuv */

				#ifndef EGL_NV_stream_metadata

				#define EGL_NV_stream_metadata 1

				#define EGL_MAX_STREAM_METADATA_BLOCKS_NV 0x3250

				#define EGL_MAX_STREAM_METADATA_BLOCK_SIZE_NV 0x3251

				#define EGL_MAX_STREAM_METADATA_TOTAL_SIZE_NV 0x3252

				#define EGL_PRODUCER_METADATA_NV          0x3253

				#define EGL_CONSUMER_METADATA_NV          0x3254

				#define EGL_PENDING_METADATA_NV           0x3328

				#define EGL_METADATA0_SIZE_NV             0x3255

				#define EGL_METADATA1_SIZE_NV             0x3256

				#define EGL_METADATA2_SIZE_NV             0x3257

				#define EGL_METADATA3_SIZE_NV             0x3258

				#define EGL_METADATA0_TYPE_NV             0x3259

				#define EGL_METADATA1_TYPE_NV             0x325A

				#define EGL_METADATA2_TYPE_NV             0x325B

				#define EGL_METADATA3_TYPE_NV             0x325C

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYDISPLAYATTRIBNVPROC) (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLSETSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);

				typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYSTREAMMETADATANVPROC) (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryDisplayAttribNV (EGLDisplay dpy, EGLint attribute, EGLAttrib *value);

				EGLAPI EGLBoolean EGLAPIENTRY eglSetStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLint n, EGLint offset, EGLint size, const void *data);

				EGLAPI EGLBoolean EGLAPIENTRY eglQueryStreamMetadataNV (EGLDisplay dpy, EGLStreamKHR stream, EGLenum name, EGLint n, EGLint offset, EGLint size, void *data);

				#endif

				#endif /* EGL_NV_stream_metadata */

				#ifndef EGL_NV_stream_sync

				#define EGL_NV_stream_sync 1

				#define EGL_SYNC_NEW_FRAME_NV             0x321F

									
										5

include/EGL/eglmesaext.h
									
				@@ -84,6 +84,11 @@ typedef EGLBoolean (EGLAPIENTRYP PFNEGLSWAPBUFFERSREGIONNOK) (EGLDisplay dpy, EG

				#define EGL_NO_CONFIG_MESA			((EGLConfig)0)

				#endif

				#ifndef EGL_MESA_platform_surfaceless

				#define EGL_MESA_platform_surfaceless 1

				#define EGL_PLATFORM_SURFACELESS_MESA           0x31DD

				#endif /* EGL_MESA_platform_surfaceless */

				#ifdef __cplusplus

				}

				#endif

									
										9

include/GL/glext.h
									
				@@ -33,7 +33,7 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 32957 $ on $Date: 2016-06-09 17:03:08 -0400 (Thu, 09 Jun 2016) $

				** Khronos $Revision: 33061 $ on $Date: 2016-07-14 20:14:13 -0400 (Thu, 14 Jul 2016) $

				*/

				#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)

				@@ -53,7 +53,7 @@ extern "C" {

				#define GLAPI extern

				#endif

				#define GL_GLEXT_VERSION 20160609

				#define GL_GLEXT_VERSION 20160714

				/* Generated C header for:

				 * API: gl

				@@ -8836,6 +8836,11 @@ GLAPI void APIENTRY glBlendFuncSeparateINGR (GLenum sfactorRGB, GLenum dfactorRG

				#define GL_INTERLACE_READ_INGR            0x8568

				#endif /* GL_INGR_interlace_read */

				#ifndef GL_INTEL_conservative_rasterization

				#define GL_INTEL_conservative_rasterization 1

				#define GL_CONSERVATIVE_RASTERIZATION_INTEL 0x83FE

				#endif /* GL_INTEL_conservative_rasterization */

				#ifndef GL_INTEL_fragment_shader_ordering

				#define GL_INTEL_fragment_shader_ordering 1

				#endif /* GL_INTEL_fragment_shader_ordering */

									
										36

include/GL/glxext.h
									
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,10 +33,10 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $

				** Khronos $Revision: 32889 $ on $Date: 2016-05-31 07:09:51 -0400 (Tue, 31 May 2016) $

				*/

				#define GLX_GLXEXT_VERSION 20140810

				#define GLX_GLXEXT_VERSION 20160531

				/* Generated C header for:

				 * API: glx

				@@ -250,6 +250,26 @@ __GLXextFuncPtr glXGetProcAddressARB (const GLubyte *procName);

				#define GLX_GPU_NUM_SIMD_AMD              0x21A6

				#define GLX_GPU_NUM_RB_AMD                0x21A7

				#define GLX_GPU_NUM_SPI_AMD               0x21A8

				typedef unsigned int ( *PFNGLXGETGPUIDSAMDPROC) (unsigned int maxCount, unsigned int *ids);

				typedef int ( *PFNGLXGETGPUINFOAMDPROC) (unsigned int id, int property, GLenum dataType, unsigned int size, void *data);

				typedef unsigned int ( *PFNGLXGETCONTEXTGPUIDAMDPROC) (GLXContext ctx);

				typedef GLXContext ( *PFNGLXCREATEASSOCIATEDCONTEXTAMDPROC) (unsigned int id, GLXContext share_list);

				typedef GLXContext ( *PFNGLXCREATEASSOCIATEDCONTEXTATTRIBSAMDPROC) (unsigned int id, GLXContext share_context, const int *attribList);

				typedef Bool ( *PFNGLXDELETEASSOCIATEDCONTEXTAMDPROC) (GLXContext ctx);

				typedef Bool ( *PFNGLXMAKEASSOCIATEDCONTEXTCURRENTAMDPROC) (GLXContext ctx);

				typedef GLXContext ( *PFNGLXGETCURRENTASSOCIATEDCONTEXTAMDPROC) (void);

				typedef void ( *PFNGLXBLITCONTEXTFRAMEBUFFERAMDPROC) (GLXContext dstCtx, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				#ifdef GLX_GLXEXT_PROTOTYPES

				unsigned int glXGetGPUIDsAMD (unsigned int maxCount, unsigned int *ids);

				int glXGetGPUInfoAMD (unsigned int id, int property, GLenum dataType, unsigned int size, void *data);

				unsigned int glXGetContextGPUIDAMD (GLXContext ctx);

				GLXContext glXCreateAssociatedContextAMD (unsigned int id, GLXContext share_list);

				GLXContext glXCreateAssociatedContextAttribsAMD (unsigned int id, GLXContext share_context, const int *attribList);

				Bool glXDeleteAssociatedContextAMD (GLXContext ctx);

				Bool glXMakeAssociatedContextCurrentAMD (GLXContext ctx);

				GLXContext glXGetCurrentAssociatedContextAMD (void);

				void glXBlitContextFramebufferAMD (GLXContext dstCtx, GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				#endif

				#endif /* GLX_AMD_gpu_association */

				#ifndef GLX_EXT_buffer_age

				@@ -297,6 +317,11 @@ void glXFreeContextEXT (Display *dpy, GLXContext context);

				#endif

				#endif /* GLX_EXT_import_context */

				#ifndef GLX_EXT_libglvnd

				#define GLX_EXT_libglvnd 1

				#define GLX_VENDOR_NAMES_EXT              0x20F6

				#endif /* GLX_EXT_libglvnd */

				#ifndef GLX_EXT_stereo_tree

				#define GLX_EXT_stereo_tree 1

				typedef struct {

				@@ -523,6 +548,11 @@ int glXBindVideoDeviceNV (Display *dpy, unsigned int video_slot, unsigned int vi

				#endif

				#endif /* GLX_NV_present_video */

				#ifndef GLX_NV_robustness_video_memory_purge

				#define GLX_NV_robustness_video_memory_purge 1

				#define GLX_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV 0x20F7

				#endif /* GLX_NV_robustness_video_memory_purge */

				#ifndef GLX_NV_swap_group

				#define GLX_NV_swap_group 1

				typedef Bool ( *PFNGLXJOINSWAPGROUPNVPROC) (Display *dpy, GLXDrawable drawable, GLuint group);

									
										4

include/GL/internal/dri_interface.h
									
												
				@@ -1094,7 +1094,7 @@ struct __DRIdri2ExtensionRec {

				 * extensions.

				 */

				#define __DRI_IMAGE "DRI_IMAGE"

				#define __DRI_IMAGE_VERSION 12

				#define __DRI_IMAGE_VERSION 13

				/**

				 * These formats correspond to the similarly named MESA_FORMAT_*

				@@ -1208,6 +1208,8 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_ATTRIB_FOURCC       0x2008 /* available in versions 11 */

				#define __DRI_IMAGE_ATTRIB_NUM_PLANES   0x2009 /* available in versions 11 */

				#define __DRI_IMAGE_ATTRIB_OFFSET 0x200A /* available in versions 13 */

				enum __DRIYUVColorSpace {

				   __DRI_YUV_COLOR_SPACE_UNDEFINED = 0,

				   __DRI_YUV_COLOR_SPACE_ITU_REC601 = 0x327F,

									
										18

include/GL/mesa_glinterop.h
									
												
				@@ -58,8 +58,8 @@ extern "C" {

				#endif

				/* Forward declarations to avoid inclusion of GL/glx.h */

				typedef struct _XDisplay Display;

				typedef struct __GLXcontextRec *GLXContext;

				struct _XDisplay;

				struct __GLXcontextRec;

				/* Forward declarations to avoid inclusion of EGL/egl.h */

				typedef void *EGLDisplay;

				@@ -97,7 +97,7 @@ struct mesa_glinterop_device_info {

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    * by the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_DEVICE_INFO_VERSION macro */

				   uint32_t version;

				@@ -125,7 +125,7 @@ struct mesa_glinterop_export_in {

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    * by the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_IN_VERSION macro */

				   uint32_t version;

				@@ -190,7 +190,7 @@ struct mesa_glinterop_export_out {

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    * by the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_OUT_VERSION macro */

				   uint32_t version;

				@@ -246,7 +246,7 @@ struct mesa_glinterop_export_out {

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXQueryDeviceInfo(Display *dpy, GLXContext context,

				MesaGLInteropGLXQueryDeviceInfo(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                struct mesa_glinterop_device_info *out);

				@@ -271,7 +271,7 @@ MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXExportObject(Display *dpy, GLXContext context,

				MesaGLInteropGLXExportObject(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                             struct mesa_glinterop_export_in *in,

				                             struct mesa_glinterop_export_out *out);

				@@ -286,11 +286,11 @@ MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,

				                             struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(Display *dpy, GLXContext context,

				typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPEGLQUERYDEVICEINFOPROC)(EGLDisplay dpy, EGLContext context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(Display *dpy, GLXContext context,

				typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(struct _XDisplay *dpy, struct __GLXcontextRec *context,

				                                                  struct mesa_glinterop_export_in *in,

				                                                  struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPEGLEXPORTOBJECTPROC)(EGLDisplay dpy, EGLContext context,
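
Review note: the `version` comments in the hunks above describe a negotiation in which the callee may lower the value and the caller must then read only the fields valid for the reported version. The following is a minimal sketch of that pattern, not Mesa code. `struct demo_info`, `DEMO_CALLEE_VERSION`, `demo_query` and `demo_read_v2_or_default` are hypothetical names invented for illustration; only the role of the `version` field is taken from the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a versioned struct such as
 * mesa_glinterop_device_info; only the `version` member mirrors the header. */
struct demo_info {
   uint32_t version;   /* in: caller's version, out: negotiated version */
   uint32_t field_v1;  /* made-up field, valid from version 1 */
   uint32_t field_v2;  /* made-up field, valid from version 2 */
};

#define DEMO_CALLEE_VERSION 1u /* assumption: this callee only knows v1 */

/* Stand-in callee: lowers `version` if it supports less than requested,
 * then fills only the fields defined up to the negotiated version. */
static int demo_query(struct demo_info *out)
{
   if (out == NULL)
      return 1;
   if (out->version > DEMO_CALLEE_VERSION)
      out->version = DEMO_CALLEE_VERSION;
   if (out->version >= 1)
      out->field_v1 = 42;
   if (out->version >= 2)
      out->field_v2 = 7;
   return 0;
}

/* Caller side: request v2, then touch v2 fields only if the callee
 * reported at least v2; otherwise fall back. */
static uint32_t demo_read_v2_or_default(uint32_t fallback)
{
   struct demo_info info = { .version = 2 };
   if (demo_query(&info) != 0)
      return fallback;
   return info.version >= 2 ? info.field_v2 : fallback;
}
```

Because the stand-in callee reports version 1, the caller never reads `field_v2` and returns its fallback instead. That is the behaviour the header comment asks callers to implement.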

									
										6

include/GL/wglext.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,7 +33,7 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 27684 $ on $Date: 2014-08-11 01:21:35 -0700 (Mon, 11 Aug 2014) $

				** Khronos $Revision: 32686 $ on $Date: 2016-04-19 21:08:44 -0400 (Tue, 19 Apr 2016) $

				*/

				#if defined(_WIN32) && !defined(APIENTRY) && !defined(__CYGWIN__) && !defined(__SCITECH_SNAP__)

				@@ -41,7 +41,7 @@ extern "C" {

				#include <windows.h>

				#endif

				#define WGL_WGLEXT_VERSION 20140810

				#define WGL_WGLEXT_VERSION 20160419

				/* Generated C header for:

				 * API: wgl

									
										152

include/GLES2/gl2.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,12 +33,16 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 24614 $ on $Date: 2013-12-30 04:44:46 -0800 (Mon, 30 Dec 2013) $

				** Khronos $Revision: 32749 $ on $Date: 2016-04-28 09:03:03 -0700 (Thu, 28 Apr 2016) $

				*/

				#include <GLES2/gl2platform.h>

				/* Generated on date 20131230 */

				#ifndef GL_APIENTRYP

				#define GL_APIENTRYP GL_APIENTRY*

				#endif

				/* Generated on date 20160428 */

				/* Generated C header for:

				 * API: gles2

				@@ -374,6 +378,148 @@ typedef khronos_uint8_t GLubyte;

				#define GL_RENDERBUFFER_BINDING           0x8CA7

				#define GL_MAX_RENDERBUFFER_SIZE          0x84E8

				#define GL_INVALID_FRAMEBUFFER_OPERATION  0x0506

				typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);

				typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);

				typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);

				typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);

				typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);

				typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);

				typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);

				typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);

				typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);

				typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);

				typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);

				typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);

				typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);

				typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);

				typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);

				typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);

				typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);

				typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);

				typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);

				typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);

				typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);

				typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);

				typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);

				typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);

				typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);

				typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);

				typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);

				typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);

				typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);

				typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);

				typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);

				typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);

				typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);

				typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);

				typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);

				typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);

				typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);

				typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);

				typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);

				GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);

				GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);
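
Review note: the `PFNGL...PROC` typedefs added in this hunk give core GLES2 entry points the same function-pointer types that extensions already have, which helps code that resolves functions at run time. Below is a small sketch of how such a typedef is typically used. The scalar type, the empty `GL_APIENTRY` and the `demo_*` functions are stand-ins assumed here so the snippet compiles without the GLES headers; a real loader would more likely obtain the pointer from something like `eglGetProcAddress`.

```c
#include <stddef.h>

/* Stand-ins so the sketch builds without GLES headers (assumptions). */
typedef unsigned int GLenum;
#define GL_APIENTRY
#ifndef GL_APIENTRYP
#define GL_APIENTRYP GL_APIENTRY*
#endif
#define GL_TEXTURE0 0x84C0

/* Same shape as the typedef added to gl2.h. */
typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);

static GLenum demo_last_texture; /* records what the fake entry point saw */

/* Hypothetical implementation standing in for the driver's glActiveTexture. */
static void GL_APIENTRY demo_active_texture(GLenum texture)
{
   demo_last_texture = texture;
}

/* Hypothetical loader returning the entry point as a typed pointer. */
static PFNGLACTIVETEXTUREPROC demo_load_active_texture(void)
{
   return demo_active_texture;
}

/* Resolve once, check for NULL, then call through the pointer. */
static GLenum demo_call_through_pointer(void)
{
   PFNGLACTIVETEXTUREPROC active_texture = demo_load_active_texture();
   if (active_texture != NULL)
      active_texture(GL_TEXTURE0);
   return demo_last_texture;
}
```

The moved `GL_APIENTRYP` definition matters for the same reason: the typedefs expand through it, so it has to be defined before the first `PFNGL...PROC` line in `gl2.h`.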

									
										262

include/GLES2/gl2ext.h
									
												
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2015 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,14 +33,14 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 32120 $ on $Date: 2015-10-15 04:27:13 -0700 (Thu, 15 Oct 2015) $

				** Khronos $Revision: 33080 $ on $Date: 2016-08-05 04:09:22 -0700 (Fri, 05 Aug 2016) $

				*/

				#ifndef GL_APIENTRYP

				#define GL_APIENTRYP GL_APIENTRY*

				#endif

				/* Generated on date 20151015 */

				/* Generated on date 20160805 */

				/* Generated C header for:

				 * API: gles2

				@@ -52,6 +52,10 @@ extern "C" {

				 * Extensions removed: _nomatch_^

				 */

				#ifndef GL_ARB_sparse_texture2

				#define GL_ARB_sparse_texture2 1

				#endif /* GL_ARB_sparse_texture2 */

				#ifndef GL_KHR_blend_equation_advanced

				#define GL_KHR_blend_equation_advanced 1

				#define GL_MULTIPLY_KHR                   0x9294

				@@ -752,6 +756,34 @@ GL_APICALL GLboolean GL_APIENTRY glIsVertexArrayOES (GLuint array);

				#define GL_INT_10_10_10_2_OES             0x8DF7

				#endif /* GL_OES_vertex_type_10_10_10_2 */

				#ifndef GL_OES_viewport_array

				#define GL_OES_viewport_array 1

				#define GL_MAX_VIEWPORTS_OES              0x825B

				#define GL_VIEWPORT_SUBPIXEL_BITS_OES     0x825C

				#define GL_VIEWPORT_BOUNDS_RANGE_OES      0x825D

				#define GL_VIEWPORT_INDEX_PROVOKING_VERTEX_OES 0x825F

				typedef void (GL_APIENTRYP PFNGLVIEWPORTARRAYVOESPROC) (GLuint first, GLsizei count, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVIEWPORTINDEXEDFOESPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat w, GLfloat h);

				typedef void (GL_APIENTRYP PFNGLVIEWPORTINDEXEDFVOESPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLSCISSORARRAYVOESPROC) (GLuint first, GLsizei count, const GLint *v);

				typedef void (GL_APIENTRYP PFNGLSCISSORINDEXEDOESPROC) (GLuint index, GLint left, GLint bottom, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSCISSORINDEXEDVOESPROC) (GLuint index, const GLint *v);

				typedef void (GL_APIENTRYP PFNGLDEPTHRANGEARRAYFVOESPROC) (GLuint first, GLsizei count, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLDEPTHRANGEINDEXEDFOESPROC) (GLuint index, GLfloat n, GLfloat f);

				typedef void (GL_APIENTRYP PFNGLGETFLOATI_VOESPROC) (GLenum target, GLuint index, GLfloat *data);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glViewportArrayvOES (GLuint first, GLsizei count, const GLfloat *v);

				GL_APICALL void GL_APIENTRY glViewportIndexedfOES (GLuint index, GLfloat x, GLfloat y, GLfloat w, GLfloat h);

				GL_APICALL void GL_APIENTRY glViewportIndexedfvOES (GLuint index, const GLfloat *v);

				GL_APICALL void GL_APIENTRY glScissorArrayvOES (GLuint first, GLsizei count, const GLint *v);

				GL_APICALL void GL_APIENTRY glScissorIndexedOES (GLuint index, GLint left, GLint bottom, GLsizei width, GLsizei height);

				GL_APICALL void GL_APIENTRY glScissorIndexedvOES (GLuint index, const GLint *v);

				GL_APICALL void GL_APIENTRY glDepthRangeArrayfvOES (GLuint first, GLsizei count, const GLfloat *v);

				GL_APICALL void GL_APIENTRY glDepthRangeIndexedfOES (GLuint index, GLfloat n, GLfloat f);

				GL_APICALL void GL_APIENTRY glGetFloati_vOES (GLenum target, GLuint index, GLfloat *data);

				#endif

				#endif /* GL_OES_viewport_array */

				#ifndef GL_AMD_compressed_3DC_texture

				#define GL_AMD_compressed_3DC_texture 1

				#define GL_3DC_X_AMD                      0x87F9

				@@ -1086,6 +1118,21 @@ GL_APICALL void GL_APIENTRY glBufferStorageEXT (GLenum target, GLsizeiptr size,

				#endif

				#endif /* GL_EXT_buffer_storage */

				#ifndef GL_EXT_clip_cull_distance

				#define GL_EXT_clip_cull_distance 1

				#define GL_MAX_CLIP_DISTANCES_EXT         0x0D32

				#define GL_MAX_CULL_DISTANCES_EXT         0x82F9

				#define GL_MAX_COMBINED_CLIP_AND_CULL_DISTANCES_EXT 0x82FA

				#define GL_CLIP_DISTANCE0_EXT             0x3000

				#define GL_CLIP_DISTANCE1_EXT             0x3001

				#define GL_CLIP_DISTANCE2_EXT             0x3002

				#define GL_CLIP_DISTANCE3_EXT             0x3003

				#define GL_CLIP_DISTANCE4_EXT             0x3004

				#define GL_CLIP_DISTANCE5_EXT             0x3005

				#define GL_CLIP_DISTANCE6_EXT             0x3006

				#define GL_CLIP_DISTANCE7_EXT             0x3007

				#endif /* GL_EXT_clip_cull_distance */

				#ifndef GL_EXT_color_buffer_float

				#define GL_EXT_color_buffer_float 1

				#endif /* GL_EXT_color_buffer_float */

				@@ -1412,6 +1459,15 @@ GL_APICALL void GL_APIENTRY glGetIntegeri_vEXT (GLenum target, GLuint index, GLi

				#define GL_ANY_SAMPLES_PASSED_CONSERVATIVE_EXT 0x8D6A

				#endif /* GL_EXT_occlusion_query_boolean */

				#ifndef GL_EXT_polygon_offset_clamp

				#define GL_EXT_polygon_offset_clamp 1

				#define GL_POLYGON_OFFSET_CLAMP_EXT       0x8E1B

				typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETCLAMPEXTPROC) (GLfloat factor, GLfloat units, GLfloat clamp);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glPolygonOffsetClampEXT (GLfloat factor, GLfloat units, GLfloat clamp);

				#endif

				#endif /* GL_EXT_polygon_offset_clamp */

				#ifndef GL_EXT_post_depth_coverage

				#define GL_EXT_post_depth_coverage 1

				#endif /* GL_EXT_post_depth_coverage */

				@@ -1425,6 +1481,12 @@ GL_APICALL void GL_APIENTRY glPrimitiveBoundingBoxEXT (GLfloat minX, GLfloat min

				#endif

				#endif /* GL_EXT_primitive_bounding_box */

				#ifndef GL_EXT_protected_textures

				#define GL_EXT_protected_textures 1

				#define GL_CONTEXT_FLAG_PROTECTED_CONTENT_BIT_EXT 0x00000010

				#define GL_TEXTURE_PROTECTED_EXT          0x8BFA

				#endif /* GL_EXT_protected_textures */

				#ifndef GL_EXT_pvrtc_sRGB

				#define GL_EXT_pvrtc_sRGB 1

				#define GL_COMPRESSED_SRGB_PVRTC_2BPPV1_EXT 0x8A54

				@@ -1604,6 +1666,10 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin

				#define GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT 0x8A52

				#endif /* GL_EXT_shader_framebuffer_fetch */

				#ifndef GL_EXT_shader_group_vote

				#define GL_EXT_shader_group_vote 1

				#endif /* GL_EXT_shader_group_vote */

				#ifndef GL_EXT_shader_implicit_conversions

				#define GL_EXT_shader_implicit_conversions 1

				#endif /* GL_EXT_shader_implicit_conversions */

				@@ -1616,6 +1682,10 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin

				#define GL_EXT_shader_io_blocks 1

				#endif /* GL_EXT_shader_io_blocks */

				#ifndef GL_EXT_shader_non_constant_global_initializers

				#define GL_EXT_shader_non_constant_global_initializers 1

				#endif /* GL_EXT_shader_non_constant_global_initializers */

				#ifndef GL_EXT_shader_pixel_local_storage

				#define GL_EXT_shader_pixel_local_storage 1

				#define GL_MAX_SHADER_PIXEL_LOCAL_STORAGE_FAST_SIZE_EXT 0x8F63

				@@ -1623,6 +1693,21 @@ GL_APICALL void GL_APIENTRY glProgramUniformMatrix4x3fvEXT (GLuint program, GLin

				#define GL_SHADER_PIXEL_LOCAL_STORAGE_EXT 0x8F64

				#endif /* GL_EXT_shader_pixel_local_storage */

				#ifndef GL_EXT_shader_pixel_local_storage2

				#define GL_EXT_shader_pixel_local_storage2 1

				#define GL_MAX_SHADER_COMBINED_LOCAL_STORAGE_FAST_SIZE_EXT 0x9650

				#define GL_MAX_SHADER_COMBINED_LOCAL_STORAGE_SIZE_EXT 0x9651

				#define GL_FRAMEBUFFER_INCOMPLETE_INSUFFICIENT_SHADER_COMBINED_LOCAL_STORAGE_EXT 0x9652

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERPIXELLOCALSTORAGESIZEEXTPROC) (GLuint target, GLsizei size);

				typedef GLsizei (GL_APIENTRYP PFNGLGETFRAMEBUFFERPIXELLOCALSTORAGESIZEEXTPROC) (GLuint target);

				typedef void (GL_APIENTRYP PFNGLCLEARPIXELLOCALSTORAGEUIEXTPROC) (GLsizei offset, GLsizei n, const GLuint *values);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glFramebufferPixelLocalStorageSizeEXT (GLuint target, GLsizei size);

				GL_APICALL GLsizei GL_APIENTRY glGetFramebufferPixelLocalStorageSizeEXT (GLuint target);

				GL_APICALL void GL_APIENTRY glClearPixelLocalStorageuiEXT (GLsizei offset, GLsizei n, const GLuint *values);

				#endif

				#endif /* GL_EXT_shader_pixel_local_storage2 */

				#ifndef GL_EXT_shader_texture_lod

				#define GL_EXT_shader_texture_lod 1

				#endif /* GL_EXT_shader_texture_lod */

				@@ -1888,11 +1973,39 @@ GL_APICALL void GL_APIENTRY glTextureViewEXT (GLuint texture, GLenum target, GLu

				#define GL_UNPACK_SKIP_PIXELS_EXT         0x0CF4

				#endif /* GL_EXT_unpack_subimage */

				#ifndef GL_EXT_window_rectangles

				#define GL_EXT_window_rectangles 1

				#define GL_INCLUSIVE_EXT                  0x8F10

				#define GL_EXCLUSIVE_EXT                  0x8F11

				#define GL_WINDOW_RECTANGLE_EXT           0x8F12

				#define GL_WINDOW_RECTANGLE_MODE_EXT      0x8F13

				#define GL_MAX_WINDOW_RECTANGLES_EXT      0x8F14

				#define GL_NUM_WINDOW_RECTANGLES_EXT      0x8F15

				typedef void (GL_APIENTRYP PFNGLWINDOWRECTANGLESEXTPROC) (GLenum mode, GLsizei count, const GLint *box);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glWindowRectanglesEXT (GLenum mode, GLsizei count, const GLint *box);

				#endif

				#endif /* GL_EXT_window_rectangles */

				#ifndef GL_FJ_shader_binary_GCCSO

				#define GL_FJ_shader_binary_GCCSO 1

				#define GL_GCCSO_SHADER_BINARY_FJ         0x9260

				#endif /* GL_FJ_shader_binary_GCCSO */

				#ifndef GL_IMG_framebuffer_downsample

				#define GL_IMG_framebuffer_downsample 1

				#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE_AND_DOWNSAMPLE_IMG 0x913C

				#define GL_NUM_DOWNSAMPLE_SCALES_IMG      0x913D

				#define GL_DOWNSAMPLE_SCALES_IMG          0x913E

				#define GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_SCALE_IMG 0x913F

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DDOWNSAMPLEIMGPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level, GLint xscale, GLint yscale);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERDOWNSAMPLEIMGPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer, GLint xscale, GLint yscale);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glFramebufferTexture2DDownsampleIMG (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level, GLint xscale, GLint yscale);

				GL_APICALL void GL_APIENTRY glFramebufferTextureLayerDownsampleIMG (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer, GLint xscale, GLint yscale);

				#endif

				#endif /* GL_IMG_framebuffer_downsample */

				#ifndef GL_IMG_multisampled_render_to_texture

				#define GL_IMG_multisampled_render_to_texture 1

				#define GL_RENDERBUFFER_SAMPLES_IMG       0x9133

				@@ -1944,6 +2057,11 @@ GL_APICALL void GL_APIENTRY glFramebufferTexture2DMultisampleIMG (GLenum target,

				#define GL_CUBIC_MIPMAP_LINEAR_IMG        0x913B

				#endif /* GL_IMG_texture_filter_cubic */

				#ifndef GL_INTEL_conservative_rasterization

				#define GL_INTEL_conservative_rasterization 1

				#define GL_CONSERVATIVE_RASTERIZATION_INTEL 0x83FE

				#endif /* GL_INTEL_conservative_rasterization */

				#ifndef GL_INTEL_framebuffer_CMAA

				#define GL_INTEL_framebuffer_CMAA 1

				typedef void (GL_APIENTRYP PFNGLAPPLYFRAMEBUFFERATTACHMENTCMAAINTELPROC) (void);

				@@ -2120,6 +2238,17 @@ GL_APICALL void GL_APIENTRY glSubpixelPrecisionBiasNV (GLuint xbits, GLuint ybit

				#endif

				#endif /* GL_NV_conservative_raster */

				#ifndef GL_NV_conservative_raster_pre_snap_triangles

				#define GL_NV_conservative_raster_pre_snap_triangles 1

				#define GL_CONSERVATIVE_RASTER_MODE_NV    0x954D

				#define GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV 0x954E

				#define GL_CONSERVATIVE_RASTER_MODE_PRE_SNAP_TRIANGLES_NV 0x954F

				typedef void (GL_APIENTRYP PFNGLCONSERVATIVERASTERPARAMETERINVPROC) (GLenum pname, GLint param);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glConservativeRasterParameteriNV (GLenum pname, GLint param);

				#endif

				#endif /* GL_NV_conservative_raster_pre_snap_triangles */

				#ifndef GL_NV_copy_buffer

				#define GL_NV_copy_buffer 1

				#define GL_COPY_READ_BUFFER_NV            0x8F36

				@@ -2307,6 +2436,109 @@ GL_APICALL void GL_APIENTRY glRenderbufferStorageMultisampleNV (GLenum target, G

				#define GL_NV_geometry_shader_passthrough 1

				#endif /* GL_NV_geometry_shader_passthrough */

				#ifndef GL_NV_gpu_shader5

				#define GL_NV_gpu_shader5 1

				typedef khronos_int64_t GLint64EXT;

				typedef khronos_uint64_t GLuint64EXT;

				#define GL_INT64_NV                       0x140E

				#define GL_UNSIGNED_INT64_NV              0x140F

				#define GL_INT8_NV                        0x8FE0

				#define GL_INT8_VEC2_NV                   0x8FE1

				#define GL_INT8_VEC3_NV                   0x8FE2

				#define GL_INT8_VEC4_NV                   0x8FE3

				#define GL_INT16_NV                       0x8FE4

				#define GL_INT16_VEC2_NV                  0x8FE5

				#define GL_INT16_VEC3_NV                  0x8FE6

				#define GL_INT16_VEC4_NV                  0x8FE7

				#define GL_INT64_VEC2_NV                  0x8FE9

				#define GL_INT64_VEC3_NV                  0x8FEA

				#define GL_INT64_VEC4_NV                  0x8FEB

				#define GL_UNSIGNED_INT8_NV               0x8FEC

				#define GL_UNSIGNED_INT8_VEC2_NV          0x8FED

				#define GL_UNSIGNED_INT8_VEC3_NV          0x8FEE

				#define GL_UNSIGNED_INT8_VEC4_NV          0x8FEF

				#define GL_UNSIGNED_INT16_NV              0x8FF0

				#define GL_UNSIGNED_INT16_VEC2_NV         0x8FF1

				#define GL_UNSIGNED_INT16_VEC3_NV         0x8FF2

				#define GL_UNSIGNED_INT16_VEC4_NV         0x8FF3

				#define GL_UNSIGNED_INT64_VEC2_NV         0x8FF5

				#define GL_UNSIGNED_INT64_VEC3_NV         0x8FF6

				#define GL_UNSIGNED_INT64_VEC4_NV         0x8FF7

				#define GL_FLOAT16_NV                     0x8FF8

				#define GL_FLOAT16_VEC2_NV                0x8FF9

				#define GL_FLOAT16_VEC3_NV                0x8FFA

				#define GL_FLOAT16_VEC4_NV                0x8FFB

				#define GL_PATCHES                        0x000E

				typedef void (GL_APIENTRYP PFNGLUNIFORM1I64NVPROC) (GLint location, GLint64EXT x);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4I64NVPROC) (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4I64VNVPROC) (GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UI64NVPROC) (GLint location, GLuint64EXT x);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UI64NVPROC) (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UI64VNVPROC) (GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMI64VNVPROC) (GLuint program, GLint location, GLint64EXT *params);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1I64NVPROC) (GLuint program, GLint location, GLint64EXT x);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4I64NVPROC) (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4I64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UI64NVPROC) (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UI64VNVPROC) (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glUniform1i64NV (GLint location, GLint64EXT x);

				GL_APICALL void GL_APIENTRY glUniform2i64NV (GLint location, GLint64EXT x, GLint64EXT y);

				GL_APICALL void GL_APIENTRY glUniform3i64NV (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);

				GL_APICALL void GL_APIENTRY glUniform4i64NV (GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);

				GL_APICALL void GL_APIENTRY glUniform1i64vNV (GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform2i64vNV (GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform3i64vNV (GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform4i64vNV (GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform1ui64NV (GLint location, GLuint64EXT x);

				GL_APICALL void GL_APIENTRY glUniform2ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y);

				GL_APICALL void GL_APIENTRY glUniform3ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);

				GL_APICALL void GL_APIENTRY glUniform4ui64NV (GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);

				GL_APICALL void GL_APIENTRY glUniform1ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform2ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform3ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glUniform4ui64vNV (GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glGetUniformi64vNV (GLuint program, GLint location, GLint64EXT *params);

				GL_APICALL void GL_APIENTRY glProgramUniform1i64NV (GLuint program, GLint location, GLint64EXT x);

				GL_APICALL void GL_APIENTRY glProgramUniform2i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y);

				GL_APICALL void GL_APIENTRY glProgramUniform3i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z);

				GL_APICALL void GL_APIENTRY glProgramUniform4i64NV (GLuint program, GLint location, GLint64EXT x, GLint64EXT y, GLint64EXT z, GLint64EXT w);

				GL_APICALL void GL_APIENTRY glProgramUniform1i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform2i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform3i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform4i64vNV (GLuint program, GLint location, GLsizei count, const GLint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform1ui64NV (GLuint program, GLint location, GLuint64EXT x);

				GL_APICALL void GL_APIENTRY glProgramUniform2ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y);

				GL_APICALL void GL_APIENTRY glProgramUniform3ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z);

				GL_APICALL void GL_APIENTRY glProgramUniform4ui64NV (GLuint program, GLint location, GLuint64EXT x, GLuint64EXT y, GLuint64EXT z, GLuint64EXT w);

				GL_APICALL void GL_APIENTRY glProgramUniform1ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform2ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform3ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				GL_APICALL void GL_APIENTRY glProgramUniform4ui64vNV (GLuint program, GLint location, GLsizei count, const GLuint64EXT *value);

				#endif

				#endif /* GL_NV_gpu_shader5 */

				#ifndef GL_NV_image_formats

				#define GL_NV_image_formats 1

				#endif /* GL_NV_image_formats */

				@@ -2713,6 +2945,10 @@ GL_APICALL void GL_APIENTRY glResolveDepthValuesNV (void);

				#define GL_NV_sample_mask_override_coverage 1

				#endif /* GL_NV_sample_mask_override_coverage */

				#ifndef GL_NV_shader_atomic_fp16_vector

				#define GL_NV_shader_atomic_fp16_vector 1

				#endif /* GL_NV_shader_atomic_fp16_vector */

				#ifndef GL_NV_shader_noperspective_interpolation

				#define GL_NV_shader_noperspective_interpolation 1

				#endif /* GL_NV_shader_noperspective_interpolation */

				@@ -2779,6 +3015,26 @@ GL_APICALL GLboolean GL_APIENTRY glIsEnablediNV (GLenum target, GLuint index);

				#define GL_NV_viewport_array2 1

				#endif /* GL_NV_viewport_array2 */

				#ifndef GL_NV_viewport_swizzle

				#define GL_NV_viewport_swizzle 1

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_X_NV 0x9350

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_X_NV 0x9351

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_Y_NV 0x9352

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Y_NV 0x9353

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_Z_NV 0x9354

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_Z_NV 0x9355

				#define GL_VIEWPORT_SWIZZLE_POSITIVE_W_NV 0x9356

				#define GL_VIEWPORT_SWIZZLE_NEGATIVE_W_NV 0x9357

				#define GL_VIEWPORT_SWIZZLE_X_NV          0x9358

				#define GL_VIEWPORT_SWIZZLE_Y_NV          0x9359

				#define GL_VIEWPORT_SWIZZLE_Z_NV          0x935A

				#define GL_VIEWPORT_SWIZZLE_W_NV          0x935B

				typedef void (GL_APIENTRYP PFNGLVIEWPORTSWIZZLENVPROC) (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);

				#ifdef GL_GLEXT_PROTOTYPES

				GL_APICALL void GL_APIENTRY glViewportSwizzleNV (GLuint index, GLenum swizzlex, GLenum swizzley, GLenum swizzlez, GLenum swizzlew);

				#endif

				#endif /* GL_NV_viewport_swizzle */

				#ifndef GL_OVR_multiview

				#define GL_OVR_multiview 1

				#define GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_NUM_VIEWS_OVR 0x9630

									
										2

include/GLES2/gl2platform.h
									
				@@ -1,7 +1,7 @@

				#ifndef __gl2platform_h_

				#define __gl2platform_h_

				/* $Revision: 10602 $ on $Date:: 2010-03-04 22:35:34 -0800 #$ */

				/* $Revision: 23328 $ on $Date:: 2013-10-02 02:28:28 -0700 #$ */

				/*

				 * This document is licensed under the SGI Free Software B License Version

									
										276

include/GLES3/gl3.h
									
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -33,17 +33,21 @@ extern "C" {

				** used to make the header, and the header can be found at

				**   http://www.opengl.org/registry/

				**

				** Khronos $Revision: 24614 $ on $Date: 2013-12-30 04:44:46 -0800 (Mon, 30 Dec 2013) $

				** Khronos $Revision: 32749 $ on $Date: 2016-04-28 09:03:03 -0700 (Thu, 28 Apr 2016) $

				*/

				#include <GLES3/gl3platform.h>

				/* Generated on date 20131230 */

				#ifndef GL_APIENTRYP

				#define GL_APIENTRYP GL_APIENTRY*

				#endif

				/* Generated on date 20160428 */

				/* Generated C header for:

				 * API: gles2

				 * Profile: common

				 * Versions considered: [23]\.[0-9]

				 * Versions considered: 2\.[0-9]|3\.0

				 * Versions emitted: .*

				 * Default extensions included: None

				 * Additional extensions included: _nomatch_^

				@@ -374,6 +378,148 @@ typedef khronos_uint8_t GLubyte;

				#define GL_RENDERBUFFER_BINDING           0x8CA7

				#define GL_MAX_RENDERBUFFER_SIZE          0x84E8

				#define GL_INVALID_FRAMEBUFFER_OPERATION  0x0506

				typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);

				typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);

				typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);

				typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);

				typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);

				typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);

				typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);

				typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);

				typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);

				typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);

				typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);

				typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);

				typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);

				typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);

				typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);

				typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);

				typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);

				typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);

				typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);

				typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);

				typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);

				typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);

				typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);

				typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);

				typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);

				typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);

				typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);

				typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);

				typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);

				typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);

				typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);

				typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);

				typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);

				typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);

				typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);

				typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);

				typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);

				typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);

				typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);

				GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);

				GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);

				@@ -705,6 +851,22 @@ typedef unsigned short GLhalf;

				#define GL_COLOR_ATTACHMENT13             0x8CED

				#define GL_COLOR_ATTACHMENT14             0x8CEE

				#define GL_COLOR_ATTACHMENT15             0x8CEF

				#define GL_COLOR_ATTACHMENT16             0x8CF0

				#define GL_COLOR_ATTACHMENT17             0x8CF1

				#define GL_COLOR_ATTACHMENT18             0x8CF2

				#define GL_COLOR_ATTACHMENT19             0x8CF3

				#define GL_COLOR_ATTACHMENT20             0x8CF4

				#define GL_COLOR_ATTACHMENT21             0x8CF5

				#define GL_COLOR_ATTACHMENT22             0x8CF6

				#define GL_COLOR_ATTACHMENT23             0x8CF7

				#define GL_COLOR_ATTACHMENT24             0x8CF8

				#define GL_COLOR_ATTACHMENT25             0x8CF9

				#define GL_COLOR_ATTACHMENT26             0x8CFA

				#define GL_COLOR_ATTACHMENT27             0x8CFB

				#define GL_COLOR_ATTACHMENT28             0x8CFC

				#define GL_COLOR_ATTACHMENT29             0x8CFD

				#define GL_COLOR_ATTACHMENT30             0x8CFE

				#define GL_COLOR_ATTACHMENT31             0x8CFF

				#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE 0x8D56

				#define GL_MAX_SAMPLES                    0x8D57

				#define GL_HALF_FLOAT                     0x140B

				@@ -826,7 +988,111 @@ typedef unsigned short GLhalf;

				#define GL_MAX_ELEMENT_INDEX              0x8D6B

				#define GL_NUM_SAMPLE_COUNTS              0x9380

				#define GL_TEXTURE_IMMUTABLE_LEVELS       0x82DF

				GL_APICALL void GL_APIENTRY glReadBuffer (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLREADBUFFERPROC) (GLenum src);

				typedef void (GL_APIENTRYP PFNGLDRAWRANGEELEMENTSPROC) (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);

				typedef void (GL_APIENTRYP PFNGLTEXIMAGE3DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE3DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLGENQUERIESPROC) (GLsizei n, GLuint *ids);

				typedef void (GL_APIENTRYP PFNGLDELETEQUERIESPROC) (GLsizei n, const GLuint *ids);

				typedef GLboolean (GL_APIENTRYP PFNGLISQUERYPROC) (GLuint id);

				typedef void (GL_APIENTRYP PFNGLBEGINQUERYPROC) (GLenum target, GLuint id);

				typedef void (GL_APIENTRYP PFNGLENDQUERYPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGETQUERYIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETQUERYOBJECTUIVPROC) (GLuint id, GLenum pname, GLuint *params);

				typedef GLboolean (GL_APIENTRYP PFNGLUNMAPBUFFERPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPOINTERVPROC) (GLenum target, GLenum pname, void **params);

				typedef void (GL_APIENTRYP PFNGLDRAWBUFFERSPROC) (GLsizei n, const GLenum *bufs);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLBLITFRAMEBUFFERPROC) (GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer);

				typedef void *(GL_APIENTRYP PFNGLMAPBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);

				typedef void (GL_APIENTRYP PFNGLFLUSHMAPPEDBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length);

				typedef void (GL_APIENTRYP PFNGLBINDVERTEXARRAYPROC) (GLuint array);

				typedef void (GL_APIENTRYP PFNGLDELETEVERTEXARRAYSPROC) (GLsizei n, const GLuint *arrays);

				typedef void (GL_APIENTRYP PFNGLGENVERTEXARRAYSPROC) (GLsizei n, GLuint *arrays);

				typedef GLboolean (GL_APIENTRYP PFNGLISVERTEXARRAYPROC) (GLuint array);

				typedef void (GL_APIENTRYP PFNGLGETINTEGERI_VPROC) (GLenum target, GLuint index, GLint *data);

				typedef void (GL_APIENTRYP PFNGLBEGINTRANSFORMFEEDBACKPROC) (GLenum primitiveMode);

				typedef void (GL_APIENTRYP PFNGLENDTRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERRANGEPROC) (GLenum target, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERBASEPROC) (GLenum target, GLuint index, GLuint buffer);

				typedef void (GL_APIENTRYP PFNGLTRANSFORMFEEDBACKVARYINGSPROC) (GLuint program, GLsizei count, const GLchar *const*varyings, GLenum bufferMode);

				typedef void (GL_APIENTRYP PFNGLGETTRANSFORMFEEDBACKVARYINGPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLsizei *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIPOINTERPROC) (GLuint index, GLint size, GLenum type, GLsizei stride, const void *pointer);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIIVPROC) (GLuint index, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIUIVPROC) (GLuint index, GLenum pname, GLuint *params);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IPROC) (GLuint index, GLint x, GLint y, GLint z, GLint w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIPROC) (GLuint index, GLuint x, GLuint y, GLuint z, GLuint w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IVPROC) (GLuint index, const GLint *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIVPROC) (GLuint index, const GLuint *v);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMUIVPROC) (GLuint program, GLint location, GLuint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETFRAGDATALOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UIPROC) (GLint location, GLuint v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UIPROC) (GLint location, GLuint v0, GLuint v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERIVPROC) (GLenum buffer, GLint drawbuffer, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERUIVPROC) (GLenum buffer, GLint drawbuffer, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFVPROC) (GLenum buffer, GLint drawbuffer, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFIPROC) (GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGIPROC) (GLenum name, GLuint index);

				typedef void (GL_APIENTRYP PFNGLCOPYBUFFERSUBDATAPROC) (GLenum readTarget, GLenum writeTarget, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMINDICESPROC) (GLuint program, GLsizei uniformCount, const GLchar *const*uniformNames, GLuint *uniformIndices);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMSIVPROC) (GLuint program, GLsizei uniformCount, const GLuint *uniformIndices, GLenum pname, GLint *params);

				typedef GLuint (GL_APIENTRYP PFNGLGETUNIFORMBLOCKINDEXPROC) (GLuint program, const GLchar *uniformBlockName);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKIVPROC) (GLuint program, GLuint uniformBlockIndex, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKNAMEPROC) (GLuint program, GLuint uniformBlockIndex, GLsizei bufSize, GLsizei *length, GLchar *uniformBlockName);

				typedef void (GL_APIENTRYP PFNGLUNIFORMBLOCKBINDINGPROC) (GLuint program, GLuint uniformBlockIndex, GLuint uniformBlockBinding);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINSTANCEDPROC) (GLenum mode, GLint first, GLsizei count, GLsizei instancecount);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINSTANCEDPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices, GLsizei instancecount);

				typedef GLsync (GL_APIENTRYP PFNGLFENCESYNCPROC) (GLenum condition, GLbitfield flags);

				typedef GLboolean (GL_APIENTRYP PFNGLISSYNCPROC) (GLsync sync);

				typedef void (GL_APIENTRYP PFNGLDELETESYNCPROC) (GLsync sync);

				typedef GLenum (GL_APIENTRYP PFNGLCLIENTWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);

				typedef void (GL_APIENTRYP PFNGLWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);

				typedef void (GL_APIENTRYP PFNGLGETINTEGER64VPROC) (GLenum pname, GLint64 *data);

				typedef void (GL_APIENTRYP PFNGLGETSYNCIVPROC) (GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length, GLint *values);

				typedef void (GL_APIENTRYP PFNGLGETINTEGER64I_VPROC) (GLenum target, GLuint index, GLint64 *data);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERI64VPROC) (GLenum target, GLenum pname, GLint64 *params);

				typedef void (GL_APIENTRYP PFNGLGENSAMPLERSPROC) (GLsizei count, GLuint *samplers);

				typedef void (GL_APIENTRYP PFNGLDELETESAMPLERSPROC) (GLsizei count, const GLuint *samplers);

				typedef GLboolean (GL_APIENTRYP PFNGLISSAMPLERPROC) (GLuint sampler);

				typedef void (GL_APIENTRYP PFNGLBINDSAMPLERPROC) (GLuint unit, GLuint sampler);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIPROC) (GLuint sampler, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, const GLint *param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFPROC) (GLuint sampler, GLenum pname, GLfloat param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, const GLfloat *param);

				typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBDIVISORPROC) (GLuint index, GLuint divisor);

				typedef void (GL_APIENTRYP PFNGLBINDTRANSFORMFEEDBACKPROC) (GLenum target, GLuint id);

				typedef void (GL_APIENTRYP PFNGLDELETETRANSFORMFEEDBACKSPROC) (GLsizei n, const GLuint *ids);

				typedef void (GL_APIENTRYP PFNGLGENTRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);

				typedef GLboolean (GL_APIENTRYP PFNGLISTRANSFORMFEEDBACKPROC) (GLuint id);

				typedef void (GL_APIENTRYP PFNGLPAUSETRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLRESUMETRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMBINARYPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLenum *binaryFormat, void *binary);

				typedef void (GL_APIENTRYP PFNGLPROGRAMBINARYPROC) (GLuint program, GLenum binaryFormat, const void *binary, GLsizei length);

				typedef void (GL_APIENTRYP PFNGLPROGRAMPARAMETERIPROC) (GLuint program, GLenum pname, GLint value);

				typedef void (GL_APIENTRYP PFNGLINVALIDATEFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments);

				typedef void (GL_APIENTRYP PFNGLINVALIDATESUBFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLTEXSTORAGE3DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);

				typedef void (GL_APIENTRYP PFNGLGETINTERNALFORMATIVPROC) (GLenum target, GLenum internalformat, GLenum pname, GLsizei bufSize, GLint *params);

				GL_APICALL void GL_APIENTRY glReadBuffer (GLenum src);

				GL_APICALL void GL_APIENTRY glDrawRangeElements (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);

				GL_APICALL void GL_APIENTRY glTexImage3D (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);

				GL_APICALL void GL_APIENTRY glTexSubImage3D (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);

									
include/GLES3/gl31.h
									
				@@ -6,7 +6,7 @@ extern "C" {

				#endif

				/*

				** Copyright (c) 2013-2014 The Khronos Group Inc.

				** Copyright (c) 2013-2016 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				@@ -38,12 +38,16 @@ extern "C" {

				#include <GLES3/gl3platform.h>

				/* Generated on date 20140317 */

				#ifndef GL_APIENTRYP

				#define GL_APIENTRYP GL_APIENTRY*

				#endif

				/* Generated on date 20160428 */

				/* Generated C header for:

				 * API: gles2

				 * Profile: common

				 * Versions considered: 2.[0-9]|3.[01]

				 * Versions considered: 2\.[0-9]|3\.[01]

				 * Versions emitted: .*

				 * Default extensions included: None

				 * Additional extensions included: _nomatch_^

				@@ -374,6 +378,148 @@ typedef khronos_uint8_t GLubyte;

				#define GL_RENDERBUFFER_BINDING           0x8CA7

				#define GL_MAX_RENDERBUFFER_SIZE          0x84E8

				#define GL_INVALID_FRAMEBUFFER_OPERATION  0x0506

				typedef void (GL_APIENTRYP PFNGLACTIVETEXTUREPROC) (GLenum texture);

				typedef void (GL_APIENTRYP PFNGLATTACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLBINDATTRIBLOCATIONPROC) (GLuint program, GLuint index, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERPROC) (GLenum target, GLuint buffer);

				typedef void (GL_APIENTRYP PFNGLBINDFRAMEBUFFERPROC) (GLenum target, GLuint framebuffer);

				typedef void (GL_APIENTRYP PFNGLBINDRENDERBUFFERPROC) (GLenum target, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLBINDTEXTUREPROC) (GLenum target, GLuint texture);

				typedef void (GL_APIENTRYP PFNGLBLENDCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLBLENDEQUATIONSEPARATEPROC) (GLenum modeRGB, GLenum modeAlpha);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCPROC) (GLenum sfactor, GLenum dfactor);

				typedef void (GL_APIENTRYP PFNGLBLENDFUNCSEPARATEPROC) (GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorAlpha, GLenum dfactorAlpha);

				typedef void (GL_APIENTRYP PFNGLBUFFERDATAPROC) (GLenum target, GLsizeiptr size, const void *data, GLenum usage);

				typedef void (GL_APIENTRYP PFNGLBUFFERSUBDATAPROC) (GLenum target, GLintptr offset, GLsizeiptr size, const void *data);

				typedef GLenum (GL_APIENTRYP PFNGLCHECKFRAMEBUFFERSTATUSPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLCLEARPROC) (GLbitfield mask);

				typedef void (GL_APIENTRYP PFNGLCLEARCOLORPROC) (GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha);

				typedef void (GL_APIENTRYP PFNGLCLEARDEPTHFPROC) (GLfloat d);

				typedef void (GL_APIENTRYP PFNGLCLEARSTENCILPROC) (GLint s);

				typedef void (GL_APIENTRYP PFNGLCOLORMASKPROC) (GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);

				typedef void (GL_APIENTRYP PFNGLCOMPILESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLint border, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXIMAGE2DPROC) (GLenum target, GLint level, GLenum internalformat, GLint x, GLint y, GLsizei width, GLsizei height, GLint border);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef GLuint (GL_APIENTRYP PFNGLCREATEPROGRAMPROC) (void);

				typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROC) (GLenum type);

				typedef void (GL_APIENTRYP PFNGLCULLFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLDELETEBUFFERSPROC) (GLsizei n, const GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLDELETEFRAMEBUFFERSPROC) (GLsizei n, const GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLDELETERENDERBUFFERSPROC) (GLsizei n, const GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLDELETESHADERPROC) (GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDELETETEXTURESPROC) (GLsizei n, const GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLDEPTHFUNCPROC) (GLenum func);

				typedef void (GL_APIENTRYP PFNGLDEPTHMASKPROC) (GLboolean flag);

				typedef void (GL_APIENTRYP PFNGLDEPTHRANGEFPROC) (GLfloat n, GLfloat f);

				typedef void (GL_APIENTRYP PFNGLDETACHSHADERPROC) (GLuint program, GLuint shader);

				typedef void (GL_APIENTRYP PFNGLDISABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLDISABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSPROC) (GLenum mode, GLint first, GLsizei count);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices);

				typedef void (GL_APIENTRYP PFNGLENABLEPROC) (GLenum cap);

				typedef void (GL_APIENTRYP PFNGLENABLEVERTEXATTRIBARRAYPROC) (GLuint index);

				typedef void (GL_APIENTRYP PFNGLFINISHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFLUSHPROC) (void);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERRENDERBUFFERPROC) (GLenum target, GLenum attachment, GLenum renderbuffertarget, GLuint renderbuffer);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURE2DPROC) (GLenum target, GLenum attachment, GLenum textarget, GLuint texture, GLint level);

				typedef void (GL_APIENTRYP PFNGLFRONTFACEPROC) (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLGENBUFFERSPROC) (GLsizei n, GLuint *buffers);

				typedef void (GL_APIENTRYP PFNGLGENERATEMIPMAPPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGENFRAMEBUFFERSPROC) (GLsizei n, GLuint *framebuffers);

				typedef void (GL_APIENTRYP PFNGLGENRENDERBUFFERSPROC) (GLsizei n, GLuint *renderbuffers);

				typedef void (GL_APIENTRYP PFNGLGENTEXTURESPROC) (GLsizei n, GLuint *textures);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEATTRIBPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLint *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETATTACHEDSHADERSPROC) (GLuint program, GLsizei maxCount, GLsizei *count, GLuint *shaders);

				typedef GLint (GL_APIENTRYP PFNGLGETATTRIBLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETBOOLEANVPROC) (GLenum pname, GLboolean *data);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef GLenum (GL_APIENTRYP PFNGLGETERRORPROC) (void);

				typedef void (GL_APIENTRYP PFNGLGETFLOATVPROC) (GLenum pname, GLfloat *data);

				typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERATTACHMENTPARAMETERIVPROC) (GLenum target, GLenum attachment, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETINTEGERVPROC) (GLenum pname, GLint *data);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMIVPROC) (GLuint program, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMINFOLOGPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETRENDERBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERIVPROC) (GLuint shader, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSHADERINFOLOGPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLGETSHADERPRECISIONFORMATPROC) (GLenum shadertype, GLenum precisiontype, GLint *range, GLint *precision);

				typedef void (GL_APIENTRYP PFNGLGETSHADERSOURCEPROC) (GLuint shader, GLsizei bufSize, GLsizei *length, GLchar *source);

				typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGPROC) (GLenum name);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERFVPROC) (GLenum target, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETTEXPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMFVPROC) (GLuint program, GLint location, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMIVPROC) (GLuint program, GLint location, GLint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETUNIFORMLOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBFVPROC) (GLuint index, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIVPROC) (GLuint index, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBPOINTERVPROC) (GLuint index, GLenum pname, void **pointer);

				typedef void (GL_APIENTRYP PFNGLHINTPROC) (GLenum target, GLenum mode);

				typedef GLboolean (GL_APIENTRYP PFNGLISBUFFERPROC) (GLuint buffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISENABLEDPROC) (GLenum cap);

				typedef GLboolean (GL_APIENTRYP PFNGLISFRAMEBUFFERPROC) (GLuint framebuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPROC) (GLuint program);

				typedef GLboolean (GL_APIENTRYP PFNGLISRENDERBUFFERPROC) (GLuint renderbuffer);

				typedef GLboolean (GL_APIENTRYP PFNGLISSHADERPROC) (GLuint shader);

				typedef GLboolean (GL_APIENTRYP PFNGLISTEXTUREPROC) (GLuint texture);

				typedef void (GL_APIENTRYP PFNGLLINEWIDTHPROC) (GLfloat width);

				typedef void (GL_APIENTRYP PFNGLLINKPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLPIXELSTOREIPROC) (GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLPOLYGONOFFSETPROC) (GLfloat factor, GLfloat units);

				typedef void (GL_APIENTRYP PFNGLREADPIXELSPROC) (GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, void *pixels);

				typedef void (GL_APIENTRYP PFNGLRELEASESHADERCOMPILERPROC) (void);

				typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEPROC) (GLenum target, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSAMPLECOVERAGEPROC) (GLfloat value, GLboolean invert);

				typedef void (GL_APIENTRYP PFNGLSCISSORPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLSHADERBINARYPROC) (GLsizei count, const GLuint *shaders, GLenum binaryformat, const void *binary, GLsizei length);

				typedef void (GL_APIENTRYP PFNGLSHADERSOURCEPROC) (GLuint shader, GLsizei count, const GLchar *const*string, const GLint *length);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCPROC) (GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILFUNCSEPARATEPROC) (GLenum face, GLenum func, GLint ref, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKPROC) (GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILMASKSEPARATEPROC) (GLenum face, GLuint mask);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPPROC) (GLenum fail, GLenum zfail, GLenum zpass);

				typedef void (GL_APIENTRYP PFNGLSTENCILOPSEPARATEPROC) (GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);

				typedef void (GL_APIENTRYP PFNGLTEXIMAGE2DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLint border, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFPROC) (GLenum target, GLenum pname, GLfloat param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERFVPROC) (GLenum target, GLenum pname, const GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLTEXPARAMETERIVPROC) (GLenum target, GLenum pname, const GLint *params);

				typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE2DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLsizei width, GLsizei height, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FPROC) (GLint location, GLfloat v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IPROC) (GLint location, GLint v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FPROC) (GLint location, GLfloat v0, GLfloat v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IPROC) (GLint location, GLint v0, GLint v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IPROC) (GLint location, GLint v0, GLint v1, GLint v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FPROC) (GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4FVPROC) (GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IPROC) (GLint location, GLint v0, GLint v1, GLint v2, GLint v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4IVPROC) (GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUSEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPROC) (GLuint program);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FPROC) (GLuint index, GLfloat x);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB1FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FPROC) (GLuint index, GLfloat x, GLfloat y);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB2FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB3FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FPROC) (GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIB4FVPROC) (GLuint index, const GLfloat *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBPOINTERPROC) (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const void *pointer);

				typedef void (GL_APIENTRYP PFNGLVIEWPORTPROC) (GLint x, GLint y, GLsizei width, GLsizei height);

				GL_APICALL void GL_APIENTRY glActiveTexture (GLenum texture);

				GL_APICALL void GL_APIENTRY glAttachShader (GLuint program, GLuint shader);

				GL_APICALL void GL_APIENTRY glBindAttribLocation (GLuint program, GLuint index, const GLchar *name);

				@@ -705,6 +851,22 @@ typedef unsigned short GLhalf;

				#define GL_COLOR_ATTACHMENT13             0x8CED

				#define GL_COLOR_ATTACHMENT14             0x8CEE

				#define GL_COLOR_ATTACHMENT15             0x8CEF

				#define GL_COLOR_ATTACHMENT16             0x8CF0

				#define GL_COLOR_ATTACHMENT17             0x8CF1

				#define GL_COLOR_ATTACHMENT18             0x8CF2

				#define GL_COLOR_ATTACHMENT19             0x8CF3

				#define GL_COLOR_ATTACHMENT20             0x8CF4

				#define GL_COLOR_ATTACHMENT21             0x8CF5

				#define GL_COLOR_ATTACHMENT22             0x8CF6

				#define GL_COLOR_ATTACHMENT23             0x8CF7

				#define GL_COLOR_ATTACHMENT24             0x8CF8

				#define GL_COLOR_ATTACHMENT25             0x8CF9

				#define GL_COLOR_ATTACHMENT26             0x8CFA

				#define GL_COLOR_ATTACHMENT27             0x8CFB

				#define GL_COLOR_ATTACHMENT28             0x8CFC

				#define GL_COLOR_ATTACHMENT29             0x8CFD

				#define GL_COLOR_ATTACHMENT30             0x8CFE

				#define GL_COLOR_ATTACHMENT31             0x8CFF

				#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE 0x8D56

				#define GL_MAX_SAMPLES                    0x8D57

				#define GL_HALF_FLOAT                     0x140B

				@@ -826,7 +988,111 @@ typedef unsigned short GLhalf;

				#define GL_MAX_ELEMENT_INDEX              0x8D6B

				#define GL_NUM_SAMPLE_COUNTS              0x9380

				#define GL_TEXTURE_IMMUTABLE_LEVELS       0x82DF

				GL_APICALL void GL_APIENTRY glReadBuffer (GLenum mode);

				typedef void (GL_APIENTRYP PFNGLREADBUFFERPROC) (GLenum src);

				typedef void (GL_APIENTRYP PFNGLDRAWRANGEELEMENTSPROC) (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);

				typedef void (GL_APIENTRYP PFNGLTEXIMAGE3DPROC) (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);

				typedef void (GL_APIENTRYP PFNGLCOPYTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXIMAGE3DPROC) (GLenum target, GLint level, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLCOMPRESSEDTEXSUBIMAGE3DPROC) (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLsizei imageSize, const void *data);

				typedef void (GL_APIENTRYP PFNGLGENQUERIESPROC) (GLsizei n, GLuint *ids);

				typedef void (GL_APIENTRYP PFNGLDELETEQUERIESPROC) (GLsizei n, const GLuint *ids);

				typedef GLboolean (GL_APIENTRYP PFNGLISQUERYPROC) (GLuint id);

				typedef void (GL_APIENTRYP PFNGLBEGINQUERYPROC) (GLenum target, GLuint id);

				typedef void (GL_APIENTRYP PFNGLENDQUERYPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGETQUERYIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETQUERYOBJECTUIVPROC) (GLuint id, GLenum pname, GLuint *params);

				typedef GLboolean (GL_APIENTRYP PFNGLUNMAPBUFFERPROC) (GLenum target);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPOINTERVPROC) (GLenum target, GLenum pname, void **params);

				typedef void (GL_APIENTRYP PFNGLDRAWBUFFERSPROC) (GLsizei n, const GLenum *bufs);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX2X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X2FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX3X4FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORMMATRIX4X3FVPROC) (GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLBLITFRAMEBUFFERPROC) (GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter);

				typedef void (GL_APIENTRYP PFNGLRENDERBUFFERSTORAGEMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERTEXTURELAYERPROC) (GLenum target, GLenum attachment, GLuint texture, GLint level, GLint layer);

				typedef void *(GL_APIENTRYP PFNGLMAPBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length, GLbitfield access);

				typedef void (GL_APIENTRYP PFNGLFLUSHMAPPEDBUFFERRANGEPROC) (GLenum target, GLintptr offset, GLsizeiptr length);

				typedef void (GL_APIENTRYP PFNGLBINDVERTEXARRAYPROC) (GLuint array);

				typedef void (GL_APIENTRYP PFNGLDELETEVERTEXARRAYSPROC) (GLsizei n, const GLuint *arrays);

				typedef void (GL_APIENTRYP PFNGLGENVERTEXARRAYSPROC) (GLsizei n, GLuint *arrays);

				typedef GLboolean (GL_APIENTRYP PFNGLISVERTEXARRAYPROC) (GLuint array);

				typedef void (GL_APIENTRYP PFNGLGETINTEGERI_VPROC) (GLenum target, GLuint index, GLint *data);

				typedef void (GL_APIENTRYP PFNGLBEGINTRANSFORMFEEDBACKPROC) (GLenum primitiveMode);

				typedef void (GL_APIENTRYP PFNGLENDTRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERRANGEPROC) (GLenum target, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);

				typedef void (GL_APIENTRYP PFNGLBINDBUFFERBASEPROC) (GLenum target, GLuint index, GLuint buffer);

				typedef void (GL_APIENTRYP PFNGLTRANSFORMFEEDBACKVARYINGSPROC) (GLuint program, GLsizei count, const GLchar *const*varyings, GLenum bufferMode);

				typedef void (GL_APIENTRYP PFNGLGETTRANSFORMFEEDBACKVARYINGPROC) (GLuint program, GLuint index, GLsizei bufSize, GLsizei *length, GLsizei *size, GLenum *type, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIPOINTERPROC) (GLuint index, GLint size, GLenum type, GLsizei stride, const void *pointer);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIIVPROC) (GLuint index, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETVERTEXATTRIBIUIVPROC) (GLuint index, GLenum pname, GLuint *params);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IPROC) (GLuint index, GLint x, GLint y, GLint z, GLint w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIPROC) (GLuint index, GLuint x, GLuint y, GLuint z, GLuint w);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4IVPROC) (GLuint index, const GLint *v);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBI4UIVPROC) (GLuint index, const GLuint *v);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMUIVPROC) (GLuint program, GLint location, GLuint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETFRAGDATALOCATIONPROC) (GLuint program, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UIPROC) (GLint location, GLuint v0);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UIPROC) (GLint location, GLuint v0, GLuint v1);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UIPROC) (GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);

				typedef void (GL_APIENTRYP PFNGLUNIFORM1UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM2UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM3UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLUNIFORM4UIVPROC) (GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERIVPROC) (GLenum buffer, GLint drawbuffer, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERUIVPROC) (GLenum buffer, GLint drawbuffer, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFVPROC) (GLenum buffer, GLint drawbuffer, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLCLEARBUFFERFIPROC) (GLenum buffer, GLint drawbuffer, GLfloat depth, GLint stencil);

				typedef const GLubyte *(GL_APIENTRYP PFNGLGETSTRINGIPROC) (GLenum name, GLuint index);

				typedef void (GL_APIENTRYP PFNGLCOPYBUFFERSUBDATAPROC) (GLenum readTarget, GLenum writeTarget, GLintptr readOffset, GLintptr writeOffset, GLsizeiptr size);

				typedef void (GL_APIENTRYP PFNGLGETUNIFORMINDICESPROC) (GLuint program, GLsizei uniformCount, const GLchar *const*uniformNames, GLuint *uniformIndices);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMSIVPROC) (GLuint program, GLsizei uniformCount, const GLuint *uniformIndices, GLenum pname, GLint *params);

				typedef GLuint (GL_APIENTRYP PFNGLGETUNIFORMBLOCKINDEXPROC) (GLuint program, const GLchar *uniformBlockName);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKIVPROC) (GLuint program, GLuint uniformBlockIndex, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETACTIVEUNIFORMBLOCKNAMEPROC) (GLuint program, GLuint uniformBlockIndex, GLsizei bufSize, GLsizei *length, GLchar *uniformBlockName);

				typedef void (GL_APIENTRYP PFNGLUNIFORMBLOCKBINDINGPROC) (GLuint program, GLuint uniformBlockIndex, GLuint uniformBlockBinding);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINSTANCEDPROC) (GLenum mode, GLint first, GLsizei count, GLsizei instancecount);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINSTANCEDPROC) (GLenum mode, GLsizei count, GLenum type, const void *indices, GLsizei instancecount);

				typedef GLsync (GL_APIENTRYP PFNGLFENCESYNCPROC) (GLenum condition, GLbitfield flags);

				typedef GLboolean (GL_APIENTRYP PFNGLISSYNCPROC) (GLsync sync);

				typedef void (GL_APIENTRYP PFNGLDELETESYNCPROC) (GLsync sync);

				typedef GLenum (GL_APIENTRYP PFNGLCLIENTWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);

				typedef void (GL_APIENTRYP PFNGLWAITSYNCPROC) (GLsync sync, GLbitfield flags, GLuint64 timeout);

				typedef void (GL_APIENTRYP PFNGLGETINTEGER64VPROC) (GLenum pname, GLint64 *data);

				typedef void (GL_APIENTRYP PFNGLGETSYNCIVPROC) (GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length, GLint *values);

				typedef void (GL_APIENTRYP PFNGLGETINTEGER64I_VPROC) (GLenum target, GLuint index, GLint64 *data);

				typedef void (GL_APIENTRYP PFNGLGETBUFFERPARAMETERI64VPROC) (GLenum target, GLenum pname, GLint64 *params);

				typedef void (GL_APIENTRYP PFNGLGENSAMPLERSPROC) (GLsizei count, GLuint *samplers);

				typedef void (GL_APIENTRYP PFNGLDELETESAMPLERSPROC) (GLsizei count, const GLuint *samplers);

				typedef GLboolean (GL_APIENTRYP PFNGLISSAMPLERPROC) (GLuint sampler);

				typedef void (GL_APIENTRYP PFNGLBINDSAMPLERPROC) (GLuint unit, GLuint sampler);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIPROC) (GLuint sampler, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, const GLint *param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFPROC) (GLuint sampler, GLenum pname, GLfloat param);

				typedef void (GL_APIENTRYP PFNGLSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, const GLfloat *param);

				typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERIVPROC) (GLuint sampler, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETSAMPLERPARAMETERFVPROC) (GLuint sampler, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBDIVISORPROC) (GLuint index, GLuint divisor);

				typedef void (GL_APIENTRYP PFNGLBINDTRANSFORMFEEDBACKPROC) (GLenum target, GLuint id);

				typedef void (GL_APIENTRYP PFNGLDELETETRANSFORMFEEDBACKSPROC) (GLsizei n, const GLuint *ids);

				typedef void (GL_APIENTRYP PFNGLGENTRANSFORMFEEDBACKSPROC) (GLsizei n, GLuint *ids);

				typedef GLboolean (GL_APIENTRYP PFNGLISTRANSFORMFEEDBACKPROC) (GLuint id);

				typedef void (GL_APIENTRYP PFNGLPAUSETRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLRESUMETRANSFORMFEEDBACKPROC) (void);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMBINARYPROC) (GLuint program, GLsizei bufSize, GLsizei *length, GLenum *binaryFormat, void *binary);

				typedef void (GL_APIENTRYP PFNGLPROGRAMBINARYPROC) (GLuint program, GLenum binaryFormat, const void *binary, GLsizei length);

				typedef void (GL_APIENTRYP PFNGLPROGRAMPARAMETERIPROC) (GLuint program, GLenum pname, GLint value);

				typedef void (GL_APIENTRYP PFNGLINVALIDATEFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments);

				typedef void (GL_APIENTRYP PFNGLINVALIDATESUBFRAMEBUFFERPROC) (GLenum target, GLsizei numAttachments, const GLenum *attachments, GLint x, GLint y, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height);

				typedef void (GL_APIENTRYP PFNGLTEXSTORAGE3DPROC) (GLenum target, GLsizei levels, GLenum internalformat, GLsizei width, GLsizei height, GLsizei depth);

				typedef void (GL_APIENTRYP PFNGLGETINTERNALFORMATIVPROC) (GLenum target, GLenum internalformat, GLenum pname, GLsizei bufSize, GLint *params);

				GL_APICALL void GL_APIENTRY glReadBuffer (GLenum src);

				GL_APICALL void GL_APIENTRY glDrawRangeElements (GLenum mode, GLuint start, GLuint end, GLsizei count, GLenum type, const void *indices);

				GL_APICALL void GL_APIENTRY glTexImage3D (GLenum target, GLint level, GLint internalformat, GLsizei width, GLsizei height, GLsizei depth, GLint border, GLenum format, GLenum type, const void *pixels);

				GL_APICALL void GL_APIENTRY glTexSubImage3D (GLenum target, GLint level, GLint xoffset, GLint yoffset, GLint zoffset, GLsizei width, GLsizei height, GLsizei depth, GLenum format, GLenum type, const void *pixels);

				@@ -1107,6 +1373,74 @@ GL_APICALL void GL_APIENTRY glGetInternalformativ (GLenum target, GLenum interna

				#define GL_MAX_VERTEX_ATTRIB_RELATIVE_OFFSET 0x82D9

				#define GL_MAX_VERTEX_ATTRIB_BINDINGS     0x82DA

				#define GL_MAX_VERTEX_ATTRIB_STRIDE       0x82E5

				typedef void (GL_APIENTRYP PFNGLDISPATCHCOMPUTEPROC) (GLuint num_groups_x, GLuint num_groups_y, GLuint num_groups_z);

				typedef void (GL_APIENTRYP PFNGLDISPATCHCOMPUTEINDIRECTPROC) (GLintptr indirect);

				typedef void (GL_APIENTRYP PFNGLDRAWARRAYSINDIRECTPROC) (GLenum mode, const void *indirect);

				typedef void (GL_APIENTRYP PFNGLDRAWELEMENTSINDIRECTPROC) (GLenum mode, GLenum type, const void *indirect);

				typedef void (GL_APIENTRYP PFNGLFRAMEBUFFERPARAMETERIPROC) (GLenum target, GLenum pname, GLint param);

				typedef void (GL_APIENTRYP PFNGLGETFRAMEBUFFERPARAMETERIVPROC) (GLenum target, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMINTERFACEIVPROC) (GLuint program, GLenum programInterface, GLenum pname, GLint *params);

				typedef GLuint (GL_APIENTRYP PFNGLGETPROGRAMRESOURCEINDEXPROC) (GLuint program, GLenum programInterface, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMRESOURCENAMEPROC) (GLuint program, GLenum programInterface, GLuint index, GLsizei bufSize, GLsizei *length, GLchar *name);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMRESOURCEIVPROC) (GLuint program, GLenum programInterface, GLuint index, GLsizei propCount, const GLenum *props, GLsizei bufSize, GLsizei *length, GLint *params);

				typedef GLint (GL_APIENTRYP PFNGLGETPROGRAMRESOURCELOCATIONPROC) (GLuint program, GLenum programInterface, const GLchar *name);

				typedef void (GL_APIENTRYP PFNGLUSEPROGRAMSTAGESPROC) (GLuint pipeline, GLbitfield stages, GLuint program);

				typedef void (GL_APIENTRYP PFNGLACTIVESHADERPROGRAMPROC) (GLuint pipeline, GLuint program);

				typedef GLuint (GL_APIENTRYP PFNGLCREATESHADERPROGRAMVPROC) (GLenum type, GLsizei count, const GLchar *const*strings);

				typedef void (GL_APIENTRYP PFNGLBINDPROGRAMPIPELINEPROC) (GLuint pipeline);

				typedef void (GL_APIENTRYP PFNGLDELETEPROGRAMPIPELINESPROC) (GLsizei n, const GLuint *pipelines);

				typedef void (GL_APIENTRYP PFNGLGENPROGRAMPIPELINESPROC) (GLsizei n, GLuint *pipelines);

				typedef GLboolean (GL_APIENTRYP PFNGLISPROGRAMPIPELINEPROC) (GLuint pipeline);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMPIPELINEIVPROC) (GLuint pipeline, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1IPROC) (GLuint program, GLint location, GLint v0);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2IPROC) (GLuint program, GLint location, GLint v0, GLint v1);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3IPROC) (GLuint program, GLint location, GLint v0, GLint v1, GLint v2);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4IPROC) (GLuint program, GLint location, GLint v0, GLint v1, GLint v2, GLint v3);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UIPROC) (GLuint program, GLint location, GLuint v0);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1, GLuint v2);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UIPROC) (GLuint program, GLint location, GLuint v0, GLuint v1, GLuint v2, GLuint v3);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1FPROC) (GLuint program, GLint location, GLfloat v0);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1, GLfloat v2);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4FPROC) (GLuint program, GLint location, GLfloat v0, GLfloat v1, GLfloat v2, GLfloat v3);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4IVPROC) (GLuint program, GLint location, GLsizei count, const GLint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4UIVPROC) (GLuint program, GLint location, GLsizei count, const GLuint *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM1FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM2FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM3FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORM4FVPROC) (GLuint program, GLint location, GLsizei count, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2X3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3X2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX2X4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4X2FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX3X4FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLPROGRAMUNIFORMMATRIX4X3FVPROC) (GLuint program, GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);

				typedef void (GL_APIENTRYP PFNGLVALIDATEPROGRAMPIPELINEPROC) (GLuint pipeline);

				typedef void (GL_APIENTRYP PFNGLGETPROGRAMPIPELINEINFOLOGPROC) (GLuint pipeline, GLsizei bufSize, GLsizei *length, GLchar *infoLog);

				typedef void (GL_APIENTRYP PFNGLBINDIMAGETEXTUREPROC) (GLuint unit, GLuint texture, GLint level, GLboolean layered, GLint layer, GLenum access, GLenum format);

				typedef void (GL_APIENTRYP PFNGLGETBOOLEANI_VPROC) (GLenum target, GLuint index, GLboolean *data);

				typedef void (GL_APIENTRYP PFNGLMEMORYBARRIERPROC) (GLbitfield barriers);

				typedef void (GL_APIENTRYP PFNGLMEMORYBARRIERBYREGIONPROC) (GLbitfield barriers);

				typedef void (GL_APIENTRYP PFNGLTEXSTORAGE2DMULTISAMPLEPROC) (GLenum target, GLsizei samples, GLenum internalformat, GLsizei width, GLsizei height, GLboolean fixedsamplelocations);

				typedef void (GL_APIENTRYP PFNGLGETMULTISAMPLEFVPROC) (GLenum pname, GLuint index, GLfloat *val);

				typedef void (GL_APIENTRYP PFNGLSAMPLEMASKIPROC) (GLuint maskNumber, GLbitfield mask);

				typedef void (GL_APIENTRYP PFNGLGETTEXLEVELPARAMETERIVPROC) (GLenum target, GLint level, GLenum pname, GLint *params);

				typedef void (GL_APIENTRYP PFNGLGETTEXLEVELPARAMETERFVPROC) (GLenum target, GLint level, GLenum pname, GLfloat *params);

				typedef void (GL_APIENTRYP PFNGLBINDVERTEXBUFFERPROC) (GLuint bindingindex, GLuint buffer, GLintptr offset, GLsizei stride);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBFORMATPROC) (GLuint attribindex, GLint size, GLenum type, GLboolean normalized, GLuint relativeoffset);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBIFORMATPROC) (GLuint attribindex, GLint size, GLenum type, GLuint relativeoffset);

				typedef void (GL_APIENTRYP PFNGLVERTEXATTRIBBINDINGPROC) (GLuint attribindex, GLuint bindingindex);

				typedef void (GL_APIENTRYP PFNGLVERTEXBINDINGDIVISORPROC) (GLuint bindingindex, GLuint divisor);

				GL_APICALL void GL_APIENTRY glDispatchCompute (GLuint num_groups_x, GLuint num_groups_y, GLuint num_groups_z);

				GL_APICALL void GL_APIENTRY glDispatchComputeIndirect (GLintptr indirect);

				GL_APICALL void GL_APIENTRY glDrawArraysIndirect (GLenum mode, const void *indirect);

1817

include/GLES3/gl32.h Normal file

File diff suppressed because it is too large

									
										3

include/c11/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,3 @@

				[*.h]

				indent_style = space

				indent_size = 4

									
										2

include/c11/threads_posix.h
									
												
				@@ -184,7 +184,7 @@ mtx_destroy(mtx_t *mtx)

				 * Thus the linker will be happy and things don't clash when building

				 * with -O1 or greater.

				 */

				-#ifdef HAVE_FUNC_ATTRIBUTE_WEAK

				+#if defined(HAVE_FUNC_ATTRIBUTE_WEAK) && !defined(__CYGWIN__)

				__attribute__((weak))

				int pthread_mutexattr_init(pthread_mutexattr_t *attr);

									
										3

include/d3dadapter/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,3 @@

				[*.h]

				indent_style = space

				indent_size = 4

									
										8

include/pci_ids/i965_pci_ids.h
									
												
				@@ -137,6 +137,7 @@ CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")

				CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")

				+CHIPSET(0x5908, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590E, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")

				@@ -149,13 +150,10 @@ CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")

				CHIPSET(0x5923, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x592A, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x592B, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x5927, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherrytrail)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */

				CHIPSET(0x22B2, chv,     "Intel(R) HD Graphics (Cherryview)")

									
										3

include/vulkan/.editorconfig
									
										Normal file
									
												
				@@ -0,0 +1,3 @@

				[*.h]

				indent_style = space

				indent_size = 4

									
										4

install-gallium-links.mk
									
												
				@@ -13,8 +13,8 @@ all-local : .install-gallium-links

					fi;							\

					$(MKDIR_P) $$link_dir;					\

					file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)";		\

				-	file_list+="$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

				-	file_list+="$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

				+	file_list="$$file_list$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

				+	file_list="$$file_list$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

					for f in $$file_list; do 				\

						if test -h .libs/$$f; then			\

							cp -d $$f $$link_dir;			\

72

m4/ax_check_compile_flag.m4


@@ -1,72 +0,0 @@
 # ===========================================================================
 #   http://www.gnu.org/software/autoconf-archive/ax_check_compile_flag.html
 # ===========================================================================
 #
 # SYNOPSIS
 #
 #   AX_CHECK_COMPILE_FLAG(FLAG, [ACTION-SUCCESS], [ACTION-FAILURE], [EXTRA-FLAGS])
 #
 # DESCRIPTION
 #
 #   Check whether the given FLAG works with the current language's compiler
 #   or gives an error.  (Warnings, however, are ignored)
 #
 #   ACTION-SUCCESS/ACTION-FAILURE are shell commands to execute on
 #   success/failure.
 #
 #   If EXTRA-FLAGS is defined, it is added to the current language's default
 #   flags (e.g. CFLAGS) when the check is done.  The check is thus made with
 #   the flags: "CFLAGS EXTRA-FLAGS FLAG".  This can for example be used to
 #   force the compiler to issue an error when a bad flag is given.
 #
 #   NOTE: Implementation based on AX_CFLAGS_GCC_OPTION. Please keep this
 #   macro in sync with AX_CHECK_{PREPROC,LINK}_FLAG.
 #
 # LICENSE
 #
 #   Copyright (c) 2008 Guido U. Draheim <guidod@gmx.de>
 #   Copyright (c) 2011 Maarten Bosmans <mkbosmans@gmail.com>
 #
 #   This program is free software: you can redistribute it and/or modify it
 #   under the terms of the GNU General Public License as published by the
 #   Free Software Foundation, either version 3 of the License, or (at your
 #   option) any later version.
 #
 #   This program is distributed in the hope that it will be useful, but
 #   WITHOUT ANY WARRANTY; without even the implied warranty of
 #   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
 #   Public License for more details.
 #
 #   You should have received a copy of the GNU General Public License along
 #   with this program. If not, see <http://www.gnu.org/licenses/>.
 #
 #   As a special exception, the respective Autoconf Macro's copyright owner
 #   gives unlimited permission to copy, distribute and modify the configure
 #   scripts that are the output of Autoconf when processing the Macro. You
 #   need not follow the terms of the GNU General Public License when using
 #   or distributing such scripts, even though portions of the text of the
 #   Macro appear in them. The GNU General Public License (GPL) does govern
 #   all other use of the material that constitutes the Autoconf Macro.
 #
 #   This special exception to the GPL applies to versions of the Autoconf
 #   Macro released by the Autoconf Archive. When you make and distribute a
 #   modified version of the Autoconf Macro, you may extend this special
 #   exception to the GPL to apply to your modified version as well.
 #serial 2
 AC_DEFUN([AX_CHECK_COMPILE_FLAG],
 [AC_PREREQ(2.59)dnl for _AC_LANG_PREFIX
 AS_VAR_PUSHDEF([CACHEVAR],[ax_cv_check_[]_AC_LANG_ABBREV[]flags_$4_$1])dnl
 AC_CACHE_CHECK([whether _AC_LANG compiler accepts $1], CACHEVAR, [
   ax_check_save_flags=$[]_AC_LANG_PREFIX[]FLAGS
   _AC_LANG_PREFIX[]FLAGS="$[]_AC_LANG_PREFIX[]FLAGS $4 $1"
   AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],
     [AS_VAR_SET(CACHEVAR,[yes])],
     [AS_VAR_SET(CACHEVAR,[no])])
   _AC_LANG_PREFIX[]FLAGS=$ax_check_save_flags])
 AS_IF([test x"AS_VAR_GET(CACHEVAR)" = xyes],
   [m4_default([$2], :)],
   [m4_default([$3], :)])
 AS_VAR_POPDEF([CACHEVAR])dnl
 ])dnl AX_CHECK_COMPILE_FLAGS

									
										10

scons/custom.py
									
												
				@@ -103,8 +103,14 @@ def python_scan(node, env, path):

				    # http://www.scons.org/doc/0.98.5/HTML/scons-user/c2781.html#AEN2789

				    # https://docs.python.org/2/library/modulefinder.html

				    contents = node.get_contents()

				-    source_dir = node.get_dir()

				-    finder = modulefinder.ModuleFinder()

				+    # Tell ModuleFinder to search dependencies in the script dir, and the glapi

				+    # dirs

				+    source_dir = node.get_dir().abspath

				+    GLAPI = env.Dir('#src/mapi/glapi/gen').abspath

				+    path = [source_dir, GLAPI] + sys.path

				+    finder = modulefinder.ModuleFinder(path=path)

				    finder.run_script(node.abspath)

				    results = []

				    for name, mod in finder.modules.iteritems():
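
Reviewer note on the scons/custom.py hunk above: the change passes an explicit search path to `modulefinder.ModuleFinder`, so that imports living outside the scanned script's own directory are resolved. A minimal standalone sketch of that idea follows. It assumes Python 3 (hence `.items()` rather than the `iteritems()` in the hunk), and the names `find_local_modules`, `helper_mod`, `scripts` and `gen` are hypothetical, chosen only for illustration; this is not the SCons scanner itself.

```python
import modulefinder
import os
import sys
import tempfile


def find_local_modules(script_path, extra_dirs):
    """Return sorted names of imported modules whose source lives in the
    script's directory or in one of extra_dirs (hypothetical helper)."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    # Same shape as the hunk: script dir, extra dirs, then the default path.
    search_path = [script_dir] + list(extra_dirs) + sys.path
    finder = modulefinder.ModuleFinder(path=search_path)
    finder.run_script(script_path)

    roots = [os.path.realpath(d) for d in [script_dir] + list(extra_dirs)]
    found = []
    for name, mod in finder.modules.items():
        filename = getattr(mod, '__file__', None)
        if name == '__main__' or not filename:
            continue  # skip the script itself and built-in modules
        real = os.path.realpath(filename)
        if any(real.startswith(root + os.sep) for root in roots):
            found.append(name)
    return sorted(found)


def demo():
    """Build a throwaway tree where the script imports a module from a
    sibling directory, then scan it."""
    with tempfile.TemporaryDirectory() as top:
        script_dir = os.path.join(top, 'scripts')
        helper_dir = os.path.join(top, 'gen')
        os.makedirs(script_dir)
        os.makedirs(helper_dir)
        with open(os.path.join(helper_dir, 'helper_mod.py'), 'w') as f:
            f.write('VALUE = 1\n')
        script = os.path.join(script_dir, 'main_script.py')
        with open(script, 'w') as f:
            f.write('import helper_mod\n')
        return find_local_modules(script, [helper_dir])


if __name__ == '__main__':
    print(demo())
```

Without `helper_dir` in the search path, `helper_mod` would be recorded in `finder.badmodules` instead of `finder.modules`, which is the kind of missed dependency the hunk appears to address.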

									
										9

scons/gallium.py
									
												
				@@ -256,7 +256,7 @@ def generate(env):

				        if env['build'] == 'profile':

				            env['debug'] = False

				            env['profile'] = True

				-        if env['build'] == 'release':

				+        if env['build'] in ('release', 'opt'):

				            env['debug'] = False

				            env['profile'] = False

				@@ -301,6 +301,8 @@ def generate(env):

				        cppdefines += ['NDEBUG']

				    if env['build'] == 'profile':

				        cppdefines += ['PROFILE']

				+    if env['build'] in ('opt', 'profile'):

				+        cppdefines += ['VMX86_STATS']

				    if env['platform'] in ('posix', 'linux', 'freebsd', 'darwin'):

				        cppdefines += [

				            '_POSIX_SOURCE',

				@@ -450,7 +452,7 @@ def generate(env):

				            ccflags += [

				                '/O2', # optimize for speed

				            ]

				-        if env['build'] == 'release':

				+        if env['build'] in ('release', 'opt'):

				            if not env['clang']:

				                ccflags += [

				                    '/GL', # enable whole program optimization

				@@ -561,7 +563,7 @@ def generate(env):

				            shlinkflags += ['-Wl,--enable-stdcall-fixup']

				            #shlinkflags += ['-Wl,--kill-at']

				    if msvc:

				-        if env['build'] == 'release' and not env['clang']:

				+        if env['build'] in ('release', 'opt') and not env['clang']:

				            # enable Link-time Code Generation

				            linkflags += ['/LTCG']

				            env.Append(ARFLAGS = ['/LTCG'])

				@@ -650,7 +652,6 @@ def generate(env):

				    env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])

				    env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])

				    env.PkgCheckModules('DRM', ['libdrm >= 2.4.38'])

				    env.PkgCheckModules('UDEV', ['libudev >= 151'])

				    if env['x11']:

				        env.Append(CPPPATH = env['X11_CPPPATH'])

									
										2

scripts/get_reviewer.pl
									
												
				@@ -865,7 +865,7 @@ sub top_of_mesa_tree {

					$lk_path .= "/";

				    }

				    if (   (-f "${lk_path}docs/mesa.css")

					&& (-f "${lk_path}docs/GL3.txt")

					&& (-f "${lk_path}docs/features.txt")

					&& (-f "${lk_path}src/mesa/main/version.c")

					&& (-f "${lk_path}REVIEWERS")

					&& (-d "${lk_path}scripts")) {

									
										50

src/Makefile.am
									
												
				@@ -47,9 +47,37 @@ CLEANFILES = $(BUILT_SOURCES)

				SUBDIRS = . gtest util mapi/glapi/gen mapi

				if HAVE_OPENGL

				gldir = $(includedir)/GL

				gl_HEADERS = \

				  $(top_srcdir)/include/GL/gl.h \

				  $(top_srcdir)/include/GL/glext.h \

				  $(top_srcdir)/include/GL/glcorearb.h \

				  $(top_srcdir)/include/GL/gl_mangle.h

				endif

				if HAVE_GLX

				glxdir = $(includedir)/GL

				glx_HEADERS = \

				  $(top_srcdir)/include/GL/glx.h \

				  $(top_srcdir)/include/GL/glxext.h \

				  $(top_srcdir)/include/GL/glx_mangle.h

				pkgconfigdir = $(libdir)/pkgconfig

				pkgconfig_DATA = mesa/gl.pc

				endif

				if HAVE_COMMON_OSMESA

				osmesadir = $(includedir)/GL

				osmesa_HEADERS = $(top_srcdir)/include/GL/osmesa.h

				endif

				# include only conditionally ?

				SUBDIRS += compiler

				if HAVE_AMD_DRIVERS

				SUBDIRS += amd

				endif

				if HAVE_INTEL_DRIVERS

				SUBDIRS += intel

				endif

				@@ -83,17 +111,32 @@ if HAVE_EGL

				SUBDIRS += egl

				endif

				if HAVE_INTEL_DRIVERS

				SUBDIRS += intel/tools

				endif

				if HAVE_VULKAN_COMMON

				SUBDIRS += vulkan/wsi

				endif

				## Requires the i965 compiler (part of mesa) and wayland-drm

				if HAVE_INTEL_VULKAN

				SUBDIRS += intel/vulkan

				endif

				# Requires wayland-drm

				if HAVE_RADEON_VULKAN

				SUBDIRS += amd/common

				SUBDIRS += amd/vulkan

				endif

				if HAVE_GALLIUM

				SUBDIRS += gallium

				endif

				EXTRA_DIST = \

					getopt hgl SConscript

					getopt hgl SConscript \

					$(top_srcdir)/include/GL/mesa_glinterop.h

				AM_CFLAGS = $(VISIBILITY_CFLAGS)

				AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)

				@@ -102,12 +145,15 @@ AM_CPPFLAGS = \

					-I$(top_srcdir)/include/ \

					-I$(top_srcdir)/src/mapi/ \

					-I$(top_srcdir)/src/mesa/ \

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					$(DEFINES)

				noinst_LTLIBRARIES = libglsl_util.la

				libglsl_util_la_SOURCES = \

					mesa/main/extensions_table.c \

					mesa/main/imports.c \

					mesa/program/prog_hash_table.c \

					mesa/program/prog_parameter.c \

					mesa/program/symbol_table.c \

					mesa/program/dummy_errors.c

									
										49

src/SConscript
									
												
				@@ -1,5 +1,8 @@

				Import('*')

				import filecmp

				import os

				import subprocess

				Import('*')

				if env['platform'] == 'windows':

				    SConscript('getopt/SConscript')

				@@ -12,6 +15,50 @@ if env['hostonly']:

				    # compilation

				    Return()

				def write_git_sha1_h_file(filename):

				    """Mesa looks for a git_sha1.h file at compile time in order to display

				    the current git hash id in the GL_VERSION string.  This function tries

				    to retrieve the git hashid and write the header file.  An empty file

				    will be created if anything goes wrong."""

				    args = [ 'git', 'rev-parse', '--short=10', 'HEAD' ]

				    try:

				        (commit, foo) = subprocess.Popen(args, stdout=subprocess.PIPE).communicate()

				    except:

				        print "Warning: exception in write_git_sha1_h_file()"

				        # git log command didn't work

				        if not os.path.exists(filename):

				            dirname = os.path.dirname(filename)

				            if dirname and not os.path.exists(dirname):

				                os.makedirs(dirname)

				            # create an empty file if none already exists

				            f = open(filename, "w")

				            f.close()

				        return

				    # note that commit[:-1] removes the trailing newline character

				    commit = '#define MESA_GIT_SHA1 "git-%s"\n' % commit[:-1]

				    tempfile = "git_sha1.h.tmp"

				    f = open(tempfile, "w")

				    f.write(commit)

				    f.close()

				    if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename):

				        # The filename does not exist or it's different from the new file,

				        # so replace old file with new.

				        if os.path.exists(filename):

				            os.remove(filename)

				        os.rename(tempfile, filename)

				    return

				# Create the git_sha1.h header file

				write_git_sha1_h_file("git_sha1.h")

				# and update CPPPATH so the git_sha1.h header can be found

				env.Append(CPPPATH = ["#" + env['build_dir']])

				if env['platform'] != 'windows':

				    SConscript('loader/SConscript')

									
										44

src/amd/Android.addrlib.mk
									
										Normal file
									
												
				@@ -0,0 +1,44 @@

				# Copyright © 2016 Red Hat.

				# Copyright © 2016 Mauro Rossi <issor.oruam@gmail.com>

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				# ---------------------------------------

				# Build libmesa_amdgpu_addrlib

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_MODULE := libmesa_amdgpu_addrlib

				LOCAL_SRC_FILES := $(ADDRLIB_FILES)

				LOCAL_CFLAGS := -DBRAHMA_BUILD=1

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src \

					$(MESA_TOP)/src/amd/common \

					$(MESA_TOP)/src/amd/addrlib \

					$(MESA_TOP)/src/amd/addrlib/core \

					$(MESA_TOP)/src/amd/addrlib/inc/chip/r800 \

					$(MESA_TOP)/src/amd/addrlib/r800/chip

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

									
										28

src/amd/Android.mk
									
										Normal file
									
												
				@@ -0,0 +1,28 @@

				# Copyright © 2016 Red Hat.

				# Copyright © 2016 Mauro Rossi <issor.oruam@gmail.com>

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				LOCAL_PATH := $(call my-dir)

				# Import variables

				include $(LOCAL_PATH)/Makefile.sources

				include $(LOCAL_PATH)/Android.addrlib.mk

									
										38

src/amd/Makefile.addrlib.am
									
										Normal file
									
												
				@@ -0,0 +1,38 @@

				# Copyright 2016 Red Hat Inc.

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				ADDRLIB_LIBS = addrlib/libamdgpu_addrlib.la

				addrlib_libamdgpu_addrlib_la_CPPFLAGS = \

					-I$(top_srcdir)/src/ \

					-I$(srcdir)/common \

					-I$(srcdir)/addrlib \

					-I$(srcdir)/addrlib/core \

					-I$(srcdir)/addrlib/inc/chip/r800 \

					-I$(srcdir)/addrlib/r800/chip \

					-DBRAHMA_BUILD=1

				addrlib_libamdgpu_addrlib_la_CXXFLAGS = \

					$(VISIBILITY_CXXFLAGS)

				noinst_LTLIBRARIES += $(ADDRLIB_LIBS)

				addrlib_libamdgpu_addrlib_la_SOURCES = $(ADDRLIB_FILES)

									
										27

src/amd/Makefile.am
									
										Normal file
									
												
				@@ -0,0 +1,27 @@

				# Copyright © 2016 Red Hat.

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				include Makefile.sources

				noinst_LTLIBRARIES =

				EXTRA_DIST = $(COMMON_HEADER_FILES)

				include Makefile.addrlib.am

									
										27

src/amd/Makefile.sources
									
										Normal file
									
												
				@@ -0,0 +1,27 @@

				COMMON_HEADER_FILES = \

					common/sid.h \

					common/r600d_common.h \

					common/amd_family.h \

					common/amd_kernel_code_t.h \

					common/amdgpu_id.h

				ADDRLIB_FILES = \

					addrlib/addrinterface.cpp \

					addrlib/addrinterface.h \

					addrlib/addrtypes.h \

					addrlib/core/addrcommon.h \

					addrlib/core/addrelemlib.cpp \

					addrlib/core/addrelemlib.h \

					addrlib/core/addrlib.cpp \

					addrlib/core/addrlib.h \

					addrlib/core/addrobject.cpp \

					addrlib/core/addrobject.h \

					addrlib/inc/chip/r800/si_gb_reg.h \

					addrlib/inc/lnx_common_defs.h \

					addrlib/r800/chip/si_ci_vi_merged_enum.h \

					addrlib/r800/ciaddrlib.cpp \

					addrlib/r800/ciaddrlib.h \

					addrlib/r800/egbaddrlib.cpp \

					addrlib/r800/egbaddrlib.h \

					addrlib/r800/siaddrlib.cpp \

					addrlib/r800/siaddrlib.h

0

src/gallium/winsys/amdgpu/drm/addrlib/addrinterface.cpp → src/amd/addrlib/addrinterface.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/addrinterface.h → src/amd/addrlib/addrinterface.h

0

src/gallium/winsys/amdgpu/drm/addrlib/addrtypes.h → src/amd/addrlib/addrtypes.h

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrcommon.h → src/amd/addrlib/core/addrcommon.h

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrelemlib.cpp → src/amd/addrlib/core/addrelemlib.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrelemlib.h → src/amd/addrlib/core/addrelemlib.h

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrlib.cpp → src/amd/addrlib/core/addrlib.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrlib.h → src/amd/addrlib/core/addrlib.h

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrobject.cpp → src/amd/addrlib/core/addrobject.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/core/addrobject.h → src/amd/addrlib/core/addrobject.h

0

src/gallium/winsys/amdgpu/drm/addrlib/inc/chip/r800/si_gb_reg.h → src/amd/addrlib/inc/chip/r800/si_gb_reg.h

0

src/gallium/winsys/amdgpu/drm/addrlib/inc/lnx_common_defs.h → src/amd/addrlib/inc/lnx_common_defs.h

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/chip/si_ci_vi_merged_enum.h → src/amd/addrlib/r800/chip/si_ci_vi_merged_enum.h

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp → src/amd/addrlib/r800/ciaddrlib.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h → src/amd/addrlib/r800/ciaddrlib.h

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp → src/amd/addrlib/r800/egbaddrlib.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.h → src/amd/addrlib/r800/egbaddrlib.h

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/siaddrlib.cpp → src/amd/addrlib/r800/siaddrlib.cpp

0

src/gallium/winsys/amdgpu/drm/addrlib/r800/siaddrlib.h → src/amd/addrlib/r800/siaddrlib.h

									
										51

src/amd/common/Makefile.am
									
										Normal file
									
												
				@@ -0,0 +1,51 @@

				# Copyright © 2016 Bas Nieuwenhuizen

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				include Makefile.sources

				# TODO cleanup these

				AM_CPPFLAGS = \

					$(VALGRIND_CFLAGS) \

					$(DEFINES) \

					-I$(top_srcdir)/include \

					-I$(top_builddir)/src \

					-I$(top_srcdir)/src \

					-I$(top_builddir)/src/compiler \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler \

					-I$(top_srcdir)/src/mapi \

					-I$(top_srcdir)/src/mesa \

					-I$(top_srcdir)/src/mesa/drivers/dri/common \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/gallium/include

				AM_CFLAGS = $(VISIBILITY_CFLAGS) \

					$(PTHREAD_CFLAGS) \

					$(LLVM_CFLAGS) \

					$(LIBELF_CFLAGS)

				AM_CXXFLAGS = \

					$(VISIBILITY_CXXFLAGS) \

					$(LLVM_CXXFLAGS)

				noinst_LTLIBRARIES = libamd_common.la

				libamd_common_la_SOURCES = $(AMD_COMPILER_SOURCES)

									
										29

src/amd/common/Makefile.sources
									
										Normal file
									
												
				@@ -0,0 +1,29 @@

				# Copyright © 2016 Bas Nieuwenhuizen

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				AMD_COMPILER_SOURCES := \

					ac_binary.c \

					ac_binary.h \

					ac_llvm_helper.cpp \

					ac_llvm_util.c \

					ac_llvm_util.h \

					ac_nir_to_llvm.c \

					ac_nir_to_llvm.h

									
										288

src/amd/common/ac_binary.c
									
										Normal file
									
												
				@@ -0,0 +1,288 @@

				/*

				 * Copyright 2014 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				 * SOFTWARE.

				 *

				 * Authors: Tom Stellard <thomas.stellard@amd.com>

				 *

				 * Based on radeon_elf_util.c.

				 */

				#include "ac_binary.h"

				#include "util/u_math.h"

				#include "util/u_memory.h"

				#include <gelf.h>

				#include <libelf.h>

				#include <stdio.h>

				#include <sid.h>

				#define SPILLED_SGPRS                                     0x4

				#define SPILLED_VGPRS                                     0x8

				static void parse_symbol_table(Elf_Data *symbol_table_data,

								const GElf_Shdr *symbol_table_header,

								struct ac_shader_binary *binary)

				{

					GElf_Sym symbol;

					unsigned i = 0;

					unsigned symbol_count =

						symbol_table_header->sh_size / symbol_table_header->sh_entsize;

					/* We are over allocating this list, because symbol_count gives the

					 * total number of symbols, and we will only be filling the list

					 * with offsets of global symbols.  The memory savings from

					 * allocating the correct size of this list will be small, and

					 * I don't think it is worth the cost of pre-computing the number

					 * of global symbols.

					 */

					binary->global_symbol_offsets = CALLOC(symbol_count, sizeof(uint64_t));

					while (gelf_getsym(symbol_table_data, i++, &symbol)) {

						unsigned i;

						if (GELF_ST_BIND(symbol.st_info) != STB_GLOBAL ||

						    symbol.st_shndx == 0 /* Undefined symbol */) {

							continue;

						}

						binary->global_symbol_offsets[binary->global_symbol_count] =

									symbol.st_value;

						/* Sort the list using bubble sort.  This list will usually

						 * be small. */

						for (i = binary->global_symbol_count; i > 0; --i) {

							uint64_t lhs = binary->global_symbol_offsets[i - 1];

							uint64_t rhs = binary->global_symbol_offsets[i];

							if (lhs < rhs) {

								break;

							}

							binary->global_symbol_offsets[i] = lhs;

							binary->global_symbol_offsets[i - 1] = rhs;

						}

						++binary->global_symbol_count;

					}

				}

				static void parse_relocs(Elf *elf, Elf_Data *relocs, Elf_Data *symbols,

							unsigned symbol_sh_link,

							struct ac_shader_binary *binary)

				{

					unsigned i;

					if (!relocs || !symbols || !binary->reloc_count) {

						return;

					}

					binary->relocs = CALLOC(binary->reloc_count,

							sizeof(struct ac_shader_reloc));

					for (i = 0; i < binary->reloc_count; i++) {

						GElf_Sym symbol;

						GElf_Rel rel;

						char *symbol_name;

						struct ac_shader_reloc *reloc = &binary->relocs[i];

						gelf_getrel(relocs, i, &rel);

						gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &symbol);

						symbol_name = elf_strptr(elf, symbol_sh_link, symbol.st_name);

						reloc->offset = rel.r_offset;

						strncpy(reloc->name, symbol_name, sizeof(reloc->name)-1);

						reloc->name[sizeof(reloc->name)-1] = 0;

					}

				}

				void ac_elf_read(const char *elf_data, unsigned elf_size,

						 struct ac_shader_binary *binary)

				{

					char *elf_buffer;

					Elf *elf;

					Elf_Scn *section = NULL;

					Elf_Data *symbols = NULL, *relocs = NULL;

					size_t section_str_index;

					unsigned symbol_sh_link = 0;

					/* One of the libelf implementations

					 * (http://www.mr511.de/software/english.htm) requires calling

					 * elf_version() before elf_memory().

					 */

					elf_version(EV_CURRENT);

					elf_buffer = MALLOC(elf_size);

					memcpy(elf_buffer, elf_data, elf_size);

					elf = elf_memory(elf_buffer, elf_size);

					elf_getshdrstrndx(elf, &section_str_index);

					while ((section = elf_nextscn(elf, section))) {

						const char *name;

						Elf_Data *section_data = NULL;

						GElf_Shdr section_header;

						if (gelf_getshdr(section, &section_header) != &section_header) {

							fprintf(stderr, "Failed to read ELF section header\n");

							return;

						}

						name = elf_strptr(elf, section_str_index, section_header.sh_name);

						if (!strcmp(name, ".text")) {

							section_data = elf_getdata(section, section_data);

							binary->code_size = section_data->d_size;

							binary->code = MALLOC(binary->code_size * sizeof(unsigned char));

							memcpy(binary->code, section_data->d_buf, binary->code_size);

						} else if (!strcmp(name, ".AMDGPU.config")) {

							section_data = elf_getdata(section, section_data);

							binary->config_size = section_data->d_size;

							binary->config = MALLOC(binary->config_size * sizeof(unsigned char));

							memcpy(binary->config, section_data->d_buf, binary->config_size);

						} else if (!strcmp(name, ".AMDGPU.disasm")) {

							/* Always read disassembly if it's available. */

							section_data = elf_getdata(section, section_data);

							binary->disasm_string = strndup(section_data->d_buf,

											section_data->d_size);

						} else if (!strncmp(name, ".rodata", 7)) {

							section_data = elf_getdata(section, section_data);

							binary->rodata_size = section_data->d_size;

							binary->rodata = MALLOC(binary->rodata_size * sizeof(unsigned char));

							memcpy(binary->rodata, section_data->d_buf, binary->rodata_size);

						} else if (!strncmp(name, ".symtab", 7)) {

							symbols = elf_getdata(section, section_data);

							symbol_sh_link = section_header.sh_link;

							parse_symbol_table(symbols, &section_header, binary);

						} else if (!strcmp(name, ".rel.text")) {

							relocs = elf_getdata(section, section_data);

							binary->reloc_count = section_header.sh_size /

									section_header.sh_entsize;

						}

					}

					parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);

					if (elf){

						elf_end(elf);

					}

					FREE(elf_buffer);

					/* Cache the config size per symbol */

					if (binary->global_symbol_count) {

						binary->config_size_per_symbol =

							binary->config_size / binary->global_symbol_count;

					} else {

						binary->global_symbol_count = 1;

						binary->config_size_per_symbol = binary->config_size;

					}

				}

				static

				const unsigned char *ac_shader_binary_config_start(

					const struct ac_shader_binary *binary,

					uint64_t symbol_offset)

				{

					unsigned i;

					for (i = 0; i < binary->global_symbol_count; ++i) {

						if (binary->global_symbol_offsets[i] == symbol_offset) {

							unsigned offset = i * binary->config_size_per_symbol;

							return binary->config + offset;

						}

					}

					return binary->config;

				}

				static const char *scratch_rsrc_dword0_symbol =

					"SCRATCH_RSRC_DWORD0";

				static const char *scratch_rsrc_dword1_symbol =

					"SCRATCH_RSRC_DWORD1";

				void ac_shader_binary_read_config(struct ac_shader_binary *binary,

								  struct ac_shader_config *conf,

								  unsigned symbol_offset)

				{

					unsigned i;

					const unsigned char *config =

						ac_shader_binary_config_start(binary, symbol_offset);

					bool really_needs_scratch = false;

					/* LLVM adds SGPR spills to the scratch size.

					 * Find out if we really need the scratch buffer.

					 */

					for (i = 0; i < binary->reloc_count; i++) {

						const struct ac_shader_reloc *reloc = &binary->relocs[i];

						if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) ||

						    !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) {

							really_needs_scratch = true;

							break;

						}

					}

					for (i = 0; i < binary->config_size_per_symbol; i+= 8) {

						unsigned reg = util_le32_to_cpu(*(uint32_t*)(config + i));

						unsigned value = util_le32_to_cpu(*(uint32_t*)(config + i + 4));

						switch (reg) {

						case R_00B028_SPI_SHADER_PGM_RSRC1_PS:

						case R_00B128_SPI_SHADER_PGM_RSRC1_VS:

						case R_00B228_SPI_SHADER_PGM_RSRC1_GS:

						case R_00B848_COMPUTE_PGM_RSRC1:

							conf->num_sgprs = MAX2(conf->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);

							conf->num_vgprs = MAX2(conf->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);

							conf->float_mode =  G_00B028_FLOAT_MODE(value);

							break;

						case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:

							conf->lds_size = MAX2(conf->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));

							break;

						case R_00B84C_COMPUTE_PGM_RSRC2:

							conf->lds_size = MAX2(conf->lds_size, G_00B84C_LDS_SIZE(value));

							break;

						case R_0286CC_SPI_PS_INPUT_ENA:

							conf->spi_ps_input_ena = value;

							break;

						case R_0286D0_SPI_PS_INPUT_ADDR:

							conf->spi_ps_input_addr = value;

							break;

						case R_0286E8_SPI_TMPRING_SIZE:

						case R_00B860_COMPUTE_TMPRING_SIZE:

							/* WAVESIZE is in units of 256 dwords. */

							if (really_needs_scratch)

								conf->scratch_bytes_per_wave =

									G_00B860_WAVESIZE(value) * 256 * 4;

							break;

						case SPILLED_SGPRS:

							conf->spilled_sgprs = value;

							break;

						case SPILLED_VGPRS:

							conf->spilled_vgprs = value;

							break;

						default:

							{

								static bool printed;

								if (!printed) {

									fprintf(stderr, "Warning: LLVM emitted unknown "

										"config register: 0x%x\n", reg);

									printed = true;

								}

							}

							break;

						}

						if (!conf->spi_ps_input_addr)

							conf->spi_ps_input_addr = conf->spi_ps_input_ena;

					}

				}
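
Review note on `parse_symbol_table()` in the file above: the inner loop appends each global symbol offset and then swaps it backwards until the list is in ascending order. The comment calls this a bubble sort, but each step behaves like one pass of insertion sort. The following is a minimal standalone sketch of that step, not code from the commit; `insert_sorted` is a hypothetical helper name, and the caller is assumed to have allocated room for one more element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of ac_binary.c.
 * Inserts value into offsets[0..count), which must already be sorted in
 * ascending order and must have room for one more element.
 * Returns the new element count. */
static size_t insert_sorted(uint64_t *offsets, size_t count, uint64_t value)
{
	size_t i = count;

	assert(offsets != NULL);

	/* Append at the end, then swap backwards while the left neighbour
	 * is larger than the new element. */
	offsets[i] = value;
	while (i > 0 && offsets[i - 1] > offsets[i]) {
		uint64_t tmp = offsets[i - 1];
		offsets[i - 1] = offsets[i];
		offsets[i] = tmp;
		i--;
	}
	return count + 1;
}
```

Keeping the list sorted at insertion time means a later lookup can map a symbol offset to its index, which appears to be how `ac_shader_binary_config_start()` picks the config slice for a symbol.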

									
										88

src/amd/common/ac_binary.h
									
										Normal file
									
												
				@@ -0,0 +1,88 @@

				/*

				 * Copyright 2014 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				 * SOFTWARE.

				 *

				 * Authors: Tom Stellard <thomas.stellard@amd.com>

				 *

				 */

				#pragma once

				#include <stdint.h>

				struct ac_shader_reloc {

					char name[32];

					uint64_t offset;

				};

				struct ac_shader_binary {

					/** Shader code */

					unsigned char *code;

					unsigned code_size;

					/** Config/Context register state that accompanies this shader.

					 * This is a stream of dword pairs.  First dword contains the

					 * register address, the second dword contains the value.*/

					unsigned char *config;

					unsigned config_size;

					/** The number of bytes of config information for each global symbol.

					 */

					unsigned config_size_per_symbol;

					/** Constant data accessed by the shader.  This will be uploaded

					 * into a constant buffer. */

					unsigned char *rodata;

					unsigned rodata_size;

					/** List of symbol offsets for the shader */

					uint64_t *global_symbol_offsets;

					unsigned global_symbol_count;

					struct ac_shader_reloc *relocs;

					unsigned reloc_count;

					/** Disassembled shader in a string. */

					char *disasm_string;

				};

				struct ac_shader_config {

					unsigned num_sgprs;

					unsigned num_vgprs;

					unsigned spilled_sgprs;

					unsigned spilled_vgprs;

					unsigned lds_size;

					unsigned spi_ps_input_ena;

					unsigned spi_ps_input_addr;

					unsigned float_mode;

					unsigned scratch_bytes_per_wave;

				};

				/*

				 * Parse the elf binary stored in \p elf_data and create a

				 * ac_shader_binary object.

				 */

				void ac_elf_read(const char *elf_data, unsigned elf_size,

						 struct ac_shader_binary *binary);

				void ac_shader_binary_read_config(struct ac_shader_binary *binary,

								  struct ac_shader_config *conf,

								  unsigned symbol_offset);
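The `config` field above is described as a stream of dword pairs (register address first, value second). A minimal sketch of walking such a stream follows. `config_lookup` is a hypothetical helper, not part of ac_binary.h, and it assumes the stream is stored in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of ac_binary.h: look up the value paired
 * with register address `reg` in a config stream of (address, value)
 * dword pairs. Returns 1 and writes *value on a hit, 0 otherwise. */
int config_lookup(const unsigned char *config, unsigned config_size,
                  uint32_t reg, uint32_t *value)
{
	/* Each entry is two dwords, i.e. 8 bytes. */
	assert(config_size % 8 == 0);

	for (unsigned i = 0; i < config_size; i += 8) {
		uint32_t addr, val;

		/* memcpy avoids unaligned loads; host byte order is
		 * assumed to match the stream (an assumption). */
		memcpy(&addr, config + i, sizeof(addr));
		memcpy(&val, config + i + 4, sizeof(val));
		if (addr == reg) {
			*value = val;
			return 1;
		}
	}
	return 0;
}
```

A caller would pass `binary->config` and `binary->config_size` together with the register address it is interested in.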

									
										46

src/amd/common/ac_llvm_helper.cpp
									
										Normal file
									
												
				@@ -0,0 +1,46 @@

				/*

				 * Copyright 2014 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the

				 * "Software"), to deal in the Software without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sub license, and/or sell copies of the Software, and to

				 * permit persons to whom the Software is furnished to do so, subject to

				 * the following conditions:

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL

				 * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,

				 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR

				 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE

				 * USE OR OTHER DEALINGS IN THE SOFTWARE.

				 *

				 * The above copyright notice and this permission notice (including the

				 * next paragraph) shall be included in all copies or substantial portions

				 * of the Software.

				 *

				 */

				/* based on Marek's patch to lp_bld_misc.cpp */

				// Workaround http://llvm.org/PR23628

				#if HAVE_LLVM >= 0x0307

				#  pragma push_macro("DEBUG")

				#  undef DEBUG

				#endif

				#include "ac_nir_to_llvm.h"

				#include <llvm-c/Core.h>

				#include <llvm/Target/TargetOptions.h>

				#include <llvm/ExecutionEngine/ExecutionEngine.h>

				extern "C" void

				ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)

				{

				   llvm::Argument *A = llvm::unwrap<llvm::Argument>(val);

				   llvm::AttrBuilder B;

				   B.addDereferenceableAttr(bytes);

				   A->addAttr(llvm::AttributeSet::get(A->getContext(), A->getArgNo() + 1,  B));

				}

									
										142

src/amd/common/ac_llvm_util.c
									
										Normal file
									
												
				@@ -0,0 +1,142 @@

				/*

				 * Copyright 2014 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the

				 * "Software"), to deal in the Software without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sub license, and/or sell copies of the Software, and to

				 * permit persons to whom the Software is furnished to do so, subject to

				 * the following conditions:

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL

				 * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,

				 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR

				 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE

				 * USE OR OTHER DEALINGS IN THE SOFTWARE.

				 *

				 * The above copyright notice and this permission notice (including the

				 * next paragraph) shall be included in all copies or substantial portions

				 * of the Software.

				 *

				 */

				/* based on pieces from si_pipe.c and radeon_llvm_emit.c */

				#include "ac_llvm_util.h"

				#include <llvm-c/Core.h>

				#include "c11/threads.h"

				#include <assert.h>

				#include <stdio.h>

				static void ac_init_llvm_target()

				{

				#if HAVE_LLVM < 0x0307

					LLVMInitializeR600TargetInfo();

					LLVMInitializeR600Target();

					LLVMInitializeR600TargetMC();

					LLVMInitializeR600AsmPrinter();

				#else

					LLVMInitializeAMDGPUTargetInfo();

					LLVMInitializeAMDGPUTarget();

					LLVMInitializeAMDGPUTargetMC();

					LLVMInitializeAMDGPUAsmPrinter();

				#endif

				}

				static once_flag ac_init_llvm_target_once_flag = ONCE_FLAG_INIT;

				static LLVMTargetRef ac_get_llvm_target(const char *triple)

				{

					LLVMTargetRef target = NULL;

					char *err_message = NULL;

					call_once(&ac_init_llvm_target_once_flag, ac_init_llvm_target);

					if (LLVMGetTargetFromTriple(triple, &target, &err_message)) {

						fprintf(stderr, "Cannot find target for triple %s ", triple);

						if (err_message) {

							fprintf(stderr, "%s\n", err_message);

						}

						LLVMDisposeMessage(err_message);

						return NULL;

					}

					return target;

				}

				static const char *ac_get_llvm_processor_name(enum radeon_family family)

				{

					switch (family) {

					case CHIP_TAHITI:

						return "tahiti";

					case CHIP_PITCAIRN:

						return "pitcairn";

					case CHIP_VERDE:

						return "verde";

					case CHIP_OLAND:

						return "oland";

					case CHIP_HAINAN:

						return "hainan";

					case CHIP_BONAIRE:

						return "bonaire";

					case CHIP_KABINI:

						return "kabini";

					case CHIP_KAVERI:

						return "kaveri";

					case CHIP_HAWAII:

						return "hawaii";

					case CHIP_MULLINS:

						return "mullins";

					case CHIP_TONGA:

						return "tonga";

					case CHIP_ICELAND:

						return "iceland";

					case CHIP_CARRIZO:

						return "carrizo";

				#if HAVE_LLVM <= 0x0307

					case CHIP_FIJI:

						return "tonga";

					case CHIP_STONEY:

						return "carrizo";

				#else

					case CHIP_FIJI:

						return "fiji";

					case CHIP_STONEY:

						return "stoney";

				#endif

				#if HAVE_LLVM <= 0x0308

					case CHIP_POLARIS10:

						return "tonga";

					case CHIP_POLARIS11:

						return "tonga";

				#else

					case CHIP_POLARIS10:

						return "polaris10";

					case CHIP_POLARIS11:

						return "polaris11";

				#endif

					default:

						return "";

					}

				}

				LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family)

				{

					assert(family >= CHIP_TAHITI);

					const char *triple = "amdgcn--";

					LLVMTargetRef target = ac_get_llvm_target(triple);

					LLVMTargetMachineRef tm = LLVMCreateTargetMachine(

					                             target,

					                             triple,

					                             ac_get_llvm_processor_name(family),

					                             "+DumpCode,+vgpr-spilling",

					                             LLVMCodeGenLevelDefault,

					                             LLVMRelocDefault,

					                             LLVMCodeModelDefault);

					return tm;

				}
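The `#if HAVE_LLVM` blocks in `ac_get_llvm_processor_name` fall back to an older processor name when the LLVM in use predates a chip. The sketch below restates that rule as a runtime check so the mapping is easier to see. The `ex_*` names and the trimmed enum are hypothetical; the real code selects the name at compile time.

```c
/* Hypothetical, trimmed copy of the families discussed; the real enum
 * lives in amd_family.h. */
enum ex_family { EX_TONGA, EX_FIJI, EX_STONEY, EX_POLARIS10 };

/* llvm_version uses the same encoding as HAVE_LLVM: 0x0307 is LLVM 3.7,
 * 0x0308 is LLVM 3.8. */
const char *ex_processor_name(enum ex_family family, unsigned llvm_version)
{
	switch (family) {
	case EX_TONGA:
		return "tonga";
	case EX_FIJI:
		/* LLVM 3.7 and older have no "fiji" processor. */
		return llvm_version <= 0x0307 ? "tonga" : "fiji";
	case EX_STONEY:
		return llvm_version <= 0x0307 ? "carrizo" : "stoney";
	case EX_POLARIS10:
		/* LLVM 3.8 and older have no Polaris processors. */
		return llvm_version <= 0x0308 ? "tonga" : "polaris10";
	}
	return "";
}
```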

									
										31

src/amd/common/ac_llvm_util.h
									
										Normal file
									
												
				@@ -0,0 +1,31 @@

				/*

				 * Copyright 2016 Bas Nieuwenhuizen

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the

				 * "Software"), to deal in the Software without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sub license, and/or sell copies of the Software, and to

				 * permit persons to whom the Software is furnished to do so, subject to

				 * the following conditions:

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL

				 * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,

				 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR

				 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE

				 * USE OR OTHER DEALINGS IN THE SOFTWARE.

				 *

				 * The above copyright notice and this permission notice (including the

				 * next paragraph) shall be included in all copies or substantial portions

				 * of the Software.

				 *

				 */

				#pragma once

				#include <llvm-c/TargetMachine.h>

				#include "amd_family.h"

				LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family);

4650

src/amd/common/ac_nir_to_llvm.c Normal file


File diff suppressed because it is too large

									
										119

src/amd/common/ac_nir_to_llvm.h
									
										Normal file
									
												
				@@ -0,0 +1,119 @@

				/*

				 * Copyright © 2016 Bas Nieuwenhuizen

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				 * IN THE SOFTWARE.

				 */

				#pragma once

				#include <stdbool.h>

				#include "llvm-c/Core.h"

				#include "llvm-c/TargetMachine.h"

				#include "amd_family.h"

				struct ac_shader_binary;

				struct ac_shader_config;

				struct nir_shader;

				struct radv_pipeline_layout;

				struct ac_vs_variant_key {

					uint32_t instance_rate_inputs;

				};

				struct ac_fs_variant_key {

					uint32_t col_format;

					uint32_t is_int8;

				};

				union ac_shader_variant_key {

					struct ac_vs_variant_key vs;

					struct ac_fs_variant_key fs;

				};

				struct ac_nir_compiler_options {

					struct radv_pipeline_layout *layout;

					union ac_shader_variant_key key;

					bool unsafe_math;

					enum radeon_family family;

					enum chip_class chip_class;

				};

				struct ac_shader_variant_info {

					unsigned num_user_sgprs;

					unsigned num_input_sgprs;

					unsigned num_input_vgprs;

					union {

						struct {

							unsigned param_exports;

							unsigned pos_exports;

							unsigned vgpr_comp_cnt;

							uint32_t export_mask;

							bool writes_pointsize;

							uint8_t clip_dist_mask;

							uint8_t cull_dist_mask;

						} vs;

						struct {

							unsigned num_interp;

							uint32_t input_mask;

							unsigned output_mask;

							uint32_t flat_shaded_mask;

							bool has_pcoord;

							bool can_discard;

							bool writes_z;

							bool writes_stencil;

							bool early_fragment_test;

							bool writes_memory;

						} fs;

						struct {

							unsigned block_size[3];

						} cs;

					};

				};

				void ac_compile_nir_shader(LLVMTargetMachineRef tm,

				                           struct ac_shader_binary *binary,

				                           struct ac_shader_config *config,

				                           struct ac_shader_variant_info *shader_info,

				                           struct nir_shader *nir,

				                           const struct ac_nir_compiler_options *options,

							   bool dump_shader);

				/* SHADER ABI defines */

				/* offset in dwords */

				#define AC_USERDATA_DESCRIPTOR_SET_0 0

				#define AC_USERDATA_DESCRIPTOR_SET_1 2

				#define AC_USERDATA_DESCRIPTOR_SET_2 4

				#define AC_USERDATA_DESCRIPTOR_SET_3 6

				#define AC_USERDATA_PUSH_CONST_DYN 8

				#define AC_USERDATA_VS_VERTEX_BUFFERS 10

				#define AC_USERDATA_VS_BASE_VERTEX 12

				#define AC_USERDATA_VS_START_INSTANCE 13

				#define AC_USERDATA_PS_SAMPLE_POS 10

				#define AC_USERDATA_CS_GRID_SIZE 10

				#ifdef __cplusplus

				extern "C"

				#endif

				void ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes);
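The `AC_USERDATA_*` values above are offsets in dwords, so a driver that programs user-data registers has to scale them by 4 to get byte offsets. A small sketch of that conversion follows. `ex_userdata_reg` and the idea of a per-stage base register address are assumptions made for illustration, not part of this header.

```c
#include <stdint.h>

/* Values copied from the AC_USERDATA_* defines (offsets in dwords). */
#define EX_USERDATA_DESCRIPTOR_SET_0 0
#define EX_USERDATA_PUSH_CONST_DYN   8
#define EX_USERDATA_VS_BASE_VERTEX   12

/* Hypothetical helper: byte address of a user-data slot, given the byte
 * address of the first user-data register of the shader stage. */
uint32_t ex_userdata_reg(uint32_t base_reg, unsigned dword_offset)
{
	/* One dword is 4 bytes. */
	return base_reg + dword_offset * 4u;
}
```

Note that the four descriptor-set slots are spaced two dwords apart, which is consistent with each one holding a 64-bit pointer; that reading is an inference from the offsets, not something the header states.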

									
										111

src/amd/common/amd_family.h
									
										Normal file
									
												
				@@ -0,0 +1,111 @@

				/*

				 * Copyright 2008 Corbin Simpson <MostAwesomeDude@gmail.com>

				 * Copyright 2010 Marek Olšák <maraeo@gmail.com>

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * on the rights to use, copy, modify, merge, publish, distribute, sub

				 * license, and/or sell copies of the Software, and to permit persons to whom

				 * the Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL

				 * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,

				 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR

				 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE

				 * USE OR OTHER DEALINGS IN THE SOFTWARE. */

				#ifndef AMD_FAMILY_H

				#define AMD_FAMILY_H

				enum radeon_family {

				    CHIP_UNKNOWN = 0,

				    CHIP_R300, /* R3xx-based cores. */

				    CHIP_R350,

				    CHIP_RV350,

				    CHIP_RV370,

				    CHIP_RV380,

				    CHIP_RS400,

				    CHIP_RC410,

				    CHIP_RS480,

				    CHIP_R420,     /* R4xx-based cores. */

				    CHIP_R423,

				    CHIP_R430,

				    CHIP_R480,

				    CHIP_R481,

				    CHIP_RV410,

				    CHIP_RS600,

				    CHIP_RS690,

				    CHIP_RS740,

				    CHIP_RV515,    /* R5xx-based cores. */

				    CHIP_R520,

				    CHIP_RV530,

				    CHIP_R580,

				    CHIP_RV560,

				    CHIP_RV570,

				    CHIP_R600,

				    CHIP_RV610,

				    CHIP_RV630,

				    CHIP_RV670,

				    CHIP_RV620,

				    CHIP_RV635,

				    CHIP_RS780,

				    CHIP_RS880,

				    CHIP_RV770,

				    CHIP_RV730,

				    CHIP_RV710,

				    CHIP_RV740,

				    CHIP_CEDAR,

				    CHIP_REDWOOD,

				    CHIP_JUNIPER,

				    CHIP_CYPRESS,

				    CHIP_HEMLOCK,

				    CHIP_PALM,

				    CHIP_SUMO,

				    CHIP_SUMO2,

				    CHIP_BARTS,

				    CHIP_TURKS,

				    CHIP_CAICOS,

				    CHIP_CAYMAN,

				    CHIP_ARUBA,

				    CHIP_TAHITI,

				    CHIP_PITCAIRN,

				    CHIP_VERDE,

				    CHIP_OLAND,

				    CHIP_HAINAN,

				    CHIP_BONAIRE,

				    CHIP_KAVERI,

				    CHIP_KABINI,

				    CHIP_HAWAII,

				    CHIP_MULLINS,

				    CHIP_TONGA,

				    CHIP_ICELAND,

				    CHIP_CARRIZO,

				    CHIP_FIJI,

				    CHIP_STONEY,

				    CHIP_POLARIS10,

				    CHIP_POLARIS11,

				    CHIP_LAST,

				};

				enum chip_class {

				    CLASS_UNKNOWN = 0,

				    R300,

				    R400,

				    R500,

				    R600,

				    R700,

				    EVERGREEN,

				    CAYMAN,

				    SI,

				    CIK,

				    VI,

				};

				#endif
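Because `enum radeon_family` lists chips in generation order, code such as `assert(family >= CHIP_TAHITI)` in ac_llvm_util.c can test for a generation with a plain comparison. The sketch below uses the same idea to derive a class from a family. The `demo_*` enums are a hypothetical, trimmed mirror of the real ones, keeping only the first and last chip of each generation.

```c
/* Hypothetical, trimmed mirror of enum radeon_family: only the first and
 * last member of each generation is kept, in the same relative order. */
enum demo_family {
	DEMO_CHIP_ARUBA,                          /* last pre-SI chip */
	DEMO_CHIP_TAHITI, DEMO_CHIP_HAINAN,       /* SI */
	DEMO_CHIP_BONAIRE, DEMO_CHIP_MULLINS,     /* CIK */
	DEMO_CHIP_TONGA, DEMO_CHIP_POLARIS11,     /* VI */
};

enum demo_class { DEMO_CLASS_UNKNOWN, DEMO_SI, DEMO_CIK, DEMO_VI };

/* Range checks work only because the enum is ordered by generation. */
enum demo_class demo_class_for_family(enum demo_family f)
{
	if (f >= DEMO_CHIP_TONGA)
		return DEMO_VI;
	if (f >= DEMO_CHIP_BONAIRE)
		return DEMO_CIK;
	if (f >= DEMO_CHIP_TAHITI)
		return DEMO_SI;
	return DEMO_CLASS_UNKNOWN;
}
```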

									
										534

src/amd/common/amd_kernel_code_t.h
									
										Normal file
									
												
				@@ -0,0 +1,534 @@

				/*

				 * Copyright 2015,2016 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * on the rights to use, copy, modify, merge, publish, distribute, sub

				 * license, and/or sell copies of the Software, and to permit persons to whom

				 * the Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL

				 * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,

				 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR

				 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE

				 * USE OR OTHER DEALINGS IN THE SOFTWARE.

				 *

				 */

				#ifndef AMDKERNELCODET_H

				#define AMDKERNELCODET_H

				//---------------------------------------------------------------------------//

				// AMD Kernel Code, and its dependencies                                     //

				//---------------------------------------------------------------------------//

				// Sets val bits for specified mask in specified dst packed instance.

				#define AMD_HSA_BITS_SET(dst, mask, val)                                       \

				  dst &= (~(1 << mask ## _SHIFT) & ~mask);                                     \

				  dst |= (((val) << mask ## _SHIFT) & mask)

				// Gets bits for specified mask from specified src packed instance.

				#define AMD_HSA_BITS_GET(src, mask)                                            \

				  ((src & mask) >> mask ## _SHIFT)

				/* Every amd_*_code_t has the following properties, which are composed of

				 * a number of bit fields. Every bit field has a mask (AMD_CODE_PROPERTY_*),

				 * bit width (AMD_CODE_PROPERTY_*_WIDTH), and bit shift amount

				 * (AMD_CODE_PROPERTY_*_SHIFT) for convenient access. Unused bits must be 0.

				 *

				 * (Note that bit fields cannot be used as their layout is

				 * implementation defined in the C standard and so cannot be used to

				 * specify an ABI)

				 */

				enum amd_code_property_mask_t {

				  /* Enable the setup of the SGPR user data registers

				   * (AMD_CODE_PROPERTY_ENABLE_SGPR_*), see documentation of amd_kernel_code_t

				   * for initial register state.

				   *

				   * The total number of SGPR user data registers requested must not

				   * exceed 16. Any requests beyond 16 will be ignored.

				   *

				   * Used to set COMPUTE_PGM_RSRC2.USER_SGPR (set to total count of

				   * SGPR user data registers enabled up to 16).

				   */

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_SHIFT = 0,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_SHIFT = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_PTR_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_SHIFT = 2,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_QUEUE_PTR_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_SHIFT = 3,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_KERNARG_SEGMENT_PTR_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_SHIFT = 4,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_DISPATCH_ID_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_SHIFT = 5,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_FLAT_SCRATCH_INIT_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_SHIFT = 6,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_PRIVATE_SEGMENT_SIZE_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_SHIFT = 7,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_X_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_SHIFT = 8,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Y_SHIFT,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_SHIFT = 9,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z = ((1 << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_SGPR_GRID_WORKGROUP_COUNT_Z_SHIFT,

				  AMD_CODE_PROPERTY_RESERVED1_SHIFT = 10,

				  AMD_CODE_PROPERTY_RESERVED1_WIDTH = 6,

				  AMD_CODE_PROPERTY_RESERVED1 = ((1 << AMD_CODE_PROPERTY_RESERVED1_WIDTH) - 1) << AMD_CODE_PROPERTY_RESERVED1_SHIFT,

				  /* Control wave ID base counter for GDS ordered-append. Used to set

				   * COMPUTE_DISPATCH_INITIATOR.ORDERED_APPEND_ENBL. (Not sure if

				   * ORDERED_APPEND_MODE also needs to be settable)

				   */

				  AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_SHIFT = 16,

				  AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_WIDTH = 1,

				  AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS = ((1 << AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_WIDTH) - 1) << AMD_CODE_PROPERTY_ENABLE_ORDERED_APPEND_GDS_SHIFT,

				  /* The interleave (swizzle) element size in bytes required by the

				   * code for private memory. This must be 2, 4, 8 or 16. This value

				   * is provided to the finalizer when it is invoked and is recorded

				   * here. The hardware will interleave the memory requests of each

				   * lane of a wavefront by this element size to ensure each

				   * work-item gets a distinct memory location. Therefore, the

				   * finalizer ensures that all load and store operations done to

				   * private memory do not exceed this size. For example, if the

				   * element size is 4 (32-bits or dword) and a 64-bit value must be

				   * loaded, the finalizer will generate two 32-bit loads. This

				   * ensures that the interleaving will get the work-item

				   * specific dword for both halves of the 64-bit value. If it just

				   * did a 64-bit load then it would get one dword which belonged to

				   * its own work-item, but the second dword would belong to the

				   * adjacent lane work-item since the interleaving is in dwords.

				   *

				   * The value used must match the value that the runtime configures

				   * the GPU flat scratch (SH_STATIC_MEM_CONFIG.ELEMENT_SIZE). This

				   * is generally DWORD.

				   *

				   * USE VALUES FROM THE AMD_ELEMENT_BYTE_SIZE_T ENUM.

				   */

				  AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_SHIFT = 17,

				  AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_WIDTH = 2,

				  AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE = ((1 << AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_WIDTH) - 1) << AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_SHIFT,

				  /* Are global memory addresses 64 bits. Must match

				   * amd_kernel_code_t.hsail_machine_model ==

				   * HSA_MACHINE_LARGE. Must also match

				   * SH_MEM_CONFIG.PTR32 (GFX6 (SI)/GFX7 (CI)),

				   * SH_MEM_CONFIG.ADDRESS_MODE (GFX8 (VI)+).

				   */

				  AMD_CODE_PROPERTY_IS_PTR64_SHIFT = 19,

				  AMD_CODE_PROPERTY_IS_PTR64_WIDTH = 1,

				  AMD_CODE_PROPERTY_IS_PTR64 = ((1 << AMD_CODE_PROPERTY_IS_PTR64_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_PTR64_SHIFT,

				  /* Indicate if the generated ISA is using a dynamically sized call

				   * stack. This can happen if calls are implemented using a call

				   * stack and recursion, alloca or calls to indirect functions are

				   * present. In these cases the Finalizer cannot compute the total

				   * private segment size at compile time. In this case the

				   * workitem_private_segment_byte_size only specifies the statically

				   * known private segment size, and additional space must be added

				   * for the call stack.

				   */

				  AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_SHIFT = 20,

				  AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_WIDTH = 1,

				  AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK = ((1 << AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_DYNAMIC_CALLSTACK_SHIFT,

				  /* Indicate if code generated has support for debugging. */

				  AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_SHIFT = 21,

				  AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_WIDTH = 1,

				  AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED = ((1 << AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_DEBUG_SUPPORTED_SHIFT,

				  AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_SHIFT = 22,

				  AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_WIDTH = 1,

				  AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED = ((1 << AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_WIDTH) - 1) << AMD_CODE_PROPERTY_IS_XNACK_SUPPORTED_SHIFT,

				  AMD_CODE_PROPERTY_RESERVED2_SHIFT = 23,

				  AMD_CODE_PROPERTY_RESERVED2_WIDTH = 9,

				  AMD_CODE_PROPERTY_RESERVED2 = ((1 << AMD_CODE_PROPERTY_RESERVED2_WIDTH) - 1) << AMD_CODE_PROPERTY_RESERVED2_SHIFT

				};
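Each property in the enum above is described by a `_SHIFT`, a `_WIDTH`, and a mask built as `((1 << WIDTH) - 1) << SHIFT`. The sketch below shows how a field can be read and written with those three values, using the private element size field (shift 17, width 2) as the example. The `ex_*` functions are a hypothetical function form of the same idea, not a drop-in replacement for `AMD_HSA_BITS_GET` and `AMD_HSA_BITS_SET`.

```c
#include <stdint.h>

/* Values copied from AMD_CODE_PROPERTY_PRIVATE_ELEMENT_SIZE_*. */
#define EX_ELEMENT_SIZE_SHIFT 17
#define EX_ELEMENT_SIZE_WIDTH 2
#define EX_ELEMENT_SIZE_MASK \
	(((1u << EX_ELEMENT_SIZE_WIDTH) - 1u) << EX_ELEMENT_SIZE_SHIFT)

/* Extract the field selected by mask and move it down to bit 0. */
uint32_t ex_bits_get(uint32_t src, uint32_t mask, unsigned shift)
{
	return (src & mask) >> shift;
}

/* Clear the field, then store val in it; bits of val that do not fit
 * in the field are dropped by the final mask. */
uint32_t ex_bits_set(uint32_t dst, uint32_t mask, unsigned shift, uint32_t val)
{
	return (dst & ~mask) | ((val << shift) & mask);
}
```

Returning the new word instead of assigning in place avoids the multi-statement macro form, which is easy to misuse inside an unbraced `if`.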

				/* AMD Kernel Code Object (amd_kernel_code_t). GPU CP uses the AMD Kernel

				 * Code Object to set up the hardware to execute the kernel dispatch.

				 *

				 * Initial Kernel Register State.

				 *

				 * Initial kernel register state will be set up by CP/SPI prior to the start

				 * of execution of every wavefront. This is limited by the constraints of the

				 * current hardware.

				 *

				 * The order of the SGPR registers is defined, but the Finalizer can specify

				 * which ones are actually setup in the amd_kernel_code_t object using the

				 * enable_sgpr_* bit fields. The register numbers used for enabled registers

				 * are dense starting at SGPR0: the first enabled register is SGPR0, the next

				 * enabled register is SGPR1 etc.; disabled registers do not have an SGPR

				 * number.

				 *

				 * The initial SGPRs comprise up to 16 User SGPRs that are set up by CP and

				 * apply to all waves of the grid. It is possible to specify more than 16 User

				 * SGPRs using the enable_sgpr_* bit fields, in which case only the first 16

				 * are actually initialized. These are then immediately followed by the System

				 * SGPRs that are set up by ADC/SPI and can have different values for each wave

				 * of the grid dispatch.

				 *

				 * SGPR register initial state is defined as follows:

				 *

				 * Private Segment Buffer (enable_sgpr_private_segment_buffer):

				 *   Number of User SGPR registers: 4. V# that can be used, together with

				 *   Scratch Wave Offset as an offset, to access the Private/Spill/Arg

				 *   segments using a segment address. It must be set as follows:

				 *     - Base address: of the scratch memory area used by the dispatch. It

				 *       does not include the scratch wave offset. It will be the per process

				 *       SH_HIDDEN_PRIVATE_BASE_VMID plus any offset from this dispatch (for

				 *       example there may be a per pipe offset, or per AQL Queue offset).

				 *     - Stride + data_format: Element Size * Index Stride (???)

				 *     - Cache swizzle: ???

				 *     - Swizzle enable: SH_STATIC_MEM_CONFIG.SWIZZLE_ENABLE (must be 1 for

				 *       scratch)

				 *     - Num records: Flat Scratch Work Item Size / Element Size (???)

				 *     - Dst_sel_*: ???

				 *     - Num_format: ???

				 *     - Element_size: SH_STATIC_MEM_CONFIG.ELEMENT_SIZE (will be DWORD, must

				 *       agree with amd_kernel_code_t.privateElementSize)

				 *     - Index_stride: SH_STATIC_MEM_CONFIG.INDEX_STRIDE (will be 64 as must

				 *       be number of wavefront lanes for scratch, must agree with

				 *       amd_kernel_code_t.wavefrontSize)

				 *     - Add tid enable: 1

				 *     - ATC: from SH_MEM_CONFIG.PRIVATE_ATC,

				 *     - Hash_enable: ???

				 *     - Heap: ???

				 *     - Mtype: from SH_STATIC_MEM_CONFIG.PRIVATE_MTYPE

				 *     - Type: 0 (a buffer) (???)

				 *

				 * Dispatch Ptr (enable_sgpr_dispatch_ptr):

				 *   Number of User SGPR registers: 2. 64 bit address of AQL dispatch packet

				 *   for kernel actually executing.

				 *

				 * Queue Ptr (enable_sgpr_queue_ptr):

				 *   Number of User SGPR registers: 2. 64 bit address of AmdQueue object for

				 *   AQL queue on which the dispatch packet was queued.

				 *

				 * Kernarg Segment Ptr (enable_sgpr_kernarg_segment_ptr):

				 *   Number of User SGPR registers: 2. 64 bit address of Kernarg segment. This

				 *   is directly copied from the kernargPtr in the dispatch packet. Having CP

				 *   load it once avoids loading it at the beginning of every wavefront.

				 *

				 * Dispatch Id (enable_sgpr_dispatch_id):

				 *   Number of User SGPR registers: 2. 64 bit Dispatch ID of the dispatch

				 *   packet being executed.

				 *

				 * Flat Scratch Init (enable_sgpr_flat_scratch_init):

				 *   Number of User SGPR registers: 2. This is 2 SGPRs.

				 *

				 *   For CI/VI:

				 *     The first SGPR is a 32 bit byte offset from SH_MEM_HIDDEN_PRIVATE_BASE

				 *     to base of memory for scratch for this dispatch. This is the same offset

				 *     used in computing the Scratch Segment Buffer base address. The value of

				 *     Scratch Wave Offset must be added by the kernel code and moved to

				 *     SGPRn-4 for use as the FLAT SCRATCH BASE in flat memory instructions.

				 *

				 *     The second SGPR is 32 bit byte size of a single work-item's scratch

				 *     memory usage. This is directly loaded from the dispatch packet Private

				 *     Segment Byte Size and rounded up to a multiple of DWORD.

				 *

				 *     \todo [Does CP need to round this to >4 byte alignment?]

				 *

				 *     The kernel code must move to SGPRn-3 for use as the FLAT SCRATCH SIZE in

				 *     flat memory instructions. Having CP load it once avoids loading it at

				 *     the beginning of every wavefront.

				 *

				 * Private Segment Size (enable_sgpr_private_segment_size):

				 *   Number of User SGPR registers: 1. The 32 bit byte size of a single

				 *   work-item's scratch memory allocation. This is the value from the dispatch

				 *   packet Private Segment Byte Size rounded up by CP to a multiple of DWORD.

				 *

				 *   \todo [Does CP need to round this to >4 byte alignment?]

				 *

				 *   Having CP load it once avoids loading it at the beginning of every

				 *   wavefront.

				 *

				 *   \todo [This will not be used for CI/VI since it is the same value as

				 *   the second SGPR of Flat Scratch Init.]

				 *

				 * Grid Work-Group Count X (enable_sgpr_grid_workgroup_count_x):

				 *   Number of User SGPR registers: 1. 32 bit count of the number of

				 *   work-groups in the X dimension for the grid being executed. Computed from

				 *   the fields in the HsaDispatchPacket as

				 *   ((gridSize.x+workgroupSize.x-1)/workgroupSize.x).

				 *

				 * Grid Work-Group Count Y (enable_sgpr_grid_workgroup_count_y):

				 *   Number of User SGPR registers: 1. 32 bit count of the number of

				 *   work-groups in the Y dimension for the grid being executed. Computed from

				 *   the fields in the HsaDispatchPacket as

				 *   ((gridSize.y+workgroupSize.y-1)/workgroupSize.y).

				 *

				 *   Only initialized if <16 previous SGPRs initialized.

				 *

				 * Grid Work-Group Count Z (enable_sgpr_grid_workgroup_count_z):

				 *   Number of User SGPR registers: 1. 32 bit count of the number of

				 *   work-groups in the Z dimension for the grid being executed. Computed

				 *   from the fields in the HsaDispatchPacket as

				 *   ((gridSize.z+workgroupSize.z-1)/workgroupSize.z).

				 *

				 *   Only initialized if <16 previous SGPRs initialized.

				 *

				 * Work-Group Id X (enable_sgpr_workgroup_id_x):

				 *   Number of System SGPR registers: 1. 32 bit work group id in X dimension

				 *   of grid for wavefront. Always present.

				 *

				 * Work-Group Id Y (enable_sgpr_workgroup_id_y):

				 *   Number of System SGPR registers: 1. 32 bit work group id in Y dimension

				 *   of grid for wavefront.

				 *

				 * Work-Group Id Z (enable_sgpr_workgroup_id_z):

				 *   Number of System SGPR registers: 1. 32 bit work group id in Z dimension

				 *   of grid for wavefront. If present then Work-group Id Y will also be

				 *   present.

				 *

				 * Work-Group Info (enable_sgpr_workgroup_info):

				 *   Number of System SGPR registers: 1. {first_wave, 14'b0000,

				 *   ordered_append_term[10:0], threadgroup_size_in_waves[5:0]}

				 *

				 * Private Segment Wave Byte Offset

				 * (enable_sgpr_private_segment_wave_byte_offset):

				 *   Number of System SGPR registers: 1. 32 bit byte offset from base of

				 *   dispatch scratch base. Must be used as an offset with Private/Spill/Arg

				 *   segment address when using Scratch Segment Buffer. It must be added to

				 *   Flat Scratch Offset if setting up FLAT SCRATCH for flat addressing.

				 *

				 *

				 * The order of the VGPR registers is defined, but the Finalizer can specify

				 * which ones are actually setup in the amd_kernel_code_t object using the

				 * enableVgpr*  bit fields. The register numbers used for enabled registers

				 * are dense starting at VGPR0: the first enabled register is VGPR0, the next

				 * enabled register is VGPR1 etc.; disabled registers do not have a VGPR

				 * number.

				 *

				 * VGPR register initial state is defined as follows:

				 *

				 * Work-Item Id X (always initialized):

				 *   Number of registers: 1. 32 bit work item id in X dimension of work-group

				 *   for wavefront lane.

				 *

				 * Work-Item Id Y (enable_vgpr_workitem_id > 0):

				 *   Number of registers: 1. 32 bit work item id in Y dimension of work-group

				 *   for wavefront lane.

				 *

				 * Work-Item Id Z (enable_vgpr_workitem_id > 1):

				 *   Number of registers: 1. 32 bit work item id in Z dimension of work-group

				 *   for wavefront lane.

				 *

				 *

				 * The setting of registers is being done by existing GPU hardware as follows:

				 *   1) SGPRs before the Work-Group Ids are set by CP using the 16 User Data

				 *      registers.

				 *   2) Work-group Id registers X, Y, Z are set by SPI which supports any

				 *      combination including none.

				 *   3) Scratch Wave Offset is also set by SPI which is why its value cannot

				 *      be added into the value Flat Scratch Offset which would avoid the

				 *      Finalizer generated prolog having to do the add.

				 *   4) The VGPRs are set by SPI which only supports specifying either (X),

				 *      (X, Y) or (X, Y, Z).

				 *

				 * Flat Scratch Dispatch Offset and Flat Scratch Size are adjacent SGPRs so

				 * they can be moved as a 64 bit value to the hardware required SGPRn-4 and

				 * SGPRn-3 respectively using the Finalizer "FLAT_SCRATCH" Register.

				 *

				 * The global segment can be accessed either using flat operations or buffer

				 * operations. If buffer operations are used then the Global Buffer used to

				 * access HSAIL Global/Readonly/Kernarg (which are combined) segments using a

				 * segment address is not passed into the kernel code by CP since its base

				 * address is always 0. Instead the Finalizer generates prolog code to

				 * initialize 4 SGPRs with a V# that has the following properties, and then

				 * uses that in the buffer instructions:

				 *   - base address of 0

				 *   - no swizzle

				 *   - ATC=1

				 *   - MTYPE set to support memory coherence specified in

				 *     amd_kernel_code_t.globalMemoryCoherence

				 *

				 * When the Global Buffer is used to access the Kernarg segment, must add the

				 * dispatch packet kernArgPtr to a kernarg segment address before using this V#.

				 * Alternatively scalar loads can be used if the kernarg offset is uniform, as

				 * the kernarg segment is constant for the duration of the kernel execution.

				 */

				typedef struct amd_kernel_code_s {

				  uint32_t amd_kernel_code_version_major;

				  uint32_t amd_kernel_code_version_minor;

				  uint16_t amd_machine_kind;

				  uint16_t amd_machine_version_major;

				  uint16_t amd_machine_version_minor;

				  uint16_t amd_machine_version_stepping;

				  /* Byte offset (possibly negative) from start of amd_kernel_code_t

				   * object to kernel's entry point instruction. The actual code for

				   * the kernel is required to be 256 byte aligned to match hardware

				   * requirements (SQ cache line is 16). The code must be position

				   * independent code (PIC) for AMD devices to give runtime the

				   * option of copying code to discrete GPU memory or APU L2

				   * cache. The Finalizer should endeavour to allocate all kernel

				   * machine code in contiguous memory pages so that a device

				   * pre-fetcher will tend to only pre-fetch Kernel Code objects,

				   * improving cache performance.

				   */

				  int64_t kernel_code_entry_byte_offset;

				  /* Range of bytes to consider prefetching expressed as an offset

				   * and size. The offset is from the start (possibly negative) of

				   * amd_kernel_code_t object. Set both to 0 if no prefetch

				   * information is available.

				   */

				  int64_t kernel_code_prefetch_byte_offset;

				  uint64_t kernel_code_prefetch_byte_size;

				  /* Number of bytes of scratch backing memory required for full

				   * occupancy of target chip. This takes into account the number of

				   * bytes of scratch per work-item, the wavefront size, the maximum

				   * number of wavefronts per CU, and the number of CUs. This is an

				   * upper limit on scratch. If the grid being dispatched is small it

				   * may need less than this. If the kernel uses no scratch, or

				   * the Finalizer has not computed this value, it must be 0.

				   */

				  uint64_t max_scratch_backing_memory_byte_size;

				  /* Shader program settings for CS. Contains COMPUTE_PGM_RSRC1 and

				   * COMPUTE_PGM_RSRC2 registers.

				   */

				  uint64_t compute_pgm_resource_registers;

				  /* Code properties. See amd_code_property_mask_t for a full list of

				   * properties.

				   */

				  uint32_t code_properties;

				  /* The amount of memory required for the combined private, spill

				   * and arg segments for a work-item in bytes. If

				   * is_dynamic_callstack is 1 then additional space must be added to

				   * this value for the call stack.

				   */

				  uint32_t workitem_private_segment_byte_size;

				  /* The amount of group segment memory required by a work-group in

				   * bytes. This does not include any dynamically allocated group

				   * segment memory that may be added when the kernel is

				   * dispatched.

				   */

				  uint32_t workgroup_group_segment_byte_size;

				  /* Number of bytes of GDS required by kernel dispatch. Must be 0 if

				   * not using GDS.

				   */

				  uint32_t gds_segment_byte_size;

				  /* The size in bytes of the kernarg segment that holds the values

				   * of the arguments to the kernel. This could be used by CP to

				   * prefetch the kernarg segment pointed to by the dispatch packet.

				   */

				  uint64_t kernarg_segment_byte_size;

				  /* Number of fbarrier's used in the kernel and all functions it

				   * calls. If the implementation uses group memory to allocate the

				   * fbarriers then that amount must already be included in the

				   * workgroup_group_segment_byte_size total.

				   */

				  uint32_t workgroup_fbarrier_count;

				  /* Number of scalar registers used by a wavefront. This includes

				   * the special SGPRs for VCC, Flat Scratch Base, Flat Scratch Size

				   * and XNACK (for GFX8 (VI)). It does not include the 16 SGPR added if a

				   * trap handler is enabled. Used to set COMPUTE_PGM_RSRC1.SGPRS.

				   */

				  uint16_t wavefront_sgpr_count;

				  /* Number of vector registers used by each work-item. Used to set

				   * COMPUTE_PGM_RSRC1.VGPRS.

				   */

				  uint16_t workitem_vgpr_count;

				  /* If reserved_vgpr_count is 0 then must be 0. Otherwise, this is the

				   * first fixed VGPR number reserved.

				   */

				  uint16_t reserved_vgpr_first;

				  /* The number of consecutive VGPRs reserved by the client. If

				   * is_debug_supported then this count includes VGPRs reserved

				   * for debugger use.

				   */

				  uint16_t reserved_vgpr_count;

				  /* If reserved_sgpr_count is 0 then must be 0. Otherwise, this is the

				   * first fixed SGPR number reserved.

				   */

				  uint16_t reserved_sgpr_first;

				  /* The number of consecutive SGPRs reserved by the client. If

				   * is_debug_supported then this count includes SGPRs reserved

				   * for debugger use.

				   */

				  uint16_t reserved_sgpr_count;

				  /* If is_debug_supported is 0 then must be 0. Otherwise, this is the

				   * fixed SGPR number used to hold the wave scratch offset for the

				   * entire kernel execution, or uint16_t(-1) if the register is not

				   * used or not known.

				   */

				  uint16_t debug_wavefront_private_segment_offset_sgpr;

				  /* If is_debug_supported is 0 then must be 0. Otherwise, this is the

				   * fixed SGPR number of the first of 4 SGPRs used to hold the

				   * scratch V# used for the entire kernel execution, or uint16_t(-1)

				   * if the registers are not used or not known.

				   */

				  uint16_t debug_private_segment_buffer_sgpr;

				  /* The maximum byte alignment of variables used by the kernel in

				   * the specified memory segment. Expressed as a power of two. Must

				   * be at least HSA_POWERTWO_16.

				   */

				  uint8_t kernarg_segment_alignment;

				  uint8_t group_segment_alignment;

				  uint8_t private_segment_alignment;

				  /* Wavefront size expressed as a power of two. Must be a power of 2

				   * in range 1..64 inclusive. Used to support runtime query that

				   * obtains wavefront size, which may be used by application to

				   * allocate dynamic group memory and set the dispatch work-group

				   * size.

				   */

				  uint8_t wavefront_size;

				  int32_t call_convention;

				  uint8_t reserved3[12];

				  uint64_t runtime_loader_kernel_symbol;

				  uint64_t control_directives[16];

				} amd_kernel_code_t;

				#endif // AMDKERNELCODET_H
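The header comment above describes two values that are easy to get wrong by hand: the Grid Work-Group Count formula and the packed Work-Group Info SGPR. The following is a minimal sketch in plain C, not code from Mesa. The helper names and the struct are hypothetical, and the bit positions assume the `{first_wave, 14'b0000, ordered_append_term[10:0], threadgroup_size_in_waves[5:0]}` concatenation is read most-significant field first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: work-group count for one grid dimension, following
 * ((gridSize + workgroupSize - 1) / workgroupSize) from the header comment. */
static inline uint32_t grid_workgroup_count(uint32_t grid_size,
                                            uint32_t workgroup_size)
{
   assert(workgroup_size != 0);
   /* 64-bit intermediate so the addition cannot wrap for large grids. */
   return (uint32_t)(((uint64_t)grid_size + workgroup_size - 1) /
                     workgroup_size);
}

/* Hypothetical decoded view of the Work-Group Info system SGPR. */
struct workgroup_info {
   uint32_t first_wave;                /* assumed bit 31 */
   uint32_t ordered_append_term;       /* assumed bits 16:6 */
   uint32_t threadgroup_size_in_waves; /* assumed bits 5:0 */
};

/* Split the 32-bit value into its fields; bits 30:17 are the 14 zero bits. */
static inline struct workgroup_info decode_workgroup_info(uint32_t sgpr)
{
   struct workgroup_info info;
   info.first_wave = (sgpr >> 31) & 0x1;
   info.ordered_append_term = (sgpr >> 6) & 0x7FF;
   info.threadgroup_size_in_waves = sgpr & 0x3F;
   return info;
}
```

For instance, a grid of 100 work-items with a work-group size of 64 needs two work-groups in that dimension, the second one partially filled.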

									
										2

src/gallium/winsys/amdgpu/drm/amdgpu_id.h → src/amd/common/amdgpu_id.h
									
												
				@@ -32,7 +32,7 @@

				#ifndef AMDGPU_ID_H

				#define AMDGPU_ID_H

				#include "pipe/p_config.h"

				#include "util/u_endian.h"

				#if defined(PIPE_ARCH_LITTLE_ENDIAN)

				#define LITTLEENDIAN_CPU

									
										9

src/gallium/drivers/radeon/r600d_common.h → src/amd/common/r600d_common.h
									
												
				@@ -203,6 +203,12 @@

				#define   S_028BDC_LAST_PIXEL(x)                       (((unsigned)(x) & 0x1) << 10)

				#define   G_028BDC_LAST_PIXEL(x)                       (((x) >> 10) & 0x1)

				#define   C_028BDC_LAST_PIXEL                          0xFFFFFBFF

				#define   S_028BDC_PERPENDICULAR_ENDCAP_ENA(x)         (((unsigned)(x) & 0x1) << 11)

				#define   G_028BDC_PERPENDICULAR_ENDCAP_ENA(x)         (((x) >> 11) & 0x1)

				#define   C_028BDC_PERPENDICULAR_ENDCAP_ENA            0xFFFFF7FF

				#define   S_028BDC_DX10_DIAMOND_TEST_ENA(x)            (((unsigned)(x) & 0x1) << 12)

				#define   G_028BDC_DX10_DIAMOND_TEST_ENA(x)            (((x) >> 12) & 0x1)

				#define   C_028BDC_DX10_DIAMOND_TEST_ENA               0xFFFFEFFF

				#define CM_R_028BE0_PA_SC_AA_CONFIG                  0x28be0

				#define   S_028BE0_MSAA_NUM_SAMPLES(x)                  (((unsigned)(x) & 0x7) << 0)

				#define   S_028BE0_AA_MASK_CENTROID_DTMN(x)		(((unsigned)(x) & 0x1) << 4)

				@@ -216,7 +222,6 @@

				#define   EG_S_028C70_FAST_CLEAR(x)                       (((unsigned)(x) & 0x1) << 17)

				#define   SI_S_028C70_FAST_CLEAR(x)                       (((unsigned)(x) & 0x1) << 13)

				#define   VI_S_028C70_DCC_ENABLE(x)                       (((unsigned)(x) & 0x1) << 28)

				/*CIK+*/

				#define R_0300FC_CP_STRMOUT_CNTL		     0x0300FC

				@@ -241,5 +246,7 @@

				#define   S_028254_BR_Y(x)                                            (((unsigned)(x) & 0x7FFF) << 16)

				#define   G_028254_BR_Y(x)                                            (((x) >> 16) & 0x7FFF)

				#define   C_028254_BR_Y                                               0x8000FFFF

				#define R_0282D0_PA_SC_VPORT_ZMIN_0                                     0x0282D0

				#define R_0282D4_PA_SC_VPORT_ZMAX_0                                     0x0282D4

				#endif
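The `S_`, `G_` and `C_` macros in this header come in triples: `S_` shifts a value into its field, `G_` extracts it, and `C_` is the mask that clears it. A small sketch of how the three might be combined for a read-modify-write, using the `DX10_DIAMOND_TEST_ENA` definitions from the hunk above. The helper `set_diamond_test` is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the r600d_common.h hunk above. */
#define   S_028BDC_DX10_DIAMOND_TEST_ENA(x)            (((unsigned)(x) & 0x1) << 12)
#define   G_028BDC_DX10_DIAMOND_TEST_ENA(x)            (((x) >> 12) & 0x1)
#define   C_028BDC_DX10_DIAMOND_TEST_ENA               0xFFFFEFFF

/* Hypothetical helper: clear the field with C_, then OR in the new value
 * with S_, leaving every other bit of the register value untouched. */
static inline uint32_t set_diamond_test(uint32_t reg, unsigned enable)
{
   assert(enable <= 1);
   return (reg & C_028BDC_DX10_DIAMOND_TEST_ENA) |
          S_028BDC_DX10_DIAMOND_TEST_ENA(enable);
}
```

Reading the field back with `G_028BDC_DX10_DIAMOND_TEST_ENA(set_diamond_test(0, 1))` yields 1, which is a quick way to check that a mask and shift agree.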

									
										31

src/gallium/drivers/radeonsi/sid.h → src/amd/common/sid.h
									
												
				@@ -93,14 +93,17 @@

				#define     CONTEXT_CONTROL_SHADOW_ENABLE(x)   (((unsigned)(x) & 0x1) << 31)

				#define PKT3_INDEX_TYPE                        0x2A

				#define PKT3_DRAW_INDIRECT_MULTI               0x2C

				#define   R_2C3_DRAW_INDEX_LOC                  0x2C3

				#define     S_2C3_COUNT_INDIRECT_ENABLE(x)      (((unsigned)(x) & 0x1) << 30)

				#define     S_2C3_DRAW_INDEX_ENABLE(x)          (((unsigned)(x) & 0x1) << 31)

				#define PKT3_DRAW_INDEX_AUTO                   0x2D

				#define PKT3_DRAW_INDEX_IMMD                   0x2E /* not on CIK */

				#define PKT3_NUM_INSTANCES                     0x2F

				#define PKT3_DRAW_INDEX_MULTI_AUTO             0x30

				#define PKT3_INDIRECT_BUFFER_SI                0x32 /* not on CIK */

				#define PKT3_INDIRECT_BUFFER_CONST             0x33

				#define PKT3_STRMOUT_BUFFER_UPDATE             0x34

				#define PKT3_DRAW_INDEX_OFFSET_2               0x35

				#define PKT3_DRAW_PREAMBLE                     0x36 /* new on CIK, required on GFX7.2 and later */

				#define PKT3_WRITE_DATA                        0x37

				#define   R_370_CONTROL				0x370 /* 0x[packet number][word index] */

				#define     S_370_ENGINE_SEL(x)			(((unsigned)(x) & 0x3) << 30)

				@@ -126,6 +129,13 @@

				#define		WAIT_REG_MEM_EQUAL		3

				#define PKT3_MEM_WRITE                         0x3D /* not on CIK */

				#define PKT3_INDIRECT_BUFFER_CIK               0x3F /* new on CIK */

				#define   R_3F0_IB_BASE_LO                     0x3F0

				#define   R_3F1_IB_BASE_HI                     0x3F1

				#define   R_3F2_CONTROL                        0x3F2

				#define     S_3F2_IB_SIZE(x)                   (((unsigned)(x) & 0xfffff) << 0)

				#define     S_3F2_CHAIN(x)                     (((unsigned)(x) & 0x1) << 20)

				#define     S_3F2_VALID(x)                     (((unsigned)(x) & 0x1) << 23)

				#define PKT3_COPY_DATA			       0x40

				#define		COPY_DATA_SRC_SEL(x)		((x) & 0xf)

				#define			COPY_DATA_REG		0

				@@ -135,7 +145,7 @@

				#define		COPY_DATA_DST_SEL(x)		(((unsigned)(x) & 0xf) << 8)

				#define		COPY_DATA_COUNT_SEL		(1 << 16)

				#define		COPY_DATA_WR_CONFIRM		(1 << 20)

				#define PKT3_PFP_SYNC_ME		       0x42 /* r7xx+ */

				#define PKT3_PFP_SYNC_ME		       0x42

				#define PKT3_SURFACE_SYNC                      0x43 /* deprecated on CIK, use ACQUIRE_MEM */

				#define PKT3_ME_INITIALIZE                     0x44 /* not on CIK */

				#define PKT3_COND_WRITE                        0x45

				@@ -2334,12 +2344,18 @@

				#define   S_008F30_FORCE_UNNORMALIZED(x)                              (((unsigned)(x) & 0x1) << 15)

				#define   G_008F30_FORCE_UNNORMALIZED(x)                              (((x) >> 15) & 0x1)

				#define   C_008F30_FORCE_UNNORMALIZED                                 0xFFFF7FFF

				#define   S_008F30_ANISO_THRESHOLD(x)                                 (((unsigned)(x) & 0x07) << 16)

				#define   G_008F30_ANISO_THRESHOLD(x)                                 (((x) >> 16) & 0x07)

				#define   C_008F30_ANISO_THRESHOLD                                    0xFFF8FFFF

				#define   S_008F30_MC_COORD_TRUNC(x)                                  (((unsigned)(x) & 0x1) << 19)

				#define   G_008F30_MC_COORD_TRUNC(x)                                  (((x) >> 19) & 0x1)

				#define   C_008F30_MC_COORD_TRUNC                                     0xFFF7FFFF

				#define   S_008F30_FORCE_DEGAMMA(x)                                   (((unsigned)(x) & 0x1) << 20)

				#define   G_008F30_FORCE_DEGAMMA(x)                                   (((x) >> 20) & 0x1)

				#define   C_008F30_FORCE_DEGAMMA                                      0xFFEFFFFF

				#define   S_008F30_ANISO_BIAS(x)                                      (((unsigned)(x) & 0x3F) << 21)

				#define   G_008F30_ANISO_BIAS(x)                                      (((x) >> 21) & 0x3F)

				#define   C_008F30_ANISO_BIAS                                         0xF81FFFFF

				#define   S_008F30_TRUNC_COORD(x)                                     (((unsigned)(x) & 0x1) << 27)

				#define   G_008F30_TRUNC_COORD(x)                                     (((x) >> 27) & 0x1)

				#define   C_008F30_TRUNC_COORD                                        0xF7FFFFFF

				@@ -7209,6 +7225,7 @@

				/*     */

				#define R_028830_PA_SU_SMALL_PRIM_FILTER_CNTL                           0x028830 /* Polaris */

				#define   S_028830_SMALL_PRIM_FILTER_ENABLE(x)                        (((x) & 0x1) << 0)

				#define   C_028830_SMALL_PRIM_FILTER_ENABLE                           0xFFFFFFFE

				#define   S_028830_TRIANGLE_FILTER_DISABLE(x)                         (((x) & 0x1) << 1)

				#define   S_028830_LINE_FILTER_DISABLE(x)                             (((x) & 0x1) << 2)

				#define   S_028830_POINT_FILTER_DISABLE(x)                            (((x) & 0x1) << 3)

				@@ -7960,9 +7977,12 @@

				#define   S_028B50_ACCUM_QUAD(x)                                      (((unsigned)(x) & 0xFF) << 16)

				#define   G_028B50_ACCUM_QUAD(x)                                      (((x) >> 16) & 0xFF)

				#define   C_028B50_ACCUM_QUAD                                         0xFF00FFFF

				#define   S_028B50_DONUT_SPLIT(x)                                     (((unsigned)(x) & 0xFF) << 24)

				#define   G_028B50_DONUT_SPLIT(x)                                     (((x) >> 24) & 0xFF)

				#define   C_028B50_DONUT_SPLIT                                        0x00FFFFFF

				#define   S_028B50_DONUT_SPLIT(x)                                     (((unsigned)(x) & 0x1F) << 24)

				#define   G_028B50_DONUT_SPLIT(x)                                     (((x) >> 24) & 0x1F)

				#define   C_028B50_DONUT_SPLIT                                        0xE0FFFFFF

				#define   S_028B50_TRAP_SPLIT(x)                                      (((unsigned)(x) & 0x7) << 29) /* Fiji+ */

				#define   G_028B50_TRAP_SPLIT(x)                                      (((x) >> 29) & 0x7)

				#define   C_028B50_TRAP_SPLIT                                         0x1FFFFFFF

				/*    */

				#define R_028B54_VGT_SHADER_STAGES_EN                                   0x028B54

				#define   S_028B54_LS_EN(x)                                           (((unsigned)(x) & 0x03) << 0)

				@@ -8083,6 +8103,7 @@

				#define     V_028B6C_DISTRIBUTION_MODE_NO_DIST                      0x00

				#define     V_028B6C_DISTRIBUTION_MODE_PATCHES                      0x01

				#define     V_028B6C_DISTRIBUTION_MODE_DONUTS                       0x02

				#define     V_028B6C_DISTRIBUTION_MODE_TRAPEZOIDS                   0x03 /* Fiji+ */

				#define   S_028B6C_MTYPE(x)                                           (((unsigned)(x) & 0x03) << 19)

				#define   G_028B6C_MTYPE(x)                                           (((x) >> 19) & 0x03)

				#define   C_028B6C_MTYPE                                              0xFFE7FFFF
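The `R_3F2_CONTROL` field macros added for `PKT3_INDIRECT_BUFFER_CIK` in the hunks above pack three values into one dword. A minimal sketch of composing that dword follows. The helper `ib_control_word` is hypothetical, and how a driver actually emits the packet is not shown in this diff, so treat this only as an illustration of the field layout.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros copied from the sid.h hunk above. */
#define     S_3F2_IB_SIZE(x)                   (((unsigned)(x) & 0xfffff) << 0)
#define     S_3F2_CHAIN(x)                     (((unsigned)(x) & 0x1) << 20)
#define     S_3F2_VALID(x)                     (((unsigned)(x) & 0x1) << 23)

/* Hypothetical helper: build the CONTROL dword from its three fields.
 * IB_SIZE occupies bits 19:0, CHAIN is bit 20 and VALID is bit 23. */
static inline uint32_t ib_control_word(uint32_t ib_size, int chain, int valid)
{
   assert(ib_size <= 0xfffff); /* must fit the 20-bit size field */
   return S_3F2_IB_SIZE(ib_size) | S_3F2_CHAIN(chain) | S_3F2_VALID(valid);
}
```

The size check matters because `S_3F2_IB_SIZE` silently masks its argument, so an oversized value would otherwise be truncated rather than rejected.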

6

src/amd/vulkan/.gitignore vendored Normal file


@@ -0,0 +1,6 @@
 # Generated source files
 /radv_entrypoints.c
 /radv_entrypoints.h
 /radv_timestamp.h
 /dev_icd.json
 /vk_format_table.c

									
										167

src/amd/vulkan/Makefile.am
									
										Normal file
									
												
				@@ -0,0 +1,167 @@

				# Copyright © 2016 Red Hat

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				include Makefile.sources

				vulkan_includedir = $(includedir)/vulkan

				vulkan_include_HEADERS = \

					$(top_srcdir)/include/vulkan/vk_platform.h \

					$(top_srcdir)/include/vulkan/vulkan.h

				lib_LTLIBRARIES = libvulkan_radeon.la

				# The gallium includes are for the util/u_math.h include from main/macros.h

				AM_CPPFLAGS = \

					$(AMDGPU_CFLAGS) \

					$(VALGRIND_CFLAGS) \

					$(DEFINES) \

					-I$(top_srcdir)/include \

					-I$(top_builddir)/src \

					-I$(top_srcdir)/src \

					-I$(top_srcdir)/src/vulkan/wsi \

					-I$(top_srcdir)/src/amd \

					-I$(top_srcdir)/src/amd/common \

					-I$(top_builddir)/src/compiler \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler \

					-I$(top_srcdir)/src/mapi \

					-I$(top_srcdir)/src/mesa \

					-I$(top_srcdir)/src/mesa/drivers/dri/common \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/gallium/include

				AM_CFLAGS = \

					$(VISIBILITY_CFLAGS) \

					$(PTHREAD_CFLAGS) \

					$(LLVM_CFLAGS)

				VULKAN_SOURCES = \

					$(VULKAN_GENERATED_FILES) \

					$(VULKAN_FILES)

				VULKAN_LIB_DEPS =

				if HAVE_PLATFORM_X11

				AM_CPPFLAGS += \

					$(XCB_DRI3_CFLAGS) \

					-DVK_USE_PLATFORM_XCB_KHR \

					-DVK_USE_PLATFORM_XLIB_KHR

				VULKAN_SOURCES += $(VULKAN_WSI_X11_FILES)

				# FIXME: Use pkg-config for X11-xcb ldflags.

				VULKAN_LIB_DEPS += $(XCB_DRI3_LIBS) -lX11-xcb

				endif

				if HAVE_PLATFORM_WAYLAND

				AM_CPPFLAGS += \

					-I$(top_builddir)/src/egl/wayland/wayland-drm \

					-I$(top_srcdir)/src/egl/wayland/wayland-drm \

					$(WAYLAND_CFLAGS) \

					-DVK_USE_PLATFORM_WAYLAND_KHR

				VULKAN_SOURCES += $(VULKAN_WSI_WAYLAND_FILES)

				VULKAN_LIB_DEPS += \

					$(top_builddir)/src/egl/wayland/wayland-drm/libwayland-drm.la \

					$(WAYLAND_LIBS)

				endif

				noinst_LTLIBRARIES = libvulkan_common.la

				libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)

				VULKAN_LIB_DEPS += \

					libvulkan_common.la \

					$(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \

					$(top_builddir)/src/amd/common/libamd_common.la \

					$(top_builddir)/src/amd/addrlib/libamdgpu_addrlib.la \

					$(top_builddir)/src/compiler/nir/libnir.la \

					$(top_builddir)/src/util/libmesautil.la \

					$(LLVM_LIBS) \

					$(LIBELF_LIBS) \

					$(PTHREAD_LIBS) \

					$(AMDGPU_LIBS) \

					$(LIBDRM_LIBS) \

					$(PTHREAD_LIBS) \

					$(DLOPEN_LIBS) \

					-lm

				nodist_EXTRA_libvulkan_radeon_la_SOURCES = dummy.cpp

				libvulkan_radeon_la_SOURCES = $(VULKAN_GEM_FILES)

				radv_entrypoints.h : radv_entrypoints_gen.py $(vulkan_include_HEADERS)

					$(AM_V_GEN) cat $(vulkan_include_HEADERS) |\

					$(PYTHON2) $(srcdir)/radv_entrypoints_gen.py header > $@

				radv_entrypoints.c : radv_entrypoints_gen.py $(vulkan_include_HEADERS)

					$(AM_V_GEN) cat $(vulkan_include_HEADERS) |\

					$(PYTHON2) $(srcdir)/radv_entrypoints_gen.py code > $@

				.PHONY: radv_timestamp.h

				radv_timestamp.h:

					@echo "Updating radv_timestamp.h"

					$(AM_V_GEN) echo "#define RADV_TIMESTAMP \"$(TIMESTAMP_CMD)\"" > $@

				vk_format_table.c: vk_format_table.py \

						   vk_format_parse.py \

				                   vk_format_layout.csv

					$(PYTHON2) $(srcdir)/vk_format_table.py $(srcdir)/vk_format_layout.csv > $@

				BUILT_SOURCES = $(VULKAN_GENERATED_FILES)

				CLEANFILES = $(BUILT_SOURCES) dev_icd.json radv_timestamp.h

				EXTRA_DIST = \

					$(top_srcdir)/include/vulkan/vk_icd.h \

					dev_icd.json.in \

					radeon_icd.json \

					radv_entrypoints_gen.py \

					vk_format_layout.csv \

					vk_format_parse.py \

					vk_format_table.py

				libvulkan_radeon_la_LIBADD = $(VULKAN_LIB_DEPS)

				libvulkan_radeon_la_LDFLAGS = \

					-shared \

					-module \

					-no-undefined \

					-avoid-version \

					$(BSYMBOLIC) \

					$(LLVM_LDFLAGS) \

					$(GC_SECTIONS) \

					$(LD_NO_UNDEFINED)

				icdconfdir = @VULKAN_ICD_INSTALL_DIR@

				icdconf_DATA = radeon_icd.json

				# The following is used for development purposes, by setting VK_ICD_FILENAMES.

				noinst_DATA = dev_icd.json

				dev_icd.json : dev_icd.json.in

					$(AM_V_GEN) $(SED) \

						-e "s#@build_libdir@#${abs_top_builddir}/${LIB_DIR}#" \

						< $(srcdir)/dev_icd.json.in > $@

				include $(top_srcdir)/install-lib-links.mk

									
										77

src/amd/vulkan/Makefile.sources
									
										Normal file
									
												
				@@ -0,0 +1,77 @@

				# Copyright © 2016 Red Hat

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				RADV_WS_AMDGPU_FILES := \

					winsys/amdgpu/radv_amdgpu_bo.c \

					winsys/amdgpu/radv_amdgpu_bo.h \

					winsys/amdgpu/radv_amdgpu_cs.c \

					winsys/amdgpu/radv_amdgpu_cs.h \

					winsys/amdgpu/radv_amdgpu_surface.c \

					winsys/amdgpu/radv_amdgpu_surface.h \

					winsys/amdgpu/radv_amdgpu_winsys.c \

					winsys/amdgpu/radv_amdgpu_winsys.h \

					winsys/amdgpu/radv_amdgpu_winsys_public.h

				VULKAN_FILES := \

					radv_cmd_buffer.c \

					radv_cs.h \

					radv_device.c \

					radv_descriptor_set.c \

					radv_descriptor_set.h \

					radv_formats.c \

					radv_image.c \

					radv_meta.c \

					radv_meta.h \

					radv_meta_blit.c \

					radv_meta_blit2d.c \

					radv_meta_buffer.c \

					radv_meta_bufimage.c \

					radv_meta_clear.c \

					radv_meta_copy.c \

					radv_meta_decompress.c \

					radv_meta_fast_clear.c \

					radv_meta_resolve.c \

					radv_meta_resolve_cs.c \

					radv_pass.c \

					radv_pipeline.c \

					radv_pipeline_cache.c \

					radv_private.h \

					radv_radeon_winsys.h \

					radv_query.c \

					radv_util.c \

					radv_util.h \

					radv_wsi.c \

					si_cmd_buffer.c \

					vk_format_table.c \

					vk_format.h \

					$(RADV_WS_AMDGPU_FILES)

				VULKAN_WSI_WAYLAND_FILES := \

					radv_wsi_wayland.c

				VULKAN_WSI_X11_FILES := \

					radv_wsi_x11.c

				VULKAN_GENERATED_FILES := \

					radv_entrypoints.c \

					radv_entrypoints.h \

					radv_timestamp.h

Compare commits

3774 Commits mesa-12.0. ... 13.0-branc

34 .editorconfig Normal file

1 .gitignore vendored

12 .mailmap

29 .travis.yml

10 Android.common.mk

4 Android.mk

2 Makefile.am

4 REVIEWERS

2 VERSION

6 appveyor.yml

2 bin/.cherry-ignore

3 bin/.editorconfig Normal file

2 common.py

368 configure.ac

2 docs/developers.html

22 docs/devinfo.html

29 docs/envvars.html

2 docs/faq.html

204 docs/GL3.txt → docs/features.txt

4 docs/helpwanted.html

25 docs/index.html

25 docs/intro.html

4 docs/relnotes.html

5 docs/relnotes/12.0.1.html

403 docs/relnotes/12.0.2.html Normal file

71 docs/relnotes/12.0.3.html Normal file

85 docs/relnotes/13.0.0.html Normal file

120 docs/specs/EGL_MESA_platform_surfaceless.txt Normal file

8 docs/specs/MESA_configless_context.spec

520 docs/specs/MESA_shader_integer_functions.txt Normal file

0 src/egl/docs/EGL_MESA_screen_surface → docs/specs/OLD/EGL_MESA_screen_surface.txt

41 docs/specs/enums.txt

2 docs/xlibdriver.html

2 include/D3D9/.editorconfig Normal file

121 include/EGL/eglext.h

5 include/EGL/eglmesaext.h

9 include/GL/glext.h

36 include/GL/glxext.h

4 include/GL/internal/dri_interface.h

18 include/GL/mesa_glinterop.h

6 include/GL/wglext.h

152 include/GLES2/gl2.h

262 include/GLES2/gl2ext.h

2 include/GLES2/gl2platform.h

276 include/GLES3/gl3.h

342 include/GLES3/gl31.h

1817 include/GLES3/gl32.h Normal file

3 include/c11/.editorconfig Normal file

2 include/c11/threads_posix.h

3 include/d3dadapter/.editorconfig Normal file

8 include/pci_ids/i965_pci_ids.h

3 include/vulkan/.editorconfig Normal file

4 install-gallium-links.mk

72 m4/ax_check_compile_flag.m4

10 scons/custom.py

9 scons/gallium.py

2 scripts/get_reviewer.pl

50 src/Makefile.am

49 src/SConscript

44 src/amd/Android.addrlib.mk Normal file

28 src/amd/Android.mk Normal file

38 src/amd/Makefile.addrlib.am Normal file

27 src/amd/Makefile.am Normal file

27 src/amd/Makefile.sources Normal file

0 src/gallium/winsys/amdgpu/drm/addrlib/addrinterface.cpp → src/amd/addrlib/addrinterface.cpp

0 src/gallium/winsys/amdgpu/drm/addrlib/addrinterface.h → src/amd/addrlib/addrinterface.h

0 src/gallium/winsys/amdgpu/drm/addrlib/addrtypes.h → src/amd/addrlib/addrtypes.h

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrcommon.h → src/amd/addrlib/core/addrcommon.h

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrelemlib.cpp → src/amd/addrlib/core/addrelemlib.cpp

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrelemlib.h → src/amd/addrlib/core/addrelemlib.h

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrlib.cpp → src/amd/addrlib/core/addrlib.cpp

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrlib.h → src/amd/addrlib/core/addrlib.h

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrobject.cpp → src/amd/addrlib/core/addrobject.cpp

0 src/gallium/winsys/amdgpu/drm/addrlib/core/addrobject.h → src/amd/addrlib/core/addrobject.h

0 src/gallium/winsys/amdgpu/drm/addrlib/inc/chip/r800/si_gb_reg.h → src/amd/addrlib/inc/chip/r800/si_gb_reg.h

0 src/gallium/winsys/amdgpu/drm/addrlib/inc/lnx_common_defs.h → src/amd/addrlib/inc/lnx_common_defs.h

0 src/gallium/winsys/amdgpu/drm/addrlib/r800/chip/si_ci_vi_merged_enum.h → src/amd/addrlib/r800/chip/si_ci_vi_merged_enum.h

0 src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.cpp → src/amd/addrlib/r800/ciaddrlib.cpp

0 src/gallium/winsys/amdgpu/drm/addrlib/r800/ciaddrlib.h → src/amd/addrlib/r800/ciaddrlib.h