Comparing d43f0835a5..e76d8e7432 - mesa

fran/mesa

Author	SHA1	Message	Date
Rhys Kidd	f25fdf21e7	vc4: Fix doxygen warnings Now that vc4 automated code documentation can be generated with doxygen, fix the warnings issued by Doxygen 1.8.11. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Rhys Kidd	db975fa86c	doxygen: Plumb through gallium/ to automated documentation Add Gallium and the Gallium-based drivers to doxygen's automated code documentation infrastructure. Can be individually created with: cd $MESA_TOP_LEVEL/ make -C doxygen/ gallium.tag Benefits from the existing doxygen Makefile runners to clean up afterwards with 'make clean'. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	26f4638684	Revert "osmesa: don't try to bundle osmesa.def SConscript" This reverts commit `c07df0f201`. Now that the SCons build is back we need to include the files in the tarball.	2016-05-30 17:53:45 +01:00
Andreas Fänger	9601815b4b	scons: build osmesa swrast and gallium This patch makes it possible to build classic osmesa/swrast on windows again. It was removed in commit `69db422218`. Although there is a gallium version of osmesa now, the swrast version still has more features lacking in llvmpipe, e.g. anisotropic filtering. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> [Emil Velikov: remove trailing whitespace] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	3689ef32af	automake: rework the git_sha1.h rule, include in tarball As we'll need the file in the release tarball, rework the rule so that the file is regenerated _only_ if we're in a git repository. With this in place we can build vulkan (anv) from a release tarball. Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	4cd9cd6abc	automake: move the git_sha1.h rule a level up This way we can reuse the header from other places like - src/intel/vulkan and src/gallium. Only the former is hooked up atm. Make sure .gitignore is updated, as well as all the users (the mesa code does not need any changes). Also ensure that the file is always created by adding it to the BUILT_SOURCES target. Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:45 +01:00
Emil Velikov	13faddb6b8	mesa_glinterop: remove mesa_glinterop typedefs As is there are two places that do the typedefs - dri_interface.h and this header. As we cannot include the former in here, just drop the typedefs and use the struct directly (as needed). This is required because typedef redefinition is a C11 feature which is not supported on all the versions of GCC used to build mesa. v2: Kill the typedef altogether, as per Marek. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96236 Cc: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	d43c894471	glx/glvnd: automake: include all the sources in libglx_la_SOURCES Otherwise the headers will be missing from the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	f9db61d095	glx/glvnd: remove the final if defined($extension) guards Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	3bf00b6c6a	glx/glvnd: rework dispatch functions/indices tables lookup Rather than checking if the function name maps to a valid entry in the respective table, just create a dummy entry at the end of each table. This allows us to remove some unnecessary "index >= 0" checks, which get executed quite often. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Emil Velikov	eab7e54981	glx/glvnd: Use strcmp() based binary search in FindGLXFunction() It will allow us to find the function within 6 attempts, out of the ~80 entry long table. v2: calculate middle on each iteration, correctly set the lower limit. Reviewed-by: Adam Jackson <ajax@redhat.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:44 +01:00
Chuck Atkins	f9a35bf012	configure.ac: correct the xlib/xlib-gallium GLX detection for GLVND Things have changed since commit `a92910a` ("glx: Refactor the configure options for glx implementation choice (v3)") where only a single configure option is used to control the GLX provider. [Emil Velikov: Ensure that the check is moved after the detection code.] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 17:53:34 +01:00
Kyle Brenneman	22a9e00aab	glx: Implement the libglvnd interface. With reference to the libglvnd branch: https://cgit.freedesktop.org/mesa/mesa/log/?h=libglvnd This is a squashed commit containing all of Kyle's commits, all but two of Emil's commits (to follow), and a small fixup from myself to mark the rest of the glX* functions as _GLX_PUBLIC so they are not exported when building for libglvnd. I (ajax) squashed them together both for ease of review, and because most of the changes are un-useful intermediate states representing the evolution of glvnd's internal API. Co-author: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-05-30 16:29:49 +01:00
Frederic Devernay	cee459d84d	gallivm: initialize init_native_targets_once_flag correctly Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 16:13:52 +02:00
Ilia Mirkin	8cc80e396e	nvc0/ir: fix emission of predicate spill to register The lane mask only applies to real mov's, while here we're using PSET. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 10:07:01 -04:00
Ilia Mirkin	9444d71611	nvc0: fix some compute texture validation bits on kepler (a) Make sure to update the TIC in case of an updated buffer address (b) Mark newly-inactive textures dirty so that we update the handle in set_tex_handles. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-30 10:07:01 -04:00
Dave Airlie	bac39dddcf	mesa/xfb: report calculated size for XFB buffer objects. This fixes: GL45-CTS.direct_state_access.xfb_buffers This test looks correct to me, we should work out the size value and report it rather than using only the size from the Range interface. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 21:18:54 +10:00
Emil Velikov	e7bd5b4b77	swr: automake: silence the python invocation Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:08 +01:00
Emil Velikov	04987ef229	swr: automake: attempt to fix the out-of-tree build Make sure that the output folder is created otherwise the python scripts yells at us. Cc: 0xe2.0x9a.0x9b@gmail.com Cc: Tim Rowley <timothy.o.rowley@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96238 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	3a59a624d0	swr: remove LLVM dependency from source generation rules. The dependencies should not mention any files external to the project. If we want to do sanity checks for the LLVM installed on the system we should do that in configure, yet again where is the merit which header gets checked and which doesn't ? Cc: Tim Rowley <timothy.o.rowley@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	b05b782b43	swr: add all the generators to the release tarball. Namely the python scripts and the knobs.template. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:31:07 +01:00
Emil Velikov	38394b5d76	anv: automake: don't forget to cleanup dev_icd.json Otherwise `make distcheck' will barf at us as the file is dangling. Ideally this should be part of the clean-local hook, although we include install-lib-links.mk which already has one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:29:21 +01:00
Emil Velikov	220d8c99fa	anv: automake: bring back VULKAN_ENTRYPOINT_CPPFLAGS We should not have removed them in the first place. There's a subtle difference between generating the complete sources and using them which was not obvious as we nuked them. Without this, the release tarball ends up without various hunks of the generated sources, thus things fail at a later stage as we attempt to build them. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:56 +01:00
Emil Velikov	82514f26d8	anv: automake: ship the json files in the release tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:53 +01:00
Emil Velikov	f80b10df8d	softpipe: add sp_buffer.h to the sources list (release tarball) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:28:53 +01:00
Emil Velikov	2f43908395	freedreno: make sure we pick up ir3_nir_trig.py in the release tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:28:53 +01:00
Emil Velikov	36859022ea	isl: add isl_priv.h to the sources list Otherwise it will be missing from the release tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:50 +01:00
Mauro Rossi	41d252e418	isl: move the sources lists to Makefile.sources [Emil Velikov: use the file in the autoconf build] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:48 +01:00
Emil Velikov	b4f6c70397	isl: automake: list builddir before srcdir in the includes list As seen elsewhere - we want to include the freshly built sources as opposed to the (likely) stale ones in the srcdir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:46 +01:00
Emil Velikov	53a2167e68	isl: automake: flatten the tests rules Fold the unneeded extra variable tests_ldadd, the explicit sources section (single file with the default extension) and flip the check_PROGRAMS <> TESTS order (TESTS includes scripts, while check_PROGRAMS is binaries only). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:43 +01:00
Emil Velikov	1eecc09584	isl: automake: remove unneeded install-lib-links.mk include One uses the makefile to create compatibility symlinks (to $top_builddir/libs) for shared libraries/modules. As we don't create any here, there's no need to include the file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:40 +01:00
Emil Velikov	afc1db739a	isl: automake: remove unneeded SUBDIRS As we do not include any other subdirs but self, we don't need to set it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:28:37 +01:00
Mauro Rossi	779653489e	genxml: move the sources (headers) list to Makefile.sources [Emil Velikov: use the file in the autoconf build] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-30 10:26:36 +01:00
Emil Velikov	ace5403453	anv: bail out if anv_wsi_init() fails Otherwise we'll end up setting up a device with no winsys integration. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> --- Hard-coding the rendernode name in anv_physical_device_init() is a bad idea really. We could/should be using drmGetDevices() to get info on all the devices (master/render/etc. node names, pci location etc.) and apply our heuristics on top of that. That can come up as a follow up change.	2016-05-30 10:26:36 +01:00
Emil Velikov	93e65fdcac	anv: resolve wayland-only build Ensure that the final X11/XCB hunk is guarded by the correct macro. Otherwise we'll require the symbol even when building without said platform. Cc: Cedric Sodhi <manday@openmail.cc> Reported-by: Cedric Sodhi <manday@openmail.cc> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-30 10:26:35 +01:00
Robert Foss	5068d307f9	anv: Fix use of uninitialized variable. The return variable was not set for failure paths. It has now been changed to VK_ERROR_INITIALIZATION_FAILED for failure paths. Coverity: 1358944 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: rebase against master, s/vulkan/anv/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	e382bc649b	gallium: push offset down to driver Push offset down to drivers when importing dmabuf. This is needed to more fully support EGL_EXT_image_dma_buf_import when a non-zero offset is specified. Testing has been done for freedreno, and compile tested following gallium drivers: nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	30d28d7c31	st/dri: cleanup image_from_fd/dma_buf paths Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Stanimir Varbanov	9d852a1f75	st/dri: add handling of R8 and GR88 DRI fourcc formats This helps to import dmabuf buffers from DRM_FORMAT_R8 and DRM_FORMAT_GR88 used for example by GStreamer for YUV to RGB conversion using shaders. Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-30 10:26:35 +01:00
Bas Nieuwenhuizen	e9d3246a7a	radeonsi: Don't offset OFFCHIP_BUFFERING on pre-VI cards. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96239 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 09:59:50 +02:00
Francisco Jerez	d8cf982f7d	i965: Expose GL 4.3 on Gen8+. ARB_compute_shader was the last feature missing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	4decc426c2	i965/fs: Skip gen4 pre/post-send dependency workarounds for the first/last block. We know that there cannot be any destination dependency race if we reach the beginning or end of the program without having found any other instruction the send could possibly race with. This avoids emitting a pile of useless moves at the beginning or end of the program in the most common case in which the program has a single basic block only. On the original i965 I get the following shader-db results: total instructions in shared programs: 3354165 -> 3215637 (-4.13%) instructions in affected programs: 3183065 -> 3044537 (-4.35%) helped: 13498 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	daf4a71883	i965/fs: Skip SIMD lowering source unzipping for regular scalar regions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	6956015aa5	i965/fs: Factor out region zipping and unzipping from the SIMD lowering pass. Just to make sure we keep the SIMD lowering pass tidy when we introduce additional logic to try to optimize out the copy instructions used to zip and unzip the destination and source regions into multiple packed regions of the lowered instruction width. Shouldn't cause any functional changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a9f00a9e53	i965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files. This will be useful in several places. The only externally visible difference (other than non-VGRF files being supported now) is that the region sizes are now passed in byte units instead of in GRF units because the loss of precision would have become a problem in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	4db93592de	i965/fs: Refactor offset() into a separate function taking the width as argument. This will be useful in the SIMD lowering pass to avoid having to construct a builder object of the known region width just to pass it as argument to offset(), which doesn't do anything with it other than taking the builder dispatch_width as region width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a5b4f63c15	i965/fs: Implement opt_sampler_eot() in terms of logical sends. This makes the whole LOAD_PAYLOAD munging unnecessary which simplifies the code and will allow the optimization to succeed in more cases independent of whether the LOAD_PAYLOAD instruction can be found or not. The following patch is squashed in: SQUASH: i965/fs: Add basic dataflow check to opt_sampler_eot(). The sampler EOT optimization pass naively assumes that the texturing instruction provides all the data used by the FB write just because they're standing next to each other. The least we should be checking is whether the source and destination regions of the FB write and texturing instructions match. Without this the previous seemingly harmless patch would have caused opt_sampler_eot() to misoptimize a shader from dota-2 causing DCE to eliminate all of its 78 instructions except for the final sampler EOT message (!). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	a0d9aed268	i965/fs: Fix UB list sentinel dereference in opt_sampler_eot(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	2a166c13d4	i965/fs: Take opt_redundant_discard_jumps out of the optimization loop. No shader-db regressions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	d5f2f32b11	i965/fs: Run SIMD and logical send lowering after the optimization loop. There are two reasons why this is useful: - It avoids the introduction of an amount of partial writes emitted by the SIMD lowering pass to zip and unzip register regions early during optimization, which can make subsequent optimization less effective. - It substantially reduces the burden on the compiler when a large fraction of the instructions in the program need to be split (e.g. during SIMD32 builds). Individual halves of split instructions will be optimized identically (if they can still be optimized at all), so doing it up front can duplicate the amount of instructions the optimizer has to deal with which causes the compilation time to explode in some cases due to the worse-than-linear runtime behaviour of the back-end. It seems helpful to re-run a few optimization passes in cases where any of the lowering passes was able to make progress. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	e9eb59ba68	i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to has_side_effects(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:38 -07:00
Francisco Jerez	48d743c501	i965/fs: Allow constant propagation into logical send sources. Logical sends are eventually lowered into a series of copies so they can take almost anything as source. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Francisco Jerez	f1a607cf68	i965/fs: Let CSE handle logical sampler sends as expressions. This will prevent some shader-db regressions when we start plumbing logical sends through the optimizer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Francisco Jerez	b0c8e5e0c8	i965/fs: Pass a BAD_FILE register to the logical FB write when oMask is unused. This will let the optimizer know that the sample mask value is unused so its definition can be DCE'ed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 23:41:37 -07:00
Timothy Arceri	aac90ba292	glsl: fix xfb_offset unsized array validation This partially fixes CTS test: GL44-CTS.enhanced_layouts.xfb_get_program_resource_api The test now fails at a tes evaluation shader with unsized output arrays. The ARB_enhanced_layouts spec says: "It is a compile-time error to apply xfb_offset to the declaration of an unsized array." So this seems like a bug in the CTS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-30 15:11:47 +10:00
Timothy Arceri	87fb5aa3e7	glsl: don't crash when attempting to assign a value to a builtin define For example GL_ARB_enhanced_layouts = 3; Fixes: GL44-CTS.enhanced_layouts.glsl_contant_immutablity Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-30 12:47:58 +10:00
Dave Airlie	d98d6e6269	egl/dri3: don't crash on no context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94925 Pointed out by Karol Herbst on irc. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-30 11:30:04 +10:00
Dave Airlie	e2791b38b4	mesa/program_interface_query: fix transform feedback varyings. The spec says gl_NextBuffer and gl_SkipComponents need to be returned to userspace in the program interface queries. We currently throw those away, this requires a complete piglit run to make sure no drivers fall over due to the extra varyings. This fixes: GL45-CTS.program_interface_query.transform-feedback-built-in Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 11:26:50 +10:00
Dave Airlie	6effdce92e	glsl/ast: subroutineTypes can't be returned from functions. These types can't be returned. This fixes: GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types for the return type case. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-30 11:25:30 +10:00
Timothy Arceri	db2a35193f	glsl: use has_double() helper Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-30 11:01:40 +10:00
Timothy Arceri	8f4ac20b6f	glsl: fix explicit uniform block alignment This stops the offset being bumped again when an explicit alignment has already been applied. Fixes alignment issues in: GL44-CTS.enhanced_layouts.uniform_block_alignment Note the test still fails due to unrelated issues with doubles. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-30 11:01:32 +10:00
Jordan Justen	7398a32c50	i965: Shrink stage_prog_data param array length It appears we were over-allocating these arrays. Previously we would use nir->num_uniforms directly for scalar programs, and multiply it by 4 for vec4 programs. Instead we should have been dividing by 4 in both cases to convert from bytes to a gl_constant_value count. The size of gl_constant_value is 4 bytes. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-29 09:59:55 -07:00
Ilia Mirkin	160063b110	nv50,nvc0: fix the max_vertices=0 case This is apparently legal. Drop any emit/restarts, and pass a 1 to the hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-29 09:34:03 -04:00
Ilia Mirkin	f2e7268a55	st/mesa: fix setting of point_size_per_vertex in ES contexts GL ES 2.0+ does not have a GL_PROGRAM_POINT_SIZE enable, unlike desktop GL. So we have to go and check the last pre-rasterizer stage to see whether it outputs a point size or not. This fixes a number of dEQP tests that use a geometry or tessellation shader to emit points primitives. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-29 09:34:03 -04:00
Marek Olšák	04a78068ff	mesa: skip level checking for FramebufferTexture*D if texture is zero From the OpenGL 4.5 core spec: "An INVALID_VALUE error is generated if texture is not zero and level is not a supported texture level for textarget, as described above." Other FramebufferTexture functions already do the right thing. This fixes the main menu in F1 2015. Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-29 14:24:23 +02:00
Ilia Mirkin	60341ddd5c	st/mesa: expose OES_shader_io_blocks when we have enough for ES 3.1 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-28 20:58:12 -04:00
Vinson Lee	884ac61722	swr: [rasterizer] Do not define _mm256_storeu2_m128i with icc. Fix build error with icc. CXX libswrAVX_la-swr_clear.lo icpc: command line warning #10006: ignoring unknown option '-Wdelete-non-virtual-dtor' In file included from ./rasterizer/jitter/jit_api.h(31), from swr_context.h(30), from swr_clear.cpp(24): ./rasterizer/common/os.h(135): error: expected an identifier void _mm256_storeu2_m128i(__m128i hi, __m128i lo, __m256i a) ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-28 14:26:54 -07:00
Thomas Hindoe Paaboel Andersen	df210ff24d	i965: add missing return in if statement Re-add the "return false" that was removed in `0c02d7002d` It seems that something went wrong when merging the patch. The patch sent to the mailing list does not directly match what was committed. https://lists.freedesktop.org/archives/mesa-dev/2016-May/118198.html Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-28 11:26:33 -07:00
Ilia Mirkin	c7731a0740	gk110/ir: fix unspilling of predicates from registers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>	2016-05-28 13:14:19 -04:00
Samuel Pitoiset	697237b71e	nvc0: remove outdated surfaces validation code for GK104 This code was used for validating surfaces with compute but now we use pipe_image_view instead. Anyway, surfaces support should be re-introduced properly once OpenCL happens. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-28 15:50:07 +02:00
Samuel Pitoiset	f07ade6881	nvc0: do not always invalidate 3D CBs when using compute Constant buffers are aliased between 3D and CP on Fermi, but we should only invalidate them when a compute shader actually uses CBs and not all the time after launching a grid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-30 15:50:03 +02:00
Francisco Jerez	357495b94d	i965: Update compute workgroup size limit calculation for SIMD32. This should have the side effect of enabling the ARB_compute_shader extension on Gen8+ hardware and all Gen7 platforms that didn't previously expose it (VLV and IVB GT1) due to the number of hardware threads per subslice being insufficient in SIMD16 mode. v2: Bump workgroup size limit for GLES too (Jordan). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 23:29:06 -07:00
Francisco Jerez	46ce93ed22	i965: Add do32 debug option. The do32 INTEL_DEBUG option causes the back-end to try to generate a SIMD32 program when compiling a compute shader regardless of the specified compute shader workgroup size, which will be useful for testing SIMD32 code generation in the most common case in which the workgroup size doesn't exceed the SIMD16 limit so SIMD32 codegen wouldn't be automatically enabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	864737ce6c	i965/fs: Build 32-wide compute shader when needed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	37fd13ee2d	i965/fs: Extend back-end interface for limiting the shader dispatch width. This replaces the current fs_visitor::no16() interface with fs_visitor::limit_dispatch_width(), which takes an additional parameter allowing the caller to specify the maximum dispatch width a shader can be compiled with. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	2d288cb9ea	i965/fs: Implement SIMD32 register allocation support. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	7f10d3983b	i965/fs: Remove pre-Gen7 register allocation class micro-optimization. This was trying to save some one-time init on pre-Gen7 hardware under the assumption that one would only ever need 1, 2, 4 and 8-wide registers on those platforms. However nothing guarantees that those will be the only VGRF sizes used after lowering and optimization. In some cases we may end up with a temporary of different size being allocated (e.g. by SIMD lowering to zip or unzip a multi-component register region of a logical send instruction), and there is no guarantee that they will be optimized away before register allocation (especially since the compute_to_mrf coalescing pass is rather... lacking...). Instead just allocate classes for all possible VGRF sizes up to MAX_VGRF_SIZE to avoid a crash in pq_test() when we encounter a variable of any other size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	1d5bf46ad1	i965/fs: Don't mutate multi-component arguments in sampler payload set-up. The Gen5+ sampler message payload construction code steps through the coordinate and derivative components by induction like 'coordinate = offset(coordinate, bld, 1)', the problem is that while doing that it may step one past the end of the coordinate vector causing an assertion failure in offset() if it happens to be a (single component) immediate. Right now coordinates and derivatives are typically passed as actual registers but that will no longer be the case when we start propagating constants into logical messages. Instead express coordinate components in closed form like 'offset(coordinate, bld, i)' -- The end result seems slightly more readable that way and it allows passing the coordinate and derivative registers by const reference instead of by value, so it seems like a clean-up in its own right. v2: Fold a few post-increment operators into the last MOV statement. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	ad8f66ed33	i965/fs: Fix multiple ACP interference during copy propagation. This is more fallout from `cf375a3333`. It's possible for multiple ACP entries to interfere with a given VGRF write, so we need to continue iterating even if an overlapping entry has already been found. Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	c88b52745c	i965/fs: Fix cmod propagation not to propagate non-identity cmod into CMP(N). The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I could reproduce this easily with SIMD32 but I don't see any reason why the problem would be SIMD32-specific, it was most likely working by luck. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:06 -07:00
Francisco Jerez	8476233ae2	i965/fs: Estimate number of registers written correctly in opt_register_renaming. The current estimate is incorrect for non-32b types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	437e65f9d9	i965/fs: Add (sub)reg_offset asserts to brw_reg_from_fs_reg. These are completely ignored by the conversion to brw_reg, so they better be zero. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	51dd6a60f5	i965/fs: Reset reg_offset of the original destination to zero in compute_to_mrf(). Prevents an assertion failure in the following commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	b9eab911ba	i965/fs: Skip remove_duplicate_mrf_writes() during SIMD32 runs. The pass is disabled in SIMD16 dispatch mode for the same reason, it cannot handle instructions that write multiple MRF registers at once. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	796238d9e6	i965/fs: Use SIMD8 SSBO GET_BUFFER_SIZE message regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	29e4717251	i965/fs: Don't emit duplicated SSBO GET_BUFFER_SIZE instruction unnecessarily. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	a55452530f	i965/fs: Emit fixed width memory fence opcode regardless of the dispatch width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	ae730049c6	i965/fs: Return 32 bit mask from fs_builder::sample_mask(). This doesn't actually handle the FS case, just add an assertion for the moment so I don't forget to update it later on for SIMD32 fragment shader dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	8b6edee679	i965/fs: Emit fixed-width null register regardless of the dispatch width. brw_null_vec() cannot handle widths over 16 but it doesn't really matter what width we specify for null registers because destination regions have no width field at the hardware level. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	298320280f	i965/fs: Fix half() to handle more exotic register files. horiz_offset() is able to deal with a superset of the register files currently special-cased in half(). Just call horiz_offset() in all cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	8c9601ef7b	i965/fs: Fix horiz_offset() to handle ARF and HW GRF register files. We'll hit these in some cases during SIMD lowering in 32-wide programs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	7d430fc05e	i965/fs: Clean up remaining uses of fs_inst::reads_flag and ::writes_flag. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:05 -07:00
Francisco Jerez	ecd7a7255a	i965/fs: Keep track of flag dependencies with byte granularity during scheduling. This prevents false dependencies from being created between instructions that write disjoint 8-bit portions of the flag register and OTOH should make sure that the scheduler considers dependencies between instructions that write or read multiple flag subregisters at once (e.g. 32-wide predication or conditional mods). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	0fec265373	i965/fs: Track flag register liveness with byte granularity. This is required for correctness in presence of multiple 8-wide flag writes (e.g. 8-wide instructions with a conditional mod set) which update a different portion of the same 16-bit flag subregister. Right now we keep track of flag dataflow with 16-bit granularity and consider flag writes to have killed any previous definition of the same subregister even if the write was less than 16 channels wide, which can cause live flag register updates to be dead code-eliminated incorrectly. Additionally this makes sure that we handle 32-wide flag writes and reads which may span multiple flag subregisters so the current approach of just setting/testing a single bit from the live set wouldn't have worked. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	df1aec763e	i965/fs: Define methods to calculate the flag subset read or written by an fs_inst. v2: Codestyle fixes (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	ece41df247	i965/fs: Expose arbitrary channel execution groups to the IR. This generalizes the current fs_inst::force_sechalf flag to allow specifying channel enable groups other than 0 or 8. At some point it will likely make sense to fix the vec4 generator to support arbitrary execution groups and then move the definition of fs_inst::group into backend_instruction (e.g. so we can do FP64 in the VEC4 back-end). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	81bc6de8c0	i965/ir: Make BROADCAST emit an unmasked single-channel move. Alternatively we could have extended the current semantics to 32-wide mode by changing brw_broadcast() to emit multiple indexed MOV instructions in the generator copying the selected value to all destination registers, but it seemed rather silly to waste EU cycles unnecessarily copying the exact same value 32 times in the GRF. The vstride change in the Align16 path is required to avoid assertions in validate_reg() since the change causes the execution size of the MOV and SEL instructions to be equal to the source region width. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	41562eb8f3	i965/fs: Allow specifying arbitrary quarter control to FIND_LIVE_CHANNEL. This makes FIND_LIVE_CHANNEL behave like a normal instruction for non-zero quarter control. On Gen8+ we just leave the quarter control field of the emitted FBL instruction set to the default value so the hardware applies the expected shift to the execution mask signals. On Gen7 we apply the offset manually by specifying a non-zero subregister offset in the source region of the FBL instruction. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	a5a0810960	i965/fs: Allow specifying arbitrary execution sizes up to 32 to FIND_LIVE_CHANNEL. Due to a Gen7-specific hardware bug native 32-wide instructions get the lower 16 bits of the execution mask applied incorrectly to both halves of the instruction, so the MOV trick we currently use wouldn't work. Instead emit multiple 16-wide MOV instructions in 32-wide mode in order to cover the whole execution mask. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:04 -07:00
Francisco Jerez	1e3c58ffaf	i965/fs: Lower 32-wide scratch writes in the generator. The hardware has messages that can write 32 32bit components at once but the channel enable mask gets messed up. We need to split them into several 16-wide scratch writes for the channel enables to be applied correctly. The SIMD lowering pass cannot be used for this because scratch writes are emitted rather late during register allocation long after SIMD lowering has been done. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:29:02 -07:00
Francisco Jerez	a7d319c00b	i965/fs: Implement scratch reads and writes of 4 GRFs at a time. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	fe5cdde2f9	i965/eu: Fix Gen7+ DP scratch message size calculation on Gen7. Gen7 hardware expects the block size field in the message descriptor to be the number of registers minus one instead of the log2 of the number of registers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	fc7107de1d	i965/eu: Set execution size explicitly for memory fence send message. We don't want to emit a 32-wide send message in 32-wide programs. The memory fence message should have the same effect regardless of the execution size (as long as it's valid) so just set it to one. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	5c887326c5	i965/eu: Consider QtrCtrl 3Q-4Q in typed surface message descriptor setup. In SIMD32 programs the compiler is responsible for providing the appropriate half of the sample mask in the message header, so the first and third quarters both map to the first slot group of the provided 16-bit half, while the second and fourth quarters map to the second slot group -- IOW they should be equivalent to 1Q and 2Q modulo two. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	448340d31f	i965/fs: Clean up remaining uses of dispatch_width in the generator. Most of these are bugs because the intended execution size of an instruction and the dispatch width of the shader aren't necessarily the same (especially in SIMD32 programs). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	7f28ad8c4d	i965/eu: Remove brw_codegen::compressed and ::compressed_stack. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:59 -07:00
Francisco Jerez	646213168e	i965/eu: Use current exec size instead of p->compressed in surface message generation. This was kind of an abuse of p->compressed, dataport send message instructions are always uncompressed. Use the current execution size instead since p->compressed is on its way out. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:28:46 -07:00
Francisco Jerez	492286e90b	i965/fs: No need to reset predicate control after emitting some instructions. Trivial clean-up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	8ef5637729	i965/fs: Pass current execution size to brw_IF() and brw_DO(). This gets IF and DO instructions working in SIMD32 programs. brw_IF() and brw_DO() should probably behave in the same way as other generator functions that emit control flow instructions and just figure out the right execution size by themselves from the current execution controls specified through the brw_codegen argument. Changing that will require updating lots of Gen4-5 clipper code though, so for the moment just pass the current value redundantly from the FS generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	fdae8b9f91	i965/eu: Stop using p->compressed to specify the exec size of control flow instructions. p->compressed won't work for SIMD32, we should just be using the execution size value specified via p->current instead. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	0b4cd91071	i965/fs: Extend region width calculation to allow arbitrary execution sizes. Instead of just halving the execution size when the instruction is compressed hoping that it will give a legal source region width, we can calculate the maximum legal width value in closed form from the component size and stride. This makes sure that brw_reg_from_fs_reg() always returns a valid hardware region even for virtual 32-wide instructions (e.g. send-like instructions) that would seem to exceed the hardware region width limit after halving. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Kenneth Graunke	dabaf4fb96	i965/fs: Pass the compression mode to brw_reg_from_fs_reg(). Curro is planning to eliminate p->compressed, so let's avoid using it here and just pass in the value directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [ Francisco Jerez: Pass boolean flag instead of brw_compression enum. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	3340a66fce	i965/fs: Simplify per-instruction compression control setup in generator. By using the new compression/group control interface. This will allow easier extension to support arbitrary channel enable groups at the IR level. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	c78edcea8b	i965/fs: No need to set compression control at the top of generate_code(). The right value is dependent on the specific IR instruction being generated so it has to be reset in every iteration of the loop anyway. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	c19c3d3a52	i965/eu: Fix a bunch of compression control bugs in the generator. Most of these were resetting quarter control to zero incorrectly even though everything they needed to do was disable instruction compression -- The brw_SAMPLE() case was doing the right thing but it can be simplified slightly by using the new compression control interface. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	3dffd81583	i965/eu: Define alternative interface for setting compression and group controls. This implements some simple helper functions that can be used to specify the group of channel enable signals and compression enable that apply to a brw_inst instruction. It's intended to replace brw_set_default_compression_control eventually because the current interface has a number of shortcomings inherited from the Gen-4-5-centric representation of compression and group controls as a single non-orthogonal enum: On the one hand it doesn't work for specifying arbitrary group controls other than 1Q and 2Q, which are frequently useful in SIMD32 and FP64 programs. On the other hand the current interface forces you to update the compression and group controls simultaneously, which has been the source of a number of generator bugs (a bunch of them fixed in this series), because in many cases we would end up resetting the group controls to zero inadvertently even though everything we wanted to do was disable instruction compression -- The latter seems especially unfortunate on Gen6+ hardware which have no explicit compression control, so we would end up bashing the quarter control field of the instruction for no benefit. Instead of a single function that updates both at the same time introduce separate interfaces to update one or the other independently preserving the current value of the other (which typically comes from the back-end IR so it has to be respected). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	5db4d62395	i965/fs: Remove FS_OPCODE_PACK_STENCIL_REF virtual instruction. It's just a byte MOV with strided source. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:10 -07:00
Francisco Jerez	29ce110be6	i965/fs: Remove extract virtual opcodes. These can be easily represented in the IR as a MOV instruction with strided source so they seem rather redundant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:09 -07:00
Francisco Jerez	9dcb8ff6a1	i965: Define brw_int_type() helper. Intended as a (partial) inverse of type_sz(). Will be useful in the next commit and some other SIMD32 generator changes I have queued up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:09 -07:00
Francisco Jerez	bb89beb26b	i965/fs: Remove manual splitting of DDY ops in the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:22:02 -07:00
Francisco Jerez	982c48dc34	i965/fs: Remove manual unrolling of BFI instructions from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	95272f5c7e	i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	f14b9ea6e6	i965/fs: Drop lowering code for a few three-source instructions from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:23 -07:00
Francisco Jerez	117a9a0a64	i965/fs: Set default access mode to Align1 for all instructions in the generator. Currently the generator code for most opcodes honours the default access mode (which should typically be Align1 in the scalar back-end), but generate_code() doesn't set it explicitly which means that the access mode from a previous instruction could leak into the following ones if you did something special and weren't careful enough to save and restore the previous access mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	3a541d0c0b	i965/fs: Remove handcrafted math SIMD lowering from the generator. Most of this wouldn't have worked for SIMD32 and had various dispatch_width and compression control bugs. It's mostly dead now with SIMD lowering of math instructions turned on in the compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	cf5443f984	i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value. Which is 16 or 8 in most cases. This will make sure that 32-wide virtual instructions get chopped up into chunks of their maximum execution size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	197833caa3	i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width. Only per-channel LOAD_PAYLOAD instructions can be lowered, which should cover everything that comes in from the front-end. LOAD_PAYLOAD instructions used to construct actual message payloads cannot be easily lowered because they contain headers and vectors of variable type that aren't necessarily channel-aligned -- We shouldn't find any of them in the program at SIMD lowering time though because they're introduced during logical send lowering. An alternative that may be worth considering would be to re-run the SIMD lowering pass after LOAD_PAYLOAD lowering instead of this patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	9eea3df29f	i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time ...on hardware lacking compressed Align16 support. Will allow simplifying the generator code and fixing it for SIMD32 codegen. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	12ae87abb1	i965/fs: Apply usual FPU-like execution size restrictions to MULH. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	dea9c1df89	i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	122e031548	i965/fs: Assert that IF instruction with embedded compare has legal exec_size. We shouldn't encounter these right now but if we did it wouldn't be possible for the SIMD lowering pass to split it into multiple instructions because of its side effects on control flow, so just assert in order to kill the program. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	98c8bef01c	i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	88d9cc1563	i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:22 -07:00
Francisco Jerez	a6bf5f88c7	i965/fs: Enforce common regioning restrictions by SIMD splitting. This change addresses a number of hardware restrictions on the source and destination regions and other execution controls of regular FPU-like instructions that in some cases can be avoided by reducing the execution size of the instruction. Some of these restrictions (e.g. the one about 3src instructions not supporting compression on some hardware) are currently being worked around case by case in the generator with ad-hoc splitting code that is buggy in several ways (e.g. doesn't handle non-trivial execution controls which would break SIMD32 code), but it seems cleaner to implement as many restrictions as we can in a single lowering pass since that will allow us to simplify some of the surrounding code considerably and also make sure that we don't forget applying them in the future. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	2b5adb942b	i965/fs: Enforce extended math exec size limits during SIMD lowering. This teaches the SIMD lowering pass about the hardware limits on the execution size of math instructions, which will allow simplifying the generator code and at the same time get rid of a number of bugs in the manual SIMD unrolling done currently that prevent SIMD32 codegen from working. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	a8e7b4f1d9	i965/fs: Handle SAMPLEINFO consistently like other texturing instructions. Seems like this texturing opcode was missing its logical counterpart which would prevent it from taking advantage of the SIMD lowering infrastructure, define it and plumb it through the back-end. At some point we'll likely want to emit a single SAMPLEINFO message shared among all channels irrespective of this change, but for the moment this should be enough to get the intrinsic working in SIMD32 mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	99b5476d33	i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends. The benefit is we will be able to use the SIMD lowering pass to unroll math instructions of unsupported width and then remove some cruft from the generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	e531b7907a	i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes. This was causing the scheduler to be rather optimistic about the latency of pull constant opcodes on Gen7+. This might seem to increase the cycle count estimate calculated by the scheduler itself for some shaders, even though the actual cycle count should actually be decreased. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	ed4d0e41ac	i965/fs: Rename Gen4 physical varying pull constant load opcode. For consistency with the Gen7 variant. I'm not doing the same to the uniform pull constant message at this point because the non-GEN7 one is still overloaded to be either an expression-like logical instruction or a Gen4-specific physical send message. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	64a6cb87f1	i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering. Varying pull constant loads inherit the same limitation of pre-ILK hardware that requires expanding SIMD8 texel fetch instructions to SIMD16, we can deal with pull constant loads in the same way it's done for texturing during SIMD lowering. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	d8a3294ac2	i965/fs: Hide varying pull constant load message setup behind logical opcode. This will allow the SIMD lowering pass to split 32-wide varying pull constant loads (not natively supported by the hardware) into 16-wide instructions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:21 -07:00
Francisco Jerez	0bc5ad8d19	i965/fs: Avoid constant propagation when the type sizes don't match. The case where the source type of the instruction is smaller than the immediate type could be handled by calculating the portion of the immediate read by the instruction (assuming that the source channels are aligned with the destination channels of the copy) and then representing the same value as an immediate of the source type (assuming such an immediate type exists), but the code below doesn't do that, so just bail for the moment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	52cc80d859	i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases. If the LOAD_PAYLOAD instruction only has header sources it's possible for the number of registers written to be less than or equal to the SIMD component size, in which case it would take the single-MOV path at the bottom which would cause the channel enable masks to be applied incorrectly to the header contents and/or cause it to write past the end of the allocated temporary. If the instruction is either LOAD_PAYLOAD or doesn't write exactly one component the MOV path is going to mess up the program so just don't use it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	c5f224145a	i965/fs: Handle instruction predication in SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	1760c24b4b	i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering. If the source value is going to be the same for all SIMD-lowered chunks of the instruction there should be no need to unzip the value into multiple temporary registers one for each lowered chunk. As a side effect this fixes SIMD lowering of instructions with a vector immediate source. In the long term it might still be worth fixing offset() to handle vector immediates correctly though, this should be good enough for the moment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	168163f5f0	i965/fs: Generalize is_uniform() to is_periodic(). This will be useful in the SIMD lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	b736e78ddb	i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 23:19:20 -07:00
Francisco Jerez	2db9dd5aeb	i965/fs: Fix off-by-one region overlap comparison in copy propagation. This was introduced in `cf375a3333` but the blame is mine because the pseudocode I sent in my review comment for the original patch suggesting to do things this way already had the off-by-one error. This may have caused copy propagation to be unnecessarily strict while checking whether VGRF writes interfere with any ACP entries and possibly miss valid optimization opportunities in cases where multiple copy instructions write sequential locations of the same VGRF. Cc: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-27 23:19:20 -07:00
Ronie Salgado	8f538d9ae0	anv/cmd_buffer: Don't delete command buffers in ResetCommandPool() v2 (Jason Ekstrand): Destroy command buffers in DestroyCommandPool(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95034 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-27 18:56:33 -07:00
Brian Paul	747754f027	gallium/util: another s/unsigned/enum pipe_prim_type/ for clang Trivial.	2016-05-27 18:42:21 -06:00
Jason Ekstrand	b93b5935a7	anv: Try the first 8 render nodes instead of just renderD128 This way, if you have other cards installed, the Vulkan driver will still work. No guarantees about WSI working correctly but offscreen should at least work. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95537	2016-05-27 17:18:33 -07:00
Jason Ekstrand	e023c104f7	anv: strdup the device path into the physical device This way we don't have to assume that the string coming in is a piece of constant data that exists forever.	2016-05-27 17:18:33 -07:00
Jason Ekstrand	9048dee328	anv/formats: Exit early for unsupported formats	2016-05-27 17:17:09 -07:00
Jason Ekstrand	10bc9f7024	anv/formats: Map VK_FORMAT_UNDEFINED to ISL_FORMAT_UNSUPPORTED At one point in time, we may have used the mapping to ISL_FORMAT_RAW for certain buffer surfaces but that time has long since passed. This fixes a bug where doing format queries on VK_FORMAT_UNDEFINED would assert-fail.	2016-05-27 17:17:09 -07:00
Jason Ekstrand	b16326c740	anv/clear: Remove an unused variable	2016-05-27 17:17:09 -07:00
Brian Paul	8beb6f3c9c	gallium/util: another unsigned -> enum pipe_prim_type change gcc didn't warn about the unsigned / enum pipe_prim_type mismatch between the .c and .h file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-27 17:55:05 -06:00
Jordan Justen	47e2a57fe9	i965/compute: Fix uniform init issue when SIMD8 is skipped In `d8347f12ea`, we added support for skipping SIMD8 generation when the program local size is too large for SIMD8 to be usable. This change was missed in that commit. This bug would impact gen7 platforms when the compute shader local size is greater than 512, and gen8 platforms when the local size is greater than 448. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-27 16:44:00 -07:00
Bas Nieuwenhuizen	65d4ba6f20	docs: Mention GL4.3 and ES3.1 support for nvc0 and radeonsi v2: also update the introductory text. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-28 01:04:03 +02:00
Jason Ekstrand	fb2a5ceb32	anv: Emit DRAWING_RECTANGLE once at driver initialization Also, we don't actually need it for clipping because meta always colors inside the lines and, for all other operations, the user is required to set a scissor. Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as little as possible. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:11 -07:00
Jason Ekstrand	3a83c176ea	anv/cmd_buffer: Only emit PIPE_CONTROL on-demand This is in contrast to emitting it directly in vkCmdPipelineBarrier. This has a couple of advantages. First, it means that no matter how many vkCmdPipelineBarrier calls the application strings together it gets one or two PIPE_CONTROLs. Second, it allows us to better track when we need to do stalls because we can flag when a flush has happened and we need a stall. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:09 -07:00
Jason Ekstrand	7120c75ec3	genxml: Make PIPE_CONTROL::CommandStreamerStallEnable a boolean This has been declared as a uint since SNB but it's only one bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:07 -07:00
Jason Ekstrand	b26bd6790d	anv/clear: Only clear the render area when doing subpass clears Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:18:04 -07:00
Jason Ekstrand	5432487792	anv: Move push constant allocation to the command buffer Instead of blasting it out as part of the pipeline, we put it in the command buffer and only blast it out when it's really needed. Since the PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a stall which we would like to avoid. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-27 15:17:43 -07:00
Bas Nieuwenhuizen	2cee0d0f9c	radeonsi: enable OpenGL 4.3 Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-27 22:28:11 +02:00
Dave Airlie	0438bc76e2	nouveau: enable GL 4.3 on kepler/fermi Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:52:13 +10:00
Marek Olšák	43550f25ed	radeonsi: always reserve output space for tess factors Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dave Airlie <airlied@redhat.com>	2016-05-27 21:40:43 +02:00
Dave Airlie	c44513a1f3	glsl/linker: call link_uniform blocks on linked shader. The old code called this on the prelinked shader list, but at this point we have the linked shader, so we should call the interface on that alone. This fixes a regression in: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.13 introduced in `5b2675093e` glsl: handle implicit sized arrays in ssbo Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96228 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reported-by: Mark James Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:35:53 +10:00
Dave Airlie	f0254fdd07	mesa/get: drop unused extension checks. These all show up as unused warnings here, so drop them for now. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-28 05:29:23 +10:00
Bas Nieuwenhuizen	4717d5a2d3	gallium/ddebug: Add passthrough for query_memory_info. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-27 20:00:07 +02:00
Jason Ekstrand	0482efdc93	nir/inline: Also rewrite param derefs for texture instructions Without this, samplers get left hanging as derefs to variables that don't actually exist. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	2522180845	nir/inline: Break the guts of rewrite_param_derefs into a helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	d19c406395	nir/inline: Make the rewrite_param_derefs helper work on instructions Now that we have the better nir_foreach_block macro, there's no reason to use the archaic block version for everything. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Jason Ekstrand	2fcba404f8	nir/inline: Don't use foreach_instr_safe unless we need to Suggested-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-27 10:28:27 -07:00
Roland Scheidegger	9247570d42	gallivm: eliminate an unnecessary AND with unorm lerps Instead of doing an add and then masking out the upper bits, we can simply do an add with a half wide type (this, of course, assumes the hw can actually do it...), so we'll get the required zero in the upper bits automatically. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-27 19:11:28 +02:00
Roland Scheidegger	17d685c426	gallium/util: use enum pipe_prim_type instead of unsigned some more There were complaints from a mingw build: u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_prim_type’ [-fpermissive] Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-27 19:11:28 +02:00
Brian Paul	2318d2015a	svga: remove unneeded casts in get_query_result_vgpu9() calls Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-27 10:01:12 -06:00
Brian Paul	9be122e9b0	svga: use MAYBE_UNUSED to silence release-build warnings Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-27 10:00:56 -06:00
Ben Widawsky	8314dd7ff2	isl: Fix some tautological-compare warnings Fixes: isl.c:62:22: warning: self-comparison always evaluates to true [-Wtautological-compare] assert(ISL_DEV_GEN(dev) == dev->info->gen); ^~ isl.c:63:33: warning: self-comparison always evaluates to true [-Wtautological-compare] assert(ISL_DEV_USE_SEPARATE_STENCIL(dev) == dev->use_separate_stencil); Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:59:17 -07:00
Ilia Mirkin	4ccf8c952a	mesa: add support for GLSL ES 3.20 version string Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:25:53 -04:00
Ilia Mirkin	faae9ab2ee	mapi: expose new functions in GL ES 3.2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 21:25:53 -04:00
Ilia Mirkin	df2881381a	nvc0/ir: handle a load's reg result not being used for locked variants For a load locked, we might not use the first result but the second result is the predicate result of the locking. In that case the load splitting logic doesn't apply (which is designed for splitting 128-bit loads). Instead we take the predicate and move it into the first position (as having a dead result in first def's position upsets all sorts of things including RA). Update the emitters to deal with this as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 21:23:49 -04:00
Ilia Mirkin	04ecad97ff	nvc0/ir: avoid generating illegal instructions for compute constbuf loads For user-supplied constbufs, fileIndex is 0. In that case, when we subtract 1, we'll end up loading from constbuf offset -16. This is illegal, and there are asserts to avoid it. Normally we'd just DCE it, but no point in generating the instructions if they're not going to be used. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 21:23:49 -04:00
Rob Clark	4f98c94be7	gallium/util: fix build break Missing #include caused build breaks after `21a3fb9cd`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-26 20:59:08 -04:00
Jason Ekstrand	9f9f229359	nir/spirv: Allow pointless variable decorations on inputs SPIR-V specifies that a bunch of stuff gets applied to types. This means that a local variable could get, for instance, an array stride. Just because it's pointless doesn't mean you'll never see it.	2016-05-26 17:10:50 -07:00
Brian Paul	1ec45a1948	gallium/util: use enum pipe_prim_type in u_prim.h functions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	7a49b41436	util/indices: move duplicated assignments out of switch cases Spotted by Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	46be65c681	gallium: change pipe_draw_info::mode to be pipe_prim_type Makes debugging with gdb a little nicer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	a25ae485a6	util/indices,svga: s/unsigned/enum pipe_prim_type/ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	21a3fb9cd8	util: s/unsigned/enum pipe_resource_usage/ for buffer usage variables Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	45078e8890	svga: s/unsigned/enum pipe_resource_usage/ for buffer usage variables Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:18 -06:00
Brian Paul	d21a309c6c	svga: s/unsigned/enum pipe_prim_type/ for primitive type variables Proper enum types were only added recently. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	90afd7b7ef	svga: fix test for unfilled triangles fallback VGPU10 actually supports line-mode triangles. We failed to make use of that before. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	2c07c40d2f	svga: clean up and improve comments in svga_draw_private.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	0f983e1793	util/indices: implement unfilled (tri->line) conversion for adjacency prims Tested with new piglit gl-3.2-adj-prims test. v2: re-order trisadj and tristripadj code, per Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	d6c2c7d710	util/indices: implement provoking vertex conversion for adjacency primitives Tested with new piglit gl-3.2-adj-prims test. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	479d364c39	util/indices: assert that the incoming primitive is a triangle type The unfilled index translator/generator functions should only be called when the primitive mode is one of the triangle types. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	26de558072	util/indices: formatting, whitespace fixes in u_unfilled_indices.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	24eadb4810	util/indices: improve comments in u_indices.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	5393238765	svga: fix primitive mode (point/line/tri) test for unfilled primitives The original mode test was valid before we had GS support. Regression tested with full piglit run. Though, I don't think we have any piglit tests that exercise drawing unfilled adjacency primitives. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Ian Romanick	b7af108d3e	i965: Enable GL_OES_shader_io_blocks Only one dEQP io_blocks test fails. This test fails for the same reason as the match_different_member_struct_names test in a previous commit. dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names v2: Add to release notes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	660240da9e	glsl: Allow shader interface blocks in GLSL ES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	7a3093efcc	glsl: Add a has_shader_io_blocks helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	f0902ee813	mesa: Add extension tracking for GL_OES_shader_io_blocks v2: Also support GL_EXT_shader_io_blocks. It's pretty much identical to the OES extension. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	326a269c77	mesa: Only validate SSO shader IO in OpenGL ES or debug context v2: Move later in series to avoid issues with Gallium drivers and debug contexts. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-26 16:23:53 -07:00
Ian Romanick	3722c76001	mesa: Remove old validate_io function The new validate_io catches all of the cases (and many more) that the old function caught. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:22:25 -07:00
Ian Romanick	bd3f15cffd	mesa: Additional SSO validation using program_interface_query data Fixes the following dEQP tests on SKL: dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_smooth_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_implicit_explicit_location_1 dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_element_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_none dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_order dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_centroid_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_length dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_precision dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_centroid dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_smooth dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_name It regresses one test: dEQP-GLES31.functional.separate_shader.validation.varying.match_different_struct_names However, this test is based on language in the OpenGL ES 3.1 spec that I believe is incorrect. I have already submitted a spec bug: https://www.khronos.org/bugzilla/show_bug.cgi?id=1500 v2: Move spec quote about built-in variables to the first place where it's relevant. Suggested by Alejandro. v3: Move patch earlier in series, fix rebase issues. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v2] Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> [v2]	2016-05-26 16:21:01 -07:00
Ian Romanick	cfff746297	mesa: Track the additional data in gl_shader_variable The interface type, interpolation mode, precision, the type of the outermost structure, and whether or not the variable has an explicit location will be used for SSO validation on OpenGL ES. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:19:16 -07:00
Jason Ekstrand	15e553daf0	nir: Make nir_const_value a union There's no good reason for it to be a struct of an anonymous union. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 16:03:44 -07:00
Kenneth Graunke	e7776fa947	i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field. commit `7c8dfa78b9` (i965/draw: Use the real size for vertex buffers) changed how we programmed the VERTEX_BUFFER_STATE size field. Previously, we programmed it to the size of the actual underlying BO, which is page-aligned, and potentially much larger than the GL buffer object. This violated the ARB_robust_buffer_access spec. With that change, we started programming it based on the range of data we expect the draw call to actually access - which is based on the min_index and max_index information provided to glDrawRangeElements(). Unfortunately, applications often provide inaccurate range information to glDrawRangeElements(). For example, all the Unreal demos appear to draw using a range of [0, 3] when the index buffer's actual index range is [0, 5]. Such results are undefined, and we are absolutely allowed to restrict access to the range they specified. However, the failure mode is usually that nothing draws, or misrendering with wild geometry, which is kind of bad for a common mistake. And people tend to assume the range information isn't that important when data is in VBOs. There's no real advantage, either. ARB_robust_buffer_access only requires us to restrict access to the GL buffer object size, not the range of data we think they should access. Doing that allows buggy applications to still function. (Note that we still use this information for busy-tracking, so if they try to overwrite the data with glBufferSubData, they'll still hit a bug.) This seems to be safer. We may want to provide the more strict range as a debug option, or scan the VBO and warn against bogus glDrawRangeElements in debug contexts. That can be done as a later patch, though. Makes Unreal demos draw again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-26 15:56:41 -07:00
Samuel Pitoiset	e01a482182	nvc0: invalidate textures/samplers between 3D and CP on Fermi Like constant buffers, samplers and textures are aliased on Fermi and we need to invalidate the state when switching from 3D to CP and vice versa. This fixes rendering issues in the UE4 demos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 23:51:22 +02:00
Jason Ekstrand	9f0bc0f2b3	anv: Stop linking against libmesa.la and libdri_test_stubs.la This brings the final size of an optimized non-debug build of the Vulkan driver down to 2.9 MB as opposed to 8.7 MB for the dri driver. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	057259655e	i965: Don't link libmesa or libdri_test_stubs into tests Now that the compiler has been completely separated from libmesa, we no longer need these. We can make the tests much smaller by not linking them in. This also ensures that anyone who runs make check won't accidentally put in any dependencies from the compiler to the rest of mesa core. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	870ff6cd38	i965: Move compiler debug functions to intel_screen.c They reference the compiler so they shouldn't go in libi965_compiler.la. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	327161a48d	i965/test: Remove the fragment/vertex_program field from test visitors None of them are actually using it. It's a relic of an older compiler interface that required a gl_program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	e0ae10c49a	i965: Move brw_new_shader to brw_link.cpp That's where brw_link_shader lives and they seem to go together. Also, this gets it out of libi965_compiler. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5136b67915	i965: Move brw_nir_lower_uniforms.cpp to i965_FILES This gets it out of i965_compiler.la Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5e43ba7e9e	i965: Move brw_create_nir to brw_program.c This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	86a2447eec	i965/nir: Move the type_size_*_bytes functions to brw_nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	58d1e82d32	ptn: Include nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	32210dea8e	compiler: Move glsl_to_nir to libglsl.la Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Ben Widawsky	ddcfc35f62	i965/sklgt4: Implement depth/timestamp write w/a The stated bug describes a scenario in which a post sync write operation for depth or timestamp can be ignored. There are two workarounds suggested, the first and easier is to simply do a cs stall when we do these type of writes. The second option is to do a PIPE_CONTROL flush after the post sync but before the data is required. Generally, I believe the data written out is consumed by the application on the CPU side and so doing the easier of the two is ideal. Furthermore, these queries aren't tremendously common in the perf sensitive apps I have looked at. However, there could be cases where a shader stage might directly consume the data, and as a result option 2 may be desirable. This patch goes with the easier solution for now. gen9lp bug_de_id=2137196 By itself, this does not fix any of the GT4 hangs we're currently experiencing. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 14:08:17 -07:00
Ben Widawsky	f1fa8b4a1c	i965/bxt: Add 2x6 variant Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:06:43 -07:00
Bas Nieuwenhuizen	43d7305a40	radeonsi: Allow TES distribution between shader engines. The R_028B50_VGT_TESS_DISTRIBUTION value is copied from amdgpu-pro. Smaller values in the ACCUM fields seem to decrease the performance advantage from this patch, higher values don't seem to matter. v2: Add distribution mode field enums. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	f91c85b29b	radeonsi: Process multiple patches per threadgroup. Using more than 1 wave per threadgroup does increase performance generally. Not using too many patches per threadgroup also increases performance. Both catalyst and amdgpu-pro seem to use 40 patches as their maximum, but I haven't really seen any performance increase from limiting the number of patches to 40 instead of 64. Note that the trick where we overlap the input and output LDS does not work anymore as the insertion of the tess factors changes the patch stride. v2: - Add comment about LDS assumptions. - Add constant for buffer size. - Fix code style. v3: - Correct limits for not splitting patches between waves. - Set max num_patches to 40 as in the proprietary driver. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fd0a7a382f	radeonsi: Add barrier before writing the tess factors. The factors may be stored to LDS by another invocation than the invocation for vertex 0. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fee3160af9	radeonsi: Enable dynamic HS. This allows running the TES on different CU's than the TCS which results in performance improvements. v2: Only write the control word from one invocation. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	26f436132b	radeonsi: Remove LDS layout user SGPR's from TES. They are unused. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	a4e2146a9d	radeonsi: Use buffer loads and stores for passing data from TCS to TES. We always try to use 4-component loads, as LLVM does not combine loads and they bypass the L1 cache. We can't use a similar strategy for stores and this is especially notable with the tess factors, as they are often set with separate MOV's per component in the TGSI. We keep storing to LDS and the LDS space, so we can load the outputs later, either due to the shader, or for writing the tess factors. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	6217716e8f	radeonsi: Store inputs to memory when not using a TCS. We need to copy the VS outputs to memory. I decided to do this using a shader key, as the value depends on other shaders. I also switch the fixed function TCS over to monolithic, as otherwise many of the user SGPR's need to be passed to the epilog, which increases register pressure, or complexity to avoid that. The main body of the fixed function TCS is not that interesting to precompile anyway, since we do it on demand and it is very small. v2: Use u_bit_scan64. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	7846fa8768	radeonsi: Add offchip buffer address calculation. Instead of creating a memory area per patch and per vertex, we put the same attribute of every vertex & patch together. Most loads and stores access the same attribute across all lanes, only for different patches and vertices. For the TCS this results in tightly packed data for 4-component stores. For the TES this is not the case as within a patch the loads often also access the same vertex. However if there are < 4 vertices/patch, this still results in a reduction of the number of cache lines. In the LDS situation we only do better than worst case if the data per patch < 64 bytes, which due to the tessellation factors is pretty much never. We do not use hardware swizzling for this. It would slightly reduce the number of executed VALU instructions, but I had issues with increased wait times that I haven't been able to solve yet. Furthermore, the tbuffer_store intrinsic does not support both VGPR offset and an index, so we have a problem storing indirectly indexed outputs. This can be solved by temporarily storing arrays in LDS and then copying them, but I don't think that is worth the effort. The difference in VALU cycles hardware swizzling gives is about 0.2% of total busy cycles. That is without handling the array case. I chose for attributes instead of components as they are often accessed together, and the software swizzling takes VALU cycles for calculating offsets. v2: - Rename functions to get_tcs_tes_buffer_address. - multiply by 16 as late as possible. - Use tgsi_full_src_register_from_dst. - Remove some bad comments. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	c49e68dc4b	radeonsi: Add user SGPR for the layout of the offchip buffer. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d9a0c54f6f	radeonsi: Use correct parameter index for LS_OUT_LAYOUT. This happens to be in the right position, but that changes when TCS/TES get new parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	3e7a7a9a65	radeonsi: Add buffer load functions. v2: - Use llvm.amdgcn.buffer.load intrinsics for new LLVM. - Code style fixes. v3: - Code style fix. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	9fdb778702	radeonsi: Define build_tbuffer_store_dwords earlier to support new users. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	5c34562d7c	radeonsi: Add offchip tessellation parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d27ff7d683	radeonsi: Add buffer for offchip storage between TCS and TES. The buffer is quite large, but should only be allocated if the application uses tessellation. Most non-games don't. v2: - Use the correct register for SI. - Add define for block size. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Rob Clark	6e51fe75a4	tgsi: fix coverity out-of-bounds warning CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local: Overrunning array of 2 16-byte elements at element index 2 (byte offset 32) by dereferencing pointer &inst.Dst[i]. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Rob Clark	3d66ba971e	tgsi: fix out of bounds access Not sure why coverity calls this an out-of-bounds read vs out-of-bounds write. CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local: Overrunning array r of 3 16-byte elements at element index 3 (byte offset 48) using index chan (which evaluates to 3). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Anuj Phogat	0c02d7002d	i965: Don't use fast copy blit in case of logical operations other than GL_COPY XY_FAST_COPY_BLT command doesn't have a field for raster operation. So, fall back to using XY_SRC_COPY_BLT to handle those cases. Fixes piglit test gl-1.1-xor-copypixels when fast copy blit is enabled for all tiling formats. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 10:57:09 -07:00
Anuj Phogat	97f0f91cc1	i965/gen9: Remove the halign/valign field setup code in fast copy blit Experimentation with different values of src/dst horizontal/vertical alignment showed that these fields are not used on gen9 hardware. A recent update in graphics specs has removed these fields from XY_FAST_COPY_BLT command. Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Chad Versace <chad.versace@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-26 10:57:09 -07:00
Samuel Pitoiset	c52e92ec3a	nvc0: allow to monitor MP perf counters with compute shaders To read out MP perf counters we use a compute shader and need to upload input data like a 64-bit addr used to store the values and a sequence ID for synchronization. Currently, this input data is uploaded as user uniforms which means that it's stuck in c0[], but if a compute shader from a real application is used, monitoring those performance counters will just overwrite some data and miserably crash. Instead, sticking the 64-bit addr and the sequence into the driver constant buffer seems much better and will allow monitoring counters with GL 4.3 apps. Tested on GF119 and GK110, but should not hurt anything on GK104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 19:34:57 +02:00
Kristian Høgsberg Kristensen	329d115ac6	mesa: Move robustness code to main/robustness.c Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 09:37:17 -07:00
Kristian Høgsberg Kristensen	d7d729b965	docs: Mark GL_KHR_robustness done for GLES3.2 as well Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 09:36:36 -07:00
Plamena Manolova	a0674ce5c4	egl: Additional attribute validation for eglCreatePbufferSurface eglCreatePbufferSurface should generate an EGL_BAD_MATCH error if: 1: The EGL_TEXTURE_FORMAT attribute is EGL_NO_TEXTURE and EGL_TEXTURE_TARGET is something other than EGL_NO_TEXTURE 2: EGL_TEXTURE_FORMAT is something other than EGL_NO_TEXTURE and EGL_TEXTURE_TARGET is EGL_NO_TEXTURE. This fixes the dEQP-EGL.functional.negative_api.create_pbuffer_surface test. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-26 08:02:48 -07:00
Marek Olšák	8539c9bf31	gallium/radeon: add the kernel version into the renderer string Example: Gallium 0.4 on AMD TONGA (DRM 3.2.0 / 4.5.0, LLVM 3.9.0) My kernel version is pretty long already (4.5.0-amd-01025-g32791c1) and adding "kernel" into the string would make it too long for glxinfo to display. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-26 16:53:46 +02:00
Marek Olšák	53f33619a4	winsys/amdgpu: add back multithreaded command submission Ported from the initial amdgpu winsys from the private AMD branch. The thread creates the buffer list, submits IBs, and cleans up the submission context, which can also destroy buffers. 3-5% reduction in CPU overhead is expected for apps submitting a lot of IBs per frame. This is most visible with DMA IBs. v2: use a semaphore instead of a busy loop in amdgpu_ws_queue_cs add another amdgpu_cs_sync_flush call into amdgpu_bo_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-26 16:43:45 +02:00
Lars Hamre	c626a86586	gallium/tgsi: use _mesa_roundevenf in micro_rnd Fixes the following piglit tests (for softpipe): /spec/glsl-1.30/execution/built-in-functions/... fs-roundeven-float fs-roundeven-vec2 fs-roundeven-vec3 fs-roundeven-vec4 vs-roundeven-float vs-roundeven-vec2 vs-roundeven-vec3 vs-roundeven-vec4 /spec/glsl-1.50/execution/built-in-functions/... gs-roundeven-float gs-roundeven-vec2 gs-roundeven-vec3 gs-roundeven-vec4 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 07:59:15 -06:00
Emil Velikov	d519f59a9f	.mailmap: use Jakob Bornecrantz's personal email The VMware one is bouncing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-26 13:57:32 +01:00
Ilia Mirkin	f998e5dc6b	nvc0: add note about where the viewport mask would go Not piping this all the way through yet, but no better place to note this down. This can be used with NV_viewport_array2. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 08:46:29 -04:00
Ilia Mirkin	b634936d3b	nvc0: enable 32 textures on kepler+ For fermi, this likely will require use of linked tsc mode. However on bindless architectures, we can have as many as we want. As it stands, the AUX_TEX_INFO has 32 texture handles reserved. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-26 08:46:13 -04:00
Alejandro Piñeiro	2ed9563e79	glsl: add unit tests data vertex/expected outcome for uninitialized warning v2: fix 025 test. Add three more tests (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 09:19:36 +02:00
Alejandro Piñeiro	eee00274fa	glsl: add warning-test It executes compiler-glsl on all the available shaders and checks that the outcome is as expected. Bash code based on the already existing optimization-test v2: rebasing: use --version option Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 09:19:17 +02:00
Alejandro Piñeiro	68c23d2d04	glsl: add just-log option for the standalone compiler. Add an option in order to ask to just print the InfoLog, without any header or separator. Useful if we want to use the standalone compiler to track only the warning/error messages. v2: each printf goes on its own line (Ian Romanick) v3: rebasing: move just_log to standalone.h/cpp Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:46:05 +02:00
Alejandro Piñeiro	66ff04322e	glsl: do not raise uninitialized warning with out function parameters It silences warnings for function parameters by default, as the parameters need to be processed in order to have the actual and the formal parameter, and the function signature. Then it raises the warning if needed at verify_parameter_modes, where other in/out/inout mode checks are done. v2: fix comment style, multi-line condition style, simplify check, remove extra blank (Ian Romanick) v3: inout function parameters can raise the warning too (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:39:17 +02:00
Alejandro Piñeiro	b9f90ef652	glsl: add an empty set_is_lhs on ast_node Just to allow calling set_is_lhs on any ast_node without a cast. Useful when processing an ast_node list that we know contains ast_expression. v2: comment out new_value to avoid unused parameter warning (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-26 08:39:07 +02:00
Dave Airlie	5b2675093e	glsl: handle implicit sized arrays in ssbo The current code disallows unsized arrays except at the end of an SSBO but it is a bit overzealous in doing so. struct a { int b[]; int f[4]; }; is valid as long as b is implicitly sized within the shader, i.e. it is accessed only by integer indices. I've submitted some piglit tests to test for this. This also has no regressions on piglit on my Haswell. This fixes: GL45-CTS.shader_storage_buffer_object.basic-syntax GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO This patch moves a chunk of the linker code down, so that we don't link the uniform blocks until after we've merged all the variables. The logic went something like: Removing the checks for last ssbo member unsized from the compiler and into the linker, meant doing the check in the link_uniform_blocks code. However to do that the array sizing had to happen first, so we knew that the only unsized arrays were in the last block. But array sizing required the variable to be merged, otherwise you'd get two different array sizes in different version of two variables, and one would get lost when merged. So the solution was to move array sizing up, after variable merging, but before uniform block visiting. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:42:10 +10:00
Dave Airlie	4d70fd1bc7	glsl: fix error message on uniform block mismatch This looks like a cut-paste from above. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:40:41 +10:00
Dave Airlie	c952c0e713	glsl/ast: assign explicit_xfb_buffer from correct place This fixes: GL44-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through As the OUT_TC interface structures weren't matching because one of them had explicit_xfb_buffer set when it shouldn't. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-26 12:17:03 +10:00
Bruce Cherniak	c8835a5924	swr: [rasterizer] Correctly select optimized primitive assembly. Indexed primitives were always using cut-aware primitive assembly, whether primitive_restart was enabled or not. Correctly pass down primitive_restart and select optimized PA when possible. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-25 18:47:16 -05:00
Kenneth Graunke	978ab88858	docs: Mention i965/gen8+ supports GL 4.2 in release notes.	2016-05-25 14:22:56 -07:00
Kenneth Graunke	72ba9c3160	docs: Update GL_OES_copy_image status.	2016-05-25 14:22:30 -07:00
Kenneth Graunke	0f0f357b77	i965: Enable OES_copy_image (and EXT) on Gen8+ and Baytrail. For now, only enable it on platforms that actually support ETC2. At this point, Broadwell is only failing 5 (out of 8358) dEQP tests: dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits. srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d These fail with all methods (meta, blorp, blitter, memcpy). All are blacklisted from the Android mustpass list, which makes me wonder whether there's an issue with the tests. The formats in question work with other targets, and the targets in question work with other formats... Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	88a630121d	i965: Implement a BLORP path for CopyImage and prefer it over Meta. We're dropping Meta in favor of BLORP everywhere we can. This also fixes bugs when copying cubemaps to 2D, which is currently broken in the meta pass. BLORP just works. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	2822c8a078	i965: Make the CopyImage BLT path bail for stencil images. The BLT can't handle S8 because it's W-tiled (at least without additional funny business, and I'm not sure we care). Disallow it so it falls back to the CPU path, which works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	c51702bdc8	i965: Also copy stencil miptree data. The Meta path handles this, but the CPU/BLT fallbacks did not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	45d6818021	i965: Make a helper function for CopyImage of a miptree. Currently, it only contains the BLT/CPU fallbacks, so the name is a bit too generic. But eventually this will use BLORP as well, at which point the name will make more sense. The next patch will introduce a second call. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	2dc98d9a15	i965: Combine src/dest tex vs. rb checks in intel_copy_image_sub_data. This simplifies things a little - now we only have one (tex or rb?) if-ladder for src, and a second for dst, rather than four. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Kenneth Graunke	1b39c5efca	i965: Account for MinLayer in CopyImageSubData's blitter/CPU paths. Fixes Piglit's arb_copy_image-texview test with the Meta path disabled (so we hit the blitter/CPU fallback paths). v2: Add MinLayer even for cube maps (suggested by Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-25 14:17:29 -07:00
Rob Clark	231dcb19f9	freedreno/ir3: cmdline compiler for glsl Use glsl/libstandalone.la to add support for taking glsl src files (in addition to .tgsi) as input. Then glsl->nir and feed the result into the ir3 backend as normal. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-25 16:31:15 -04:00
Rob Clark	0f982bb67d	glsl: split out libstandalone Split standalone glsl_compiler into a libstandalone.la and a thin main.cpp. This way drivers can re-use the glsl standalone frontend in their own standalone compilers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-25 16:31:15 -04:00
Rob Clark	ec434d940d	android: drop build of standalone glsl_compiler It's only a tool for debugging the glsl compiler, and should not be installed. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rob Herring <robh@kernel.org> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 16:31:15 -04:00
Matt Turner	61847d7708	i965: Mark fallthrough in switch statement. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	83c6749ddb	i965: Assert that a depth_mt exists when using HiZ. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	4a5e92ac70	nir: Strengthen assertion that 'out' is nonnull. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	44809f2371	spirv: Mark default cases unreachable(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	469a1c56a6	isl: Mark default cases unreachable. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Matt Turner	47dca31606	isl: Remove useless qualifier from return type. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-05-25 12:44:34 -07:00
Samuel Pitoiset	71c30bd87c	nvc0: add descriptions for hardware perf counters/metrics The GALLIUM_HUD does not yet expose a description for each event, but this might be useful for developers who want to have a long description of hw perf counters directly in the source code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 21:06:49 +02:00
Brian Paul	89e4de20fa	mesa: 80-column wrapping for _context_lost_GetSynciv() Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-25 12:23:12 -06:00
Brian Paul	ae7c4a6f98	mesa: add GLAPIENTRY to new _context_lost_X functions To fix MSVC build. Any function which goes into the dispatch table needs to have the GLAPIENTRY (__stdcall) tag. Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-25 12:23:12 -06:00
Giuseppe Bilotta	1b62b47f6f	scons: support 2.5.0 The get_implicit_deps changed in SCons 2.5, expecting a callable rather than a path as third argument. Detect the SCons versions and set the argument appropriately to support both 2.5 and earlier versions. This closes #95211. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95211 Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Cc: mesa-stable@lists.freedesktop.org Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-25 12:23:12 -06:00
Giuseppe Bilotta	8c00fe3970	scons: whitespace cleanup This text transformation was done automatically via the following shell command: $ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \; Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-25 12:23:12 -06:00
Alejandro Piñeiro	8c29bba242	i965/fs: take into account doubles when emitting system values Fixes the following cts test: GL42-CTS.vertex_attrib_64bit.limits_test Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-25 20:14:22 +02:00
Kristian Høgsberg Kristensen	89bb4be91e	i965: Fix shadowing of 'height' parameter The nested declaration of 'height' shadows a parameter and uses uninitialized memory. Fix by renaming to 'plane_height' which also makes the code clearer. This would typically break the bo size computation, but we don't use that except when mmaping, and we don't mmap YUV buffers much. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-25 09:42:55 -07:00
Kristian Høgsberg Kristensen	595224f714	mesa: Add .gitignore entries for make check binaries Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>	2016-05-25 09:41:44 -07:00
Kristian Høgsberg Kristensen	85008db1d5	i965: Enable GL_KHR_robustness GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry points that we already implement. This patch adds a new dispatch table that returns GL_CONTEXT_LOST from all entry points and implements the GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn that we've lost the context. With the GL_CONTEXT_LOST reporting in place and dispatch for the new entry points we can turn on GL_KHR_robustness. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 09:41:44 -07:00
Emil Velikov	f036eea2cf	.mailmap: Use Chia-I Wu personal e-mail. The LunarG one is bouncing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 17:38:06 +01:00
Emil Velikov	4b79f82836	.mailmap: Use my (Emil Velikov) personal e-mail. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-25 17:35:48 +01:00
Ilia Mirkin	21c1754306	docs: add missing GL_OES/EXT_gpu_shader5 enablement note Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-25 09:50:22 -04:00
Ilia Mirkin	601a5195eb	glsl: add GL_EXT_clip_cull_distance define, add helpers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-05-25 09:50:07 -04:00
Brian Paul	9690ab0cdf	tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer Print "GEOM" instead of "2", for example. v2: also update the text parsing code, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Brian Paul	2b773fcf00	tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-25 07:21:23 -06:00
Jason Ekstrand	998829f404	nir/spirv: Handle location decorations on structure members	2016-05-24 21:12:56 -07:00
Jason Ekstrand	961369d597	nir/spirv: Add explicit handling for all decorations From time to time we have had cases where glslang has added a decoration we don't handle and it has caused problems. This audit ensures that, for every decoration, we either handle it or hit an unreachable() with an accurate description of why we don't have to.	2016-05-24 21:12:56 -07:00
Jason Ekstrand	6f89e51c84	i965/draw: Use the correct buffer index for interleaved VBO sizes The buffer_range_* arrays are indexed by buffer index not element index. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-24 20:50:35 -07:00
Jordan Justen	e58fabc93a	i965/gen7: Fix gl_HelperInvocation It appears that UV immediates aren't working on Ivy Bridge. In this case, a signed version will work, and this fixes the piglit tests/spec/glsl-4.50/execution/helper-invocation.shader_test test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-24 15:44:06 -07:00
Emil Velikov	e384d75b12	mesa_glinterop: make GL interop version field bidirectional This allows clear and easy communication between the two. Caller: Requesting information (struct vN) Callee: I know how to deal with older version (vN-1) only. Here is your data and the version I support. Caller: Older version ? Sure I'll cap all access to the fields provided by the older version (vN-1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	0e983276b9	mesa_glinterop: drop mesa_glinterop_device_info::interop_version One cannot use a single version to control both export_in and export_out versions. Using this forces us to always extend/bump both structs at the same time. An alternative scheme is coming with next patch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	f8a114aa5c	st/dri: add note about GL interop version checks ... and make them more explicit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	923bdbf48c	mesa_glinterop: rename MESA_GLINTEROP_INVALID_{VALUE,VERSION} Be more explicit what it actually does. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	c196de23ae	mesa_glinterop: s/struct_version/version/ OCD polish for consistency with other mesa interfaces. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	cb0708c843	mesa_glinterop: fix GL interop *_VERSION comments Using the macro to set the version is wrong and ill-advised. Please don't do it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	a3eb8702fb	mesa_glinterop: remove inclusion of EGL header Analogous to previous commit, but for EGL. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	8472045b16	mesa_glinterop: remove inclusion of GLX header Since we only need partial information about the GLX symbols we can forward declare them and drop the include. Obviously each user of the said API will need more than what's provided, so they'll include the GLX header. If they don't, the compiler will give us a nice warning ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	b5f9820d90	mesa_glinterop: remove unneeded GLAPI/GLAPIENTRY/APIENTRYP symbols These come from windows.h, gl.h, glcorearb.h and/or glext.h. The interop interface is aimed at non-Windows platforms while the macros are used/derived due to Windows specifics. Thus we can safely remove them. Strictly speaking there should be GLXAPIENTRY/EGLAPIENTRY and alike macros, although a) there is no GLX ones and b) this brings us even further from decoupling the file from the GLX/EGL header dependency. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:03:00 +01:00
Emil Velikov	bcf9e47653	mesa_glinterop: replace GL types with their native counterpart. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:56 +01:00
Emil Velikov	2e726144f9	mesa_glinterop: use generic variable types for the GL interop Thus we can preserve the ABI, while avoiding the inclusion of some/all of the following: EGL/egl.h GL/gl.h GL/glcorearb.h GLES/gl.h GLES2/gl2.h GLES3/gl3.h GLES3/gl31.h This will allow us to build/use it alongside any combination of APIs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:08 +01:00
Emil Velikov	cbf29d90ba	mesa_glinterop: use consistent naming scheme for GL interop Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:08 +01:00
Emil Velikov	0d31bfd71a	Revert "mesa: Build EGL without X11 headers after interop patchset" This reverts commit `4e2c9a0435`. The solution was incomplete and fragile. An alternative one is coming shortly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-24 23:02:05 +01:00
Ian Romanick	c8d9ed5ea1	docs: Note that GL_OES_geometry_shader and GL_OES_tessellation_shader are started The GL_OES_geometry_shader work is on the oes_shader_io_blocks branch of idr's fd.o repository. The GL_OES_tessellation_shader work is on the tess-gles branch of kwg's fd.o repository. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-24 12:45:46 -07:00
Emil Velikov	7e196cd170	c11/threads: resolve link issues with -O0 Add weak symbol notation for the pthread_mutexattr* symbols, thus making the linker happy. When building with -O1 or greater the optimiser will kick in and remove the said functions as they are dead/unreachable code. Ideally we'll enable the optimisations locally, yet that does not seem to work atm. v2: Add the AX_GCC_FUNC_ATTRIBUTE([weak]) hunk in configure. Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-05-24 20:21:31 +01:00
Tim Rowley	0ceed1701d	swr: [rasterizer] remove containers.hpp Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:37 -05:00
Tim Rowley	1e3e22efb5	swr: [rasterizer core] remove utility dead code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:29 -05:00
Tim Rowley	dc34479b8c	swr: [rasterizer core] buckets fixes 1. Don't clear bucket descriptions to fix issues with sim level buckets getting out of sync. 2. Close out threadviz file descriptors in ClearThreads(). 3. Skip buckets for jitter based buckets when multithreaded. We need thread local storage through llvm jit functions to be fixed before we can enable this. 4. Fix buckets StopCapture to correctly detect capture complete. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:21 -05:00
Tim Rowley	3074a2b4fa	swr: [rasterizer core] move centroid setup out of CalcCentroidBarycentrics Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:29:14 -05:00
Tim Rowley	9a2a4ecb39	swr: [rasterizer jitter] implement InstanceID/VertexID in fetch jit Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-24 13:28:47 -05:00
Ian Romanick	7fc4a82007	mesa: Silence unused parameter warnings Neither shProg nor name was used. Remove them both. main/shader_query.cpp:779:53: warning: unused parameter ‘shProg’ [-Wunused-parameter] program_resource_location(struct gl_shader_program *shProg, ^ main/shader_query.cpp:780:72: warning: unused parameter ‘name’ [-Wunused-parameter] struct gl_program_resource *res, const char *name, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-24 11:04:08 -07:00
Ian Romanick	78399cf170	glsl/linker: Silence unused parameter warning The parameter is required for the interface. glsl/link_uniforms.cpp:689:61: warning: unused parameter ‘record_type’ [-Wunused-parameter] bool row_major, const glsl_type *record_type, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-24 11:04:05 -07:00
Kristian Høgsberg Kristensen	2bb935be2e	dri: Add YVU formats Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	1be1114e6b	i965: Allow creating planar YUV __DRIimages Lift the restriction we had before and allow creation of images with multiple planes. We still require all the planes to be within the same bo. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	654e950cba	i965: Invoke lowering pass for YUV textures Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	44997fc0c1	i965: Support textures with multiple planes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	3352f2d746	i965: Create multiple miptrees for planar YUV images Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	6eede87631	i965: Refactor intel_set_texture_image_bo() to create_mt_for_dri_image() This function now only creates the mt and we then call intel_set_texture_image_mt() in intel_image_target_texture_2d() to set it for the texture image. Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-24 10:14:57 -07:00
Kristian Høgsberg Kristensen	8ceb7c7d9b	i965: Use intel_set_texture_image_mt() in intelSetTexBuffer2() Create the mt for the drawable bo directly and call our new intel_miptree_create_for_bo() helper instead. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	40e9be4a5c	i965: Add new intel_set_texture_image_mt() helper This factors out the work of setting up a miptree as the backing for a texture image into a new helper. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	a41b57679f	nir: Add a lowering pass for YUV textures This lowers sampling from YUV textures to 1) one or more texture instructions to sample each plane and 2) color space conversion to RGB. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	50c24c3ff3	nir: Handle NULL in nir_copy_deref() Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Kristian Høgsberg Kristensen	29921ee987	nir: Add new 'plane' texture source type This will be used to select the plane to sample from for planar textures. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-24 10:14:56 -07:00
Brian Paul	39b7b8b906	mesa: log buffer ID numbers in decimal, not hexadecimal All the other error messages use decimal. Let's be consistent. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 10:26:26 -06:00
Brian Paul	ce1cc70e27	mesa: use enum name in bind_buffer_object() error message Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 10:26:26 -06:00
Brian Paul	55c19527a6	mesa: raise error for glEnable(GL_VERTEX_ARRAY), etc. in core profile Otherwise, if the call executes normally we'll hit an assertion later in the VBO code when we draw something. Note that these cases were already handled correctly for the glIsEnabled() function (and the API checks were copied from there). Tested with new piglit gl-3.1-enable-vertex-array test. v2: fix compat/es mix-up, per Ilia. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-24 10:26:26 -06:00
Nicolas Boichat	a9b2b5e241	docs/egl: Android platform can also be built using autotools We added support for building for Android using autotools (configure); update the documentation to reflect that. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-24 16:24:54 +01:00
Juan A. Suarez Romero	e79aa19d88	i965: fix double-precision vertex inputs measurement For double-precision vertex inputs we need to measure them in dvec4 terms, and for single-precision vertex inputs we need to measure them in vec4 terms. For the latter case, we use the type_size_vec4() function. For the former case, we had a wrong implementation based on type_size_vec4(). This commit introduces a proper type_size_dvec4() function, which we use to measure vertex inputs. Measuring double-precision vertex inputs as dvec4 is required because ARB_vertex_attrib_64bit states that these use the same number of locations as the single-precision version. That is, two consecutive dvec4s would be located in location "x" and location "x+1", not "x+2". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-24 10:06:29 +02:00
Ilia Mirkin	ccd58015a2	docs: true up nvc0 status - images, etc Images aren't supported on maxwell, but neither is tessellation. Don't overly confuse matters by trying to expose those subtleties in the GL3.txt file/relnotes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Dave Airlie <airlied@redhat.com>	2016-05-23 23:47:11 -04:00
Ilia Mirkin	856587909c	st/mesa: enable ARB_ES3_1_compatibility when ES 3.1 would be exposed Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 23:47:11 -04:00
Ilia Mirkin	5878254545	mesa: remove separate enable for KHR_robust_buffer_access_behavior This extension appears to be a strict subset of the ARB version. Also remove it from GL3.txt since it doesn't seem relevant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 23:47:11 -04:00
Timothy Arceri	72449c477e	glsl: add support for explicit components to frag outputs V2: fix error checking for arrays and components. V1 was only taking into account all the array elements and all the components of one of the varyings during the comparison and treating the other as a single slot/component. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 12:46:48 +10:00
Ilia Mirkin	37266dfb7c	mesa: add view classes for 3d astc formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 22:34:37 -04:00
Ilia Mirkin	979bcb9f42	glsl: add EXT_clip_cull_distance support based on ARB_cull_distance Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 22:22:06 -04:00
Ilia Mirkin	f236f1f506	nvc0: expose robust buffer access We apparently pass all the relevant CTS tests. There are probably some shortcomings, but they can be addressed down the line. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 22:22:05 -04:00
Jason Ekstrand	9f5ccaf4dc	i965: Use ISL for surface format introspection With this, we can delete the surface format table in brw_surface_formats.c because all of the information we need is now in ISL.	2016-05-23 19:12:34 -07:00
Jason Ekstrand	d68acde1cb	anv/formats: Use isl_format_supports* for format introspection	2016-05-23 19:12:34 -07:00
Jason Ekstrand	7374d006b6	isl: Add per-gen format introspection This is just a copy-and-paste from brw_surface_formats.c. For the supports_vertex_fetch function, we do a bit more work so that it properly handles Bay Trail.	2016-05-23 19:12:34 -07:00
Jason Ekstrand	03a82dc5d1	isl: Add the ISL_FORMAT_R32G32_FLOAT_LD format	2016-05-23 19:12:34 -07:00
Jason Ekstrand	35a514e6ff	isl: Add support for querying the string name of a format	2016-05-23 19:12:34 -07:00
Jason Ekstrand	75d10dff0b	i965: Enable ARB/KHR_robust_buffer_access_behavior on BYT and HSW+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	1a092fcf3b	main: Add extension enable bits for KHR_robust_buffer_access_behavior Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	66e137ecf1	nir/lower_samplers: Protect against sampler index overflow Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	27b9481d03	glsl: Add an option to clamp block indices when lowering UBO/SSBOs This prevents array overflow when the block is actually an array of UBOs or SSBOs. On some hardware such as i965, such overflows can cause GPU hangs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ac242aac3d	glsl/linker: Add a helper variable for compiler options Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	aec10a1d5b	i965/draw: Use the real size for index buffers Previously, we were using the size of the whole BO which may be substantially larger than the actual index buffer size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	7c8dfa78b9	i965/draw: Use the real size for vertex buffers Previously, we were using the size of the BO which may be substantially larger than the actual vertex buffer size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a643bc6246	i965/draw: Use 3-channel formats for vertex fetch when possible. For a long time, several of the 3-channel vertex formats didn't exist so we faked them with 4-channel versions. Starting with Sandy Bridge, we can use R16G16B16_FLOAT and 8 and 16-bit integer formats become available on Haswell and Bay Trail. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ab3d8d5ea4	i965/surface_formats: Update the VB column for new formats added on BYT Bay Trail and Haswell added a bunch of new vertex formats. There was also the addition of 64-bit passthrough formats for BDW+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	d5b4ab2c5f	i965/draw: Properly handle rounding when dividing by InstanceDivisor The old code always divided rounded down and then subtracted 1. What we wanted was to divide rounded up and then subtract 1 which is equivalent to subtracting 1 and then dividing rounded down. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ad42ab473c	i965/draw: Account for BaseInstance in VBO bounds Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	ad3deec8ca	i965/draw: Use worst-case VBO bounds if brw->num_instances == 0 Previously, we only handled the "I don't know what's going on" case for things with InstanceDivisor == 0. However, in the DrawIndirect case we can get num_instances == 0 and we don't know what's going on with the instanced ones either. This commit makes the worst-case bound the default and then conservatively tightens the bound. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	8892519751	i965/draw: Delay when we get the bo for vertex buffers The previous code got the BO the first time we encountered it. However, this can potentially lead to problems if the BO is used for multiple arrays with the same buffer object because the range we declare as busy may not be quite right. By delaying the call to intel_bufferobj_buffer, we can ensure that we have the full range for the given buffer. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a01a1eb9e4	i965/draw: Stop relying on min_index == -1 for invalid index bounds The vbo layer passes an index_bounds_valid flag that we should be using instead. This also fixes a bug when min_index == -1 and basevertex != 0 where we were actually comparing min_index + basevertex == -1 which was false and we were getting the wrong buffer-sizing path. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	a7011922f1	vbo: Declare the index range invalid for DrawTransformFeedback Right now, we're setting the range to [0, 0] which is obviously bogus. Instead, we should set it to be invalid like we do for DrawIndirect. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Jason Ekstrand	df6ec2aba5	vbo: Declare the index range invalid for DrawIndirect Right now, we're just setting the range to [0, MAX_UINT32] which, while correct isn't helpful. With DrawIndirect, you can't really know what the actual range is so we may as well flag it as being an invalid range. This is what we do for draws with index buffer which is similar (the indices aren't statically known) if a bit simpler. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-23 19:12:34 -07:00
Ilia Mirkin	21f3df0820	mesa/teximage: fix GL_FLOAT in comment Noticed by Brian. Trivial. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 21:44:41 -04:00
Timothy Arceri	2d9308012c	glsl: fix explicit location validation for doubles Previously we would fail to find a match for the second half of a dvec4 as 'i' would get incremented to 1 before we added the var to the array at component 0. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-24 11:30:51 +10:00
Dave Airlie	33397bf7fd	docs: update ARB_cull_distance status. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:58 +10:00
Dave Airlie	5c10d47bae	st/mesa: reenable culling Now the lowering pass is fixed, reenable ARB_cull_distance. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:54 +10:00
Dave Airlie	a88c5d7e55	i965: reenable ARB_cull_distance. Now the lowering pass is fixed we can reenable culling. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Dave Airlie	a08c4ebbe8	glsl: rewrite clip/cull distance lowering pass The last version of this broke clipping, and I had to spend some time getting this working properly. I had to introduce a third pass to count the clip/cull totals, all due to one messy corner case. We have a piglit test tes-input-gl_ClipDistance.shader_test that doesn't actually output the clip distances, it just passes them like a varying from TCS->TES, the older lowering pass worked but to lower clip/cull we need to know the total number of clip+culls used to define the new variable correctly, and to offset culls properly. This adds an extra pass that works out the sizes for clip/cull, then lowers gl_ClipDistance then gl_CullDistance into the new gl_ClipDistanceMESA. The pass checks using the fixed array sizes code if the array has been referenced, or is actually never used, and ignores it in the latter case. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Dave Airlie	8c628ab13e	glsl: make max array trackers ints and use -1 as base. (v2) This fixes a bug that breaks cull distances. The problem is the max array accessors can't tell the difference between a never-accessed unsized array and an unsized array accessed at location 0. This leads to converting an undeclared unused gl_ClipDistance inside or outside gl_PerVertex to a size 1 array. However we need the number of active clip distances to work out the starting point for the cull distances, and this offset by one when it's not being used isn't possible to distinguish from the case where only the first element is accessed. I tried to use ->used for this, but that doesn't work when gl_ClipDistance is part of an interface block. So this changes things so that max_array_access is an int and initialised to -1. This also allows unsized arrays to proceed further than they could before, but we really shouldn't mind as they will get eliminated if nothing uses them later. For initialised uniforms we no longer change their array size at runtime, if these are unused they will get eliminated eventually. v2: use ralloc_array (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 11:27:29 +10:00
Nanley Chery	2ae493d686	anv/formats: Make alpha blending a property of render targets In agreement with the SNB PRM, alpha blending is a property that render targets may or may not support. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 17:26:17 -07:00
Nanley Chery	9721be6681	i965: Unset alpha blend for R10G10B10_SNORM_A2_UNORM This format does not support alpha blending, according to the SNB PRM. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 17:26:17 -07:00
Dave Airlie	8b89c92ef6	i965: deindent blorp code. gcc6 warns about this. Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 10:14:31 +10:00
Dave Airlie	e257284481	glsl: reindent line in ast_function.cpp This fixes a warning with gcc -Wmisleading-indentation. Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-24 10:14:31 +10:00
Ilia Mirkin	82d756f3af	mesa: allow GL_FRAMEBUFFER_DEFAULT_LAYERS to be queried with ES geometry When we have the geometry extensions, enable querying of the new param. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 20:03:40 -04:00
Ilia Mirkin	2dabd49704	mesa: allow xfb to be active in GLES when geometry shader is enabled. OES_geometry_shader has wording to allow xfb when using Draw*Indirect and DrawElements. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-23 20:03:20 -04:00
Ilia Mirkin	2e8e1e8909	main: check driver float texture support before upgrading to 16F/32F When passing in GL_RGBA or other base formats, we will try to upgrade the format to whatever the passed in type was. However not all drivers (notably nv30) support 32F textures, and so this would lead to crashes down the line. Only upgrade when the relevant extensions are available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-23 20:00:39 -04:00
Ilia Mirkin	1e99a46b44	st/mesa: update inst->info along with inst->op Otherwise we still have TGSI_OPCODE_CMP's info, which causes a number of later logic to go wrong. This fixes dEQP-GLES2.functional.shaders.functions.control_flow.return_in_if_vertex on nv30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-23 19:58:53 -04:00
Bas Nieuwenhuizen	533d1e9085	glsl: Use correct mode for split components. The mode should stay the same as the original struct. In particular, shared should not be changed to temporary. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-24 09:55:38 +10:00
Kenneth Graunke	1c1873b93b	mesa: Implement glGet*(GL_PRIMITIVE_RESTART_FOR_PATCHES_SUPPORTED). Technically, this was introduced with GL 4.4. However, I believe it was intended to be retroactive. As far as I know, AMD has never supported primitive restart with patches, while NVidia and Intel do. This necessitated the need for a query which would allow applications to figure out whether this was usable or not. I decided to expose it everywhere ARB_tessellation_shader is exposed. (It's also in both OES and EXT_tessellation_shader.) Enable this for i965 and Gallium drivers which expose the capability. v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin). Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-23 16:44:22 -07:00
Kenneth Graunke	70048eb1e3	gallium: Add a pipe cap for whether primitive restart works for patches. Some hardware supports primitive restart on patch primitives, and other hardware does not. Modern GL and ES include a query for this feature; adding a capability bit will allow us to answer it. As far as I know, AMD hardware does not support this feature, while NVIDIA and Intel hardware does. However, most Gallium drivers do not appear to support tessellation shaders yet. So, I've enabled it for nvc0 and disabled it everywhere else. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-23 16:44:11 -07:00
Francisco Jerez	015035027b	i965/fs: Mark UBO uniform pull constant loads as force_writemask_all. This lets the rest of the backend know that the uniform pull constant load opcodes don't respect channel enables -- Without this the register allocator has no way to know that the return payload of a pull constant load is not per-channel and spills of the destination will be broken under non-uniform control flow. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:07:23 -07:00
Francisco Jerez	7eb4966887	i965/fs: Allow spilling of non-contiguous registers. This should be working fine now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	6fc5dd5b6a	i965/fs: Calculate the (un)spill block size correctly. Currently the spilling code attempts to guess the scratch message block size from the dispatch width of the shader, which is plain wrong for SIMD-lowered instructions (frequently but not exclusively encountered in SIMD32 shaders) or for instructions with register region data types of size other than 32 bit. Instead try to use the SIMD component size of the instruction which in some cases will allow the dataport to apply the correct channel mask to the scratch data read or written. In the spill case the block size needs to be clamped to the number of MRF registers reserved for spilling. In the unspill case I didn't even bother because we currently have no 100% accurate way to determine whether a source region is per-channel or whether it contains things like headers that don't respect channel boundaries -- That's fine, because the unspill is marked force_writemask_all we can just use the largest allowable scratch message size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	11260cc54f	i965/fs: Set exec_all on spills not matching the channel layout of the instruction. This prevents the application of an incorrect channel mask by the scratch write instruction for spilled variables that don't have an exact one-to-one correspondence between channels of the variable and 32-bit components of the scratch write instruction. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:21 -07:00
Francisco Jerez	bb67c467a4	i965/fs: Set exec_all on unspills. This makes sure that unspills restore the exact contents of the variable in scratch space into the GRF without applying channel masking, which is incorrect under control flow for things like message headers or vectors of heterogeneous types that don't properly respect channel boundaries. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	07e67cc266	i965/fs: Move scratch block size calculation into the caller of emit_(un)spill. This makes emit_(un)spill even more stupid by removing the logic that decides what execution size each scratch read or write send message should have and instead relying on the caller to specify an appropriate execution size via the builder argument. This makes sense because the caller will need to act differently based on the scratch message width (e.g. emit an additional unspill before the instruction if the execution width and channel layout of the spill doesn't match the instruction's). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	284c8fbcef	i965/fs: Make emit_spill/unspill static functions taking builder as argument. This seems cleaner than exposing an implementation detail of brw_fs_reg_allocate.cpp to the world, and will give the caller control over the instruction execution flags (e.g. force_writemask_all) that are applied to the scratch read and write instructions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	70023c40c6	i965/fs: Apply execution controls from the instruction to scratch messages. Until now the execution controls (e.g. channel group, force_writemask_all, exec_size) of the instruction had been completely ignored by spilling, even though that can lead to a mismatch between the channel mask applied to the contents of the (un)spilled memory and the GRF source or destination of the instruction. In some cases we'll actually want the (un)spill messages to be marked force_writemask_all regardless of whether the instruction has it set, but that will have to be handled specially by the caller. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	e98cf03114	i965/fs: Fix signedness of local variables and arguments of emit_(un)spill. To avoid some spurious warnings about comparison signedness in the following commits. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Francisco Jerez	f471d3eede	i965/fs: Factor out calculation of the block of MRFs reserved for spilling. And as we're at it fix the calculation to allocate a larger block of registers for 32-wide dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 14:05:20 -07:00
Plamena Manolova	21edd24c0d	egl: Add OpenGL_ES to API string regardless of GLES version According to the EGL specifications eglQueryString(EGL_CLIENT_APIS) should return a string containing a combination of "OpenGL", "OpenGL_ES" and "OpenVG", any other values would be considered invalid. Due to this when the API string is constructed, the version of GLES should be disregarded and "OpenGL_ES" should be attached once instead of "OpenGL_ES2" and "OpenGL_ES3". Fixes: dEQP-EGL.functional.negative_api* and dEQP-EGL.functional.query_context.simple.query_api Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-23 13:46:01 -07:00
Rob Clark	46ff17559b	freedreno/ir3: disable cp for indirect src's The variable-indexing tests always had a few random fails, which I usually couldn't reproduce when running tests manually. Somehow recently this got a lot worse. I ported a couple of the shaders to GLES to see what blob does, and it also seems to be avoiding to cp indirect srcs. So I guess indirect w/ instructions other than cat1 (mov) are not totally reliable. Let's just switch that off until this is better understood. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-23 15:57:13 -04:00
Samuel Pitoiset	c3c4370299	nvc0: do not invalidate compute constbufs on Kepler Constbufs are only aliased on Fermi and this will reduce the number of flushes when we switch between 3d and compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-23 20:56:29 +02:00
Rob Clark	5245d845b6	nir/validate: fix null deref coverity warning CID 1265536 (#1 of 2): Explicit null dereferenced (FORWARD_NULL)6. var_deref_op: Dereferencing null pointer parent. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-23 10:14:50 -04:00
Nicolas Boichat	0cbc90c57c	mesa: dri: Add shared glapi to LIBADD on Android /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without this, loading the so file fails with: cannot locate symbol "__emutls_v._glapi_tls_Context" On non-Android (non-bionic) platforms, EGL uses the following workflow, which works fine: dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL); dlopen("dri/<driver>_dri.so", RTLD_NOW | RTLD_GLOBAL); However, bionic does not respect the RTLD_GLOBAL flag, and the dri library cannot find symbols in libglapi.so, so we need to link to libglapi.so explicitly. Android.mk already does this. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: s/explicitely/explicitly/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 13:25:51 +01:00
Nicolas Boichat	27d713a004	configure.ac: Add support for Android builds Add support for EGL android platform. Also, detect when --host finishes with -android. In that case, we do not set _GNU_SOURCE, and define autoconf symbol HAVE_ANDROID, so that Android-specific workarounds can be applied. Signed-off-by: Nicolas Boichat <drinkcat@google.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil Velikov: Rebase on top of HAVE_EGL_PLATFORM_NULL removal] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 13:23:39 +01:00
Emil Velikov	960d854a98	anv: remove define _DEFAULT_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Emil Velikov	1b64d1247d	gbm: remove define _DEFAULT_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Emil Velikov	efe4beb717	gbm: remove define _BSD_SOURCE The build systems already add this as applicable. There's no need to have this in the source file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:09:11 +01:00
Jiri Slaby	a6ce91fe52	glxcmds: glXGetFBConfigs, fix screen bounds Bounds of screen are 0 (inclusive) and ScreenCount(dpy) (exclusive). The upper bound was wrongly treated as ScreenCount(dpy) (inclusive). This causes a crash invoked by java3d which passes down an invalid screen: 6 0x00007f0e5198ba70 in <signal handler called> () at /lib64/libc.so.6 7 0x00007f0e14531e14 in glXGetFBConfigs (dpy=<optimized out>, screen=1, nelements=nelements@entry=0x7f0dab3c522c) at glxcmds.c:1660 8 0x00007f0e14532f7f in glXChooseFBConfig (dpy=<optimized out>, screen=<optimized out>, attribList=0x7f0dab3c54e0, nitems=0x7f0dab3c535c) at glxcmds.c:1611 9 0x00007f0e1478d29b in find_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 10 0x00007f0e1478d3dc in find_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 11 0x00007f0e1478d567 in find_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 12 0x00007f0e1478d728 in find_DB_AA_S_S_FBConfigs () at /usr/lib64/libj3dcore-ogl.so 13 0x00007f0e1478d97c in Java_javax_media_j3d_X11NativeConfigTemplate3D_chooseOglVisual () at /usr/lib64/libj3dcore-ogl.so While ScreenCount(dpy) is actually 1: (gdb) p dpy->nscreens $2 = 1 screen=1 is passed to glXGetFBConfigs. Fix this typo in glXGetFBConfigs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95456 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:47 +01:00
Elie TOURNIER	0f738fa23e	doxygen: Add missing modules to Windows runner Acked-by: Rhys Kidd <rhyskidd@gmail.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	793574afad	egl: add missing link against $(CLOCK_LIB) Some platforms require separate library in order to resolve the clock_gettime() symbol. Add the link or the build will fail. Fixes: `70299474f5` ("egl: add EGL_KHR_reusable_sync to egl_dri") Cc: Dongwon Kim <dongwon.kim@intel.com> Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	d67e757d11	egl: android: remove explicit glFlush call The DRI flush extension should already do the same thing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	9b3c7481c6	egl: android: drop dri2_create_image_android_native_buffer argument The drv is no longer used/needed as of last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	38ef6f5f60	egl: android: directly use dri2_create_image_dma_buf() Make the function non static so that we can use it directly from the android platform code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Herring <robh@kernel.org>	2016-05-23 12:07:47 +01:00
Emil Velikov	2cd687ce97	configure.ac: error out when building from git without python3 Bail early, as opposed to later on during the build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	a155cdaace	vl/drm: don't call close(-1) in vl_drm_screen_create error path Analogous to previous commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:47 +01:00
Emil Velikov	ed3f6ccce0	st/xa: don't call close(-1) in xa_tracker_create error path Analogous to previous commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:46 +01:00
Emil Velikov	6e00a1e6cb	st/dri: don't call close(-1) in dri{2, kms_}_init_screen error path Add separate labels and jump to the correct one as needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-05-23 12:07:46 +01:00
Eric Engestrom	7362bb3e21	vk/intel: use negative VK_NO_PROTOTYPES scheme `3d0fac7aca` changed all VK_PROTOTYPES to VK_NO_PROTOTYPES This brings the Intel header in line with the rest of the Vulkan code. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-05-23 12:07:46 +01:00
Rob Herring	8aeb6d768b	gbm: Add map/unmap functions This adds map and unmap functions to GBM utilizing the DRIimage extension mapImage/unmapImage functions or existing internal mapping for dumb buffers. Unlike prior attempts, this version provides a region to map and usage flags for the mapping. The operation follows the same semantics as the gallium transfer_map() function. This was tested with GBM based gralloc on Android. Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: drop no longer relevant hunk from commit message.] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	1f4869a208	configure.ac: add pthreadstubs support Add pthreadstubs to avoid pulling in full pthreads library. GBM will be the first user. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	0a4275b534	gbm: rename gbm_dri_bo_{map,unmap} to gbm_dri_bo_{map,unmap}_dumb In preparation to add public map/unmap functions, rename the existing gbm_dri_bo_{map,unmap} functions to indicate that they are only for dumb buffers. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:46 +01:00
Rob Herring	e8431a630d	st/dri: Add support for DRIimage extension mapImage/unmapImage Implement support for mapImage/unmapImage functions in version 12 of the DRIimage extension. Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: align/indent the map/unmap vfuncs] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	a0f06f168f	DRI: Add DRIimage map and unmap functions Add mapImage and unmapImage functions to DRIimage extension for mapping and unmapping DRIimages for CPU access. The caller provides the region of the image to map and is returned a pointer to the beginning of the region and the stride (which could be different from the original). Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	bdfa635f72	gbm: Add Android build support In order to use libgbm for gralloc, add it to the Android build. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	64a005e3ee	gbm: add Android gallium_dri.so library loading support GBM needs the same special gallium_dri.so loading as EGL for Android, so copy over the same hunk from the EGL code. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:46 +01:00
Rob Herring	7d79eec456	gbm: split out source file to Makefile.sources In preparation to add Android build support, split out the source file lists to Makefile.sources Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net> [Emil Velikov: Whitespace cleanup.] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-23 12:07:46 +01:00
Rob Herring	fc1806e041	Android: Move setting DEFAULT_DRIVER_DIR to shared location Move the defining of DEFAULT_DRIVER_DIR path to a common location so both EGL and GBM can use it. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-23 12:07:45 +01:00
Emil Velikov	6ce11e7e2c	c11/threads: create mutexattrs only when needed If the mutexattrs are the default one can just pass NULL to pthread_mutex_init. As the compiler does not know this detail it unnecessarily creates/destroys the attrs. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-23 12:07:45 +01:00
Andres Gomez	4424bf5da4	configure: added xcb to dri3 modules to pkg-conf This fixes a recent linking error in libvulkan_common Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-05-23 11:21:34 +02:00
Juan A. Suarez Romero	3c9096eea4	glsl/linker: dvec3/dvec4 consume twice input vertex attributes From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): "A program with more than the value of MAX_VERTEX_ATTRIBS active attribute variables may fail to link, unless device-dependent optimizations are able to make the program fit within available hardware resources. For the purposes of this test, attribute variables of the type dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may count as consuming twice as many attributes as equivalent single-precision types. While these types use the same number of generic attributes as their single-precision equivalents, implementations are permitted to consume two single-precision vectors of internal storage for each three- or four-component double-precision vector." This commit makes dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 consume twice as many attributes as equivalent single-precision types. v3: count doubles as consuming two attributes (Dave Airlie) v4: make reference to spec (Michael Schellenberger Costa) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2016-05-23 10:48:07 +02:00
Francisco Jerez	b46867cd37	i965/fs: do not depend on std140 alignment rules for UBO loads The previous implementation relied on the std140 alignment rules to avoid handling misalignment in the case where we are loading more than 2 double components from a vector, which requires emitting a second load message. This alternative implementation deals with misalignment and is more flexible going forward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-05-23 08:56:57 +02:00
Iago Toral Quiroga	38b719d624	nir: handle double-precision in fsign, fsat, fnot and frcp I think these are not strictly necessary since the floats in them should be automatically promoted to doubles when operated with double sources, but it makes things more explicit at least. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-23 08:54:37 +02:00
Iago Toral Quiroga	3f73039ade	nir: handle double-precision in fabs, frsq and fsqrt Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-23 08:54:28 +02:00
Dave Airlie	3466db3969	glsl/parser: handle multiple layout sections with AST nodes. For geometry/compute inputs and tess control outputs, we create an AST node to keep track of some things. However if we have multiple layout sections, we don't ever link the node into the AST. This is because we create the node on the rightmost layout declaration and don't pass it back in so it gets linked at the end of the parsing of the rightmost. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:20:01 +10:00
Dave Airlie	aaa69c79cd	glsl: allow layout qualifier overrides with ARB_shading_language_420pack GLSL 4.20 allows overriding the layout qualifiers. This helps fix: GL45-CTS.shading_language_420pack.qualifier_override_layout Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	6f2dc0d044	subroutines: handle explicit indexes properly The code didn't deal with explicit function indexes properly. It also handed out the indexes at link time, when we really need them in the lowering pass to create the correct if ladder. So this patch moves assigning the non-explicit indexes earlier, fixes the lowering pass and the lookups to get the correct values. This fixes a few of: GL45-CTS.explicit_uniform_location.subroutine-index-* Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	5fe912831c	mesa/subroutines: fix reset on bindpipeline Fixes: GL45-CTS.shader_subroutine.subroutine_uniform_reset Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	7fa0250f94	mesa/subroutines: count number subroutines properly. The code was implementing the ACTIVE_SUBROUTINE_UNIFORMS incorrectly, using the number of types not the number of uniforms. This is different than the locations as the locations may be sparsely allocated. This fixes: GL43-CTS.shader_subroutine.four_subroutines_with_two_uniforms Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	22db9b10eb	mesa/subroutines: don't generate error in GetSubroutineIndex. GLSL spec says this doesn't generate an error. Fixes: GL45-CTS.explicit_uniform_location.subroutine-loc Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	3b8b6be7bb	glsl/ast: for geom shaders allow stream flags in input flags. This fixes: GL45-CTS.shader_subroutine.subroutines_with_separate_shader_objects Since we set the stream flags earlier on all geom shaders, we shouldn't fall over later if we find one. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	93b3b6af3c	glsl/linker: skip inactive explicit locations. This fixes a crash in: GL45-CTS.explicit_uniform_location.subroutine-loc-negative-link-max-num-of-locations Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	c714731653	glsl: fix subroutine uniform .length(). This fixes .length() on subroutine uniform arrays, if we don't find the identifier normally, we look up the corresponding subroutine identifier instead. Fixes: GL45-CTS.shader_subroutine.arrays_of_arrays_of_uniforms GL45-CTS.shader_subroutine.arrayed_subroutine_uniforms Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:57 +10:00
Dave Airlie	432ac19c1a	glsl/linker: link error on too many subroutine functions. This fixes: GL45-CTS.explicit_uniform_location.subroutine-index-negative-link-max-num-of-indices Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	18b0a13e80	glsl: produce a linker error for a subroutine uniform with no functions. If a subroutine uniform is declared with no functions backing it, that isn't legal, so we should fail to link. Fixes: GL43-CTS.shader_subroutine.subroutine_uniform_wo_matching_subroutines Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	b572b599ef	glsl: validate subroutine types match function signature. This fixes: GL43-CTS.shader_subroutine.subroutines_incompatible_with_subroutine_type It just makes sure the signatures match as well as the return types. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:19:56 +10:00
Dave Airlie	ba3414d832	arb_shader_subroutine: check active subroutine limit _mesa_GetActiveSubroutineUniformiv needs to check against the number of types here. Noticed while playing with ogl conform. Reviewed-by: Chris Forbes <chrisforbes@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 16:18:25 +10:00
Ilia Mirkin	74e71cbfcb	nv30: don't assert when running out of registers This happens with dEQP tests. The code doesn't at all protect against this condition, so while unhandled, this is an expected situation. Also avoid using more than the first 16 registers for nv3x vertex programs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 22:57:18 -04:00
Ilia Mirkin	36ff09cdfe	nouveau: allow allocating non-object-backed buffers On nv30, for example, there is no hardware index buffer support. So all of those will be created entirely in user memory. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 22:57:18 -04:00
Tobias Klausmann	96f390ff35	llvm/softpipe: Enable cull_distance as draw supports it. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:04:37 +10:00
Dave Airlie	e6d9389366	tgsi: remove culldist semantic. This isn't used anymore in the tree, culldist's are part of the clipdist semantic, we could in theory rename it, but I'm not sure there is much point, and I'd have to be careful with virgl. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:44 +10:00
Dave Airlie	d17062a40e	draw: stop using CULLDIST semantic. The way the HW works doesn't really fit with having two semantics for this. The GLSL compiler emits 2 vec4s and two properties, this makes draw use those instead of CULLDIST semantics. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:03:40 +10:00
Emil Velikov	bddb3b5375	virgl: remove unused state_tracker/graw.h include Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 11:02:17 +10:00
Dave Airlie	62c728f7d8	mesa/queryobject: return INVALID_VALUE if offset < 0 (v2) This fixes: GL45-CTS.direct_state_access.queries_errors The ARB_direct_state_access spec agrees. v2: move check down further (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-23 07:33:03 +10:00
Samuel Pitoiset	a7fad12931	nvc0/ir: fix indirect access for images When the array doesn't start at 0 we need to account for su->tex.r. While we are at it, make sure to avoid out of bounds access by masking the index. This fixes GL45-CTS.shading_language_420pack.binding_image_array. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 23:06:16 +02:00
Ilia Mirkin	cb9a51d1f6	nv30: reset the stencil mask when fast-clearing Apparently the stencil mask applies to clears on nv30/nv40. Reset it to 0xff before doing a stencil clear. This fixes gl-1.0-readpixsanity and a number of other piglit tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 14:48:56 -04:00
Ilia Mirkin	f57a8440d5	nv30,nv50: add PIPE_SHADER_CAP_PREFERRED_IR support The mesa state tracker has recently started to query this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-22 14:05:36 -04:00
Ilia Mirkin	9f19ccff9c	nvc0: fix setting of tess_mode in various situations This fixes a lot of INVALID_VALUE errors reported by the card when running dEQP tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Ilia Mirkin	d6edae7090	nv50/ir: fix prog info init Left over from the pre-mainline tess support. Adapt to use the new defines. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Ilia Mirkin	035b1097db	nvc0/ir: return 0 for gl_TessCoord.z for non-triangles modes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-22 11:58:22 -04:00
Matt Turner	bdc9c20df0	mesa: Unlock mutex on error path. Caught by Coverity (CID 1362021). Caused by commit `015f2207c`.	2016-05-22 07:01:35 -07:00
Timothy Arceri	a83e9afbe4	i965: remove redundant NULL check We would have segfaulted in the above code if prog could be NULL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-22 23:08:08 +10:00
Eduardo Lima Mitev	7dce4793b7	anv/nir_apply_pipeline_layout: Pass the nir_src from the nir_tex_src nir_instr_rewrite_src() expects a nir_src and it is currently being fed a nir_tex_src. This will crash something. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-21 19:57:31 +02:00
Samuel Pitoiset	30b93141aa	nvc0: expose GLSL version 420 on GF100 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:33:06 +02:00
Samuel Pitoiset	d04050071d	nvc0: enable ARB_shader_image_load_store on GF100 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:33:03 +02:00
Samuel Pitoiset	362e17a712	nvc0/ir: add a lowering pass for surfaces on Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:58 +02:00
Samuel Pitoiset	b663db44ba	nvc0/ir: add emission for SULDB and SUSTx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:56 +02:00
Samuel Pitoiset	cd88d1a171	nvc0/ir: add emission for OP_SULEA Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:54 +02:00
Samuel Pitoiset	8aa1fd321d	nv50/ir: fix tex constraints for surface coords on Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:49 +02:00
Ilia Mirkin	be4caaf247	nv50/ir: use moveSources to condense sources This makes sure that rIndirectSrc and other things stay updated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-21 18:32:46 +02:00
Samuel Pitoiset	879bd2ea0c	nvc0: bind images on fragment and compute shaders for Fermi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 18:32:41 +02:00
Samuel Pitoiset	e7d2ef42a5	nvc0/ir: don't check the format for surface stores on Kepler Initially to make sure the format doesn't mismatch and won't produce out-of-bounds access, we checked that both formats have exactly the same number of bytes, but this should not be checked for type stores. This fixes serious rendering issues in the UE4 demos (tested with realistic and reflections). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:50:28 +02:00
Samuel Pitoiset	5e32cc9192	nv50/ir: fix a comment in canDualIssue() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:50:25 +02:00
Samuel Pitoiset	70834d05cd	nv50/ir: fix SUSTx constraints on Kepler To prevent out-of-bounds access and format mismatch we add a predicate on sustp, but we have to account for it when the sources are condensed because a predicate is a source. Using the range 3:6 will only condense the input data and it's always the case. This also fixes constraints when an indirect access is used. This ensures that sources are correctly aligned. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-21 16:06:14 +02:00
Kenneth Graunke	9c0d16adc1	i965: Just read the existing tally on EndTransformFeedback if paused. If the transform feedback object is paused when ending, then there are no new snapshots to add to the tally. In fact, we haven't written a starting snapshot, so we'd best not try and compute (end - start). Just load the existing tally so we can convert it to the number of vertices written and store it to the final result location. This is the Haswell+ equivalent of the previous commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:42 -07:00
Kenneth Graunke	915f7c25fa	i965: Don't write a counter snapshot on EndTransformFeedback if paused. If the transform feedback object is paused, then we've already written an ending counter snapshot. We don't want to write another one. This fixes assertions in GL33-CTS.transform_feedback.api_errors_test, which calls EndTransformFeedback after PauseTransformFeedback. On the next BeginTransformFeedback, we tried to tally up the results, and saw an odd number of snapshots (due to the double-end), and tripped an assertion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:40 -07:00
Kenneth Graunke	47fbe178fa	mesa: Call TransformFeedback driver hooks before setting flags. This way, the driver's EndTransformFeedback() hook can tell whether the transform feedback operation was paused. It's also convenient to have Paused remain false until the driver's PauseTransformFeedback hook finishes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-20 19:55:26 -07:00
Kenneth Graunke	f7eb95a526	nir: Fix crash in nir_lower_wpos_center(). Otherwise we rewrote the fadd to use itself, causing crashes in validation. Instead, start after the last use like we should. A brown paper bag fix. Fixes crashes in several Vulkan tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 16:33:24 -07:00
Dave Airlie	0970c563d6	nir: remove dead glsl variables before lowering io. For cull distance GLSL will let unsized unused arrays get into the backend, we should nuke those straight away, to save caring about them later. This fixes: arb_separate_shader_objects/linker/large-number-of-unused-varyings as a side effect (even without culling changes). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 08:56:45 +10:00
Kenneth Graunke	de45da6a8c	spirv: Handle the PixelCenterInteger execution mode. This isn't allowed by Vulkan, but might be useful someday for SPIR-V in OpenGL (if that ever becomes a thing). It's easy enough to hook up, and as precedent, we already do so for OriginLowerLeft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 14:44:22 -07:00
Kenneth Graunke	9b8b3f7501	i965: Delete dead dFdy flipping code. Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case of what I expected, so we always take the negate_value case. It doesn't really matter. v2: Write src0 before src1 in ADD instructions (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	08bc74e694	i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height. Now that we handle flipping and other gl_FragCoord transformations via a uniform, these key fields have no users. This patch actually eliminates the associated recompiles. The Tomb Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable number. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:09 -07:00
Kenneth Graunke	dac10e8a13	i965, anv: Use NIR FragCoord re-center and y-transform passes. This handles gl_FragCoord transformations and other window system vs. user FBO coordinate system flipping by multiplying/adding uniform values, rather than recompiles. This is much better because we have no decent way to guess whether the application is going to use a shader with the window system FBO or a user FBO, much less the drawable height. This led to a lot of recompiles in many applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:30:08 -07:00
Kenneth Graunke	6e5d86c07a	nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers. nir_lower_wpos_ytransform() is great for OpenGL, which allows applications to choose whether their coordinate system's origin is upper left/lower left, and whether the pixel center should be on integer/half-integer boundaries. Vulkan, however, has much simpler requirements: the pixel center is always half-integer, and the origin is always upper left. No coordinate transform is needed - we just need to add <0.5, 0.5>. This means that we can avoid using (and setting up) a uniform. I thought about adding more options to nir_lower_wpos_ytransform(), but making a new pass that never even touched uniforms seemed simpler. v2: Use normal iterator rather than _safe variant (noticed by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:30:00 -07:00
Kenneth Graunke	12ab7fc6ac	nir: Don't use ffma in nir_lower_wpos_ytransform(). ffma is an explicitly fused multiply add with higher precision. The optimizer will take care of promoting mul/add to fma when it's beneficial to do so. This fixes failures on Gen4-5 when using this pass, as those platforms don't actually implement fma(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	b8b1b1c34c	nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform. These also need flipping! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	4b7577fad8	nir: Make lower_wpos_ytransform_block a void function. The return value was used for the old nir_foreach_block callback system, but at this point it no longer means anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	88ea960aa7	nir: Make nir_lower_wpos_ytransform() match FragCoord by location. gl_FragCoord is a shader input with location == VARYING_SLOT_POS. ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS, but it isn't called gl_FragCoord. We do want to transform it. Matching by location guarantees we catch both. Fixes several fp tests on a branch which uses this pass on i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	c9192fcbd2	nir: Add interp_var_at_offset flipping. The Y-offset needs flipping as well, similar to ddy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	287f099db1	nir: Fix fddy swizzles in nir_lower_wpos_ytransform(). The original value might have been swizzled. That's taken care of in the fmul source - we don't want to reswizzle it again. Fixes validation failures in glsl-derivs-varyings on a branch of mine which uses this pass in i965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:29:04 -07:00
Kenneth Graunke	7fe9a19302	nir: Fix wpos_ytransform lowering state_slot swizzle. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-20 14:28:30 -07:00
Kenneth Graunke	1539009bf0	i965: Fix brw_regs_equal() for NaN and positive/negative zero. We'd like the comparisons to mean "the exact same bits". Comparing doubles won't do that for NaN values or positive vs. negative zero. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 14:28:06 -07:00
Dave Airlie	b19a0d506d	virgl: handle cull distance cap. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:19:54 +10:00
Rob Herring	2235b80f2a	virgl: Add missing texture transfer_inline_write transfer_inline_write cannot be NULL and the virgl renderer doesn't support inline writes for textures, so add the default version. This fixes a crash in st_TexSubImage since commit `fb9fe352ea` ("st/mesa: use transfer_inline_write for memcpy TexSubImage path"). Cc: Marek Olšák <marek.olsak@amd.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-21 06:07:18 +10:00
Kristian Høgsberg Kristensen	12dc89d844	anv: Merge in my TODO list items Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2016-05-20 10:35:57 -07:00
Matt Turner	015f2207cf	mesa: Replace uses of Shared->Mutex with hash-table mutexes We were locking the Shared->Mutex and then calling functions like _mesa_HashInsert that do additional per-hash-table locking internally. Instead just lock each hash-table's mutex and use functions like _mesa_HashInsertLocked and the new _mesa_HashRemoveLocked. In order to do this, we need to remove the locking from _mesa_HashFindFreeKeyBlock since it will always be called with the per-hash-table lock taken. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	aded1160e5	hash: Add _mesa_HashRemoveLocked() function. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-20 10:05:09 -07:00
Matt Turner	fb5dcb81cc	i965: Pass nir_src/nir_dest by reference. Cuts 6K of .text. text data bss dec hex filename: 5772372 264648 29320 6066340 5c90a4 lib/i965_dri.so (before); 5766074 264648 29320 6060042 5c780a lib/i965_dri.so (after) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 10:04:06 -07:00
Mark Janes	9ca5ec2a31	glsl: Guard against NULL dereference This trivially corrects mesa `3ca1c221`, which introduced a check that crashes when a match is not found. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-20 09:52:49 -07:00
Nanley Chery	9b8c4000d0	anv: Enable textureCompressionASTC_LDR on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	0d2847e177	anv/format: Reorder ASTC mappings to match ISL enum ordering Keep the lists consistent for ease of use. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	f3ed3a0a15	genxml: Expand SKL's SurfaceFormat field width for ASTC In the expanded field, only ASTC format enums have the MSB set to 1. Expanding the field width makes the process of handling these formats identical to the way other formats are handled. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	a141576887	isl: Handle npot ASTC block dimensions on Gen9+ Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Nanley Chery	de86fb875d	isl: Add 2D ASTC format layouts and enums Also, make changes needed for successful compilation and registration as a texture compression mode. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-20 09:27:11 -07:00
Youry Metlitsky	4e2c9a0435	mesa: Build EGL without X11 headers after interop patchset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-20 08:44:18 -07:00
Rob Clark	df361fc58c	nir/validate: assume() that hashtable entry exists At this point, it would require a logic error in nir_validate to not have already populated this hashtable entry, but coverity doesn't realize that: CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)3. dereference: Dereferencing a null pointer entry. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	fcd6b3f42b	nir: coverity uninitialized pointer read Not sure how coverity arrives at the conclusion that we can read comp[j] uninitialized (around line 204), other than not being aware that ncomp is greater than 1 so it won't underflow in the 'if (tex->is_array)' case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	53c48feae0	nir: coverity sign-extension fix Not 100% sure, but I think being an unsigned literal will help: CID 1358505 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension: load1->def.num_components with type unsigned char (8 bits, unsigned) is promoted in load1->def.num_components * (load1->def.bit_size / 8) to type int (32 bits, signed), then sign-extended to type unsigned long (64 bits, unsigned). If load1->def.num_components * (load1->def.bit_size / 8) is greater than 0x7FFFFFFF, the upper bits of the result will all be 1. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	bb993da795	nir/glsl_to_nir: quell some uninit_member coverity errors Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Matt Turner <mattst88@gmail.com>	2016-05-20 11:13:50 -04:00
Rob Clark	3a1bbd6a0a	freedreno/ir3: need to lower fmod too Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-20 11:13:50 -04:00
Mark Janes	a2d28ddc01	i965: Fix strerror error code sign This trivial fix to error-handling corrects the sign of drm error codes before passing them to strerror. Identified by Coverity: CID1358581	2016-05-20 05:58:18 -07:00
Jason Ekstrand	eb384daae8	nir/spirv: Handle the NonReadable decoration on struct members	2016-05-19 21:18:59 -07:00
Jason Ekstrand	ea8c11fdc2	anv/pipeline: Bounds-check resource indices when robust_buffer_access is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	902628bce6	anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled	2016-05-19 21:18:59 -07:00
Jason Ekstrand	23090b51e0	anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment Originally we removed the instruction, changed the source, and then re-inserted it. This works, but nir_instr_rewrite_src is a bit more obviously correct.	2016-05-19 21:18:59 -07:00
Jason Ekstrand	c29ffea6d1	anv/device: Add a boolean for robust buffer access	2016-05-19 21:18:59 -07:00
Jason Ekstrand	d5b4638d6a	anv: Add a TODO file	2016-05-19 20:09:31 -07:00
Dave Airlie	3ca1c2216d	glsl: handle same struct redeclaration (v2) This works around a bug in older version of UE4, where a shader defines the same structure twice. Although we aren't sure this is correct GLSL (it most likely isn't) there are enough UE4 based things out there we should deal with this. This drops the error to a warning if the struct names and contents match. v1.1: do better C++ on record_compare declaration (Rob) v2: restrict this to desktop GL only (Ian) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-20 11:22:52 +10:00
Matt Turner	8a65b5135a	i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz. Ken suggested instead of a big and complicated optimization pass, to just recognize the operations here. It's certainly less code and a lot prettier, but it seems to actually perform worse for currently unknown reasons. total instructions in shared programs: 8923452 -> 8904108 (-0.22%) instructions in affected programs: 814563 -> 795219 (-2.37%) helped: 3336 HURT: 10 total cycles in shared programs: 66970734 -> 66651476 (-0.48%) cycles in affected programs: 10582686 -> 10263428 (-3.02%) helped: 2438 HURT: 691 total spills in shared programs: 1811 -> 1789 (-1.21%) spills in affected programs: 85 -> 63 (-25.88%) helped: 4 total fills in shared programs: 3143 -> 3109 (-1.08%) fills in affected programs: 167 -> 133 (-20.36%) helped: 4 LOST: 2 GAINED: 36 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	75dccf5ac2	i965: Add infrastructure for sample lod-zero operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Matt Turner	07353599e0	i965/fs: Add and use get_nir_src_imm(). The next patch wants to inspect the LOD argument and do something different if it's 0.0f. But at that point we've emitted a MOV for it and we just have a register to look at. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-19 17:27:49 -07:00
Ilia Mirkin	8bf5493899	nvc0: account for shader-allocated local memory needs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-19 20:20:23 -04:00
Ilia Mirkin	5c6b8cc7d0	nv50/ir: treat addresses as local Address registers are always loaded right before use. Don't treat them as "global", which will cause them to be put into the function's linkage, and will make the register allocator hold onto that register until the end of the function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-19 20:20:23 -04:00
Tim Rowley	65c2abf6fd	swr: [rasterizer] utility functions for shared libs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:18 -05:00
Tim Rowley	6deb9f7f2c	swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD llvm changed the mask type to vector of ints with 3.8. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:12 -05:00
Tim Rowley	600528168b	swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:06 -05:00
Tim Rowley	6d212cccf0	swr: [rasterizer jitter] add instancing to non-gather fetch path Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:27:01 -05:00
Tim Rowley	63d7ed835a	swr: [rasterizer core] move MultisampleTrait static from header to cpp Move a MultisampleTrait static from header to cpp as clang seemed to get confused with some specializations in the header vs some in cpp. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:54 -05:00
Tim Rowley	c969ef2d42	swr: [rasterizer core] clang override for _mm_undefined* Not supported in older xcode versions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:49 -05:00
Tim Rowley	da75160039	swr: [rasterizer common] add OSX to unix portability sections Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:44 -05:00
Tim Rowley	4997169779	swr: [rasterizer] rename _aligned_malloc to AlignedMalloc Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:38 -05:00
Tim Rowley	2e4ef23523	swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:30 -05:00
Tim Rowley	aebbd2f7dd	swr: [rasterizer common] guard definition of __cdecl/__stdcall Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:24 -05:00
Tim Rowley	82e335ce67	swr: [rasterizer common] include cstddef for offsetof Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:19 -05:00
Tim Rowley	759d8cf3a3	swr: [rasterizer core] removed tabs that snuck in Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:14 -05:00
Tim Rowley	8e39d410f1	swr: [rasterizer core] code style cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:08 -05:00
Tim Rowley	b914217c25	swr: [rasterizer core] add dummy code for cygwin build Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:26:02 -05:00
Tim Rowley	a0747c4ce3	swr: [rasterizer core] move variable query outside loop Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:54 -05:00
Tim Rowley	f2a1f894ba	swr: [rasterizer core] utility function for getenv Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:48 -05:00
Tim Rowley	4a58b21ef7	swr: [rasterizer common] portable threadviz buckets Output with slashes instead of backslashes for unix/linux. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:30 -05:00
Tim Rowley	2031baffb5	swr: [rasterizer common] foreground win32 assert dialog Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:24 -05:00
Tim Rowley	33d4c2c798	swr: [rasterizer core] use parens to disambiguate operator precedence Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 16:25:06 -05:00
Tim Rowley	9475251145	swr: standardize linkage and check for unresolved symbols Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	6423004d85	swr: fix swr linkage so that static llvm works Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	8987460b9e	swr: PIPE_CAP_CULL_DISTANCE cap request response Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 13:27:33 -05:00
Tim Rowley	78572c9b0b	docs: add swr to GL3.txt v2: not on gl3.3 list until gl3.2 is complete Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-19 13:27:17 -05:00
Leo Liu	2f90d11d86	st/va: use drm render node for wayland display type With xwayland, vainfo uses VA_DISPLAY_WAYLAND by default and fails; it also fails when the display is specified with `vainfo --display wayland`. In fact, wayland support in libva uses the drm path to connect to the device, so the drm pipe loader should be used to create the screen. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-19 09:40:33 -04:00
Marek Olšák	f6742859b7	gallium/radeon: small cleanups in r600_texture_transfer_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	54737aabb9	gallium/radeon: don't set PB_USAGE in winsyses There is no point. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	f330b7a14f	gallium/radeon: handle VRAM_GTT placements as having slow CPU reads not sure if we should include GTT WC too Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	5e14d0ac2c	gallium/radeon: ignore PIPE_TRANSFER_MAP_DIRECTLY Only st/xa is using this, which is irrelevant to us. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Marek Olšák	51cf04cf0e	radeonsi: add a workaround for a bug in LLVM <= 3.8 This is not directly applicable to stable and needs to be backported. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-19 12:35:50 +02:00
Eduardo Lima Mitev	7671687713	i965/fs: Silence warnings related to use of uninitialized values brw_fs.cpp: In function ‘const unsigned int* brw_compile_fs(const [...] brw_fs.cpp:6093:64: warning: ‘simd16_grf_start’ may be used uninitialized [...] prog_data->base.dispatch_grf_start_reg = simd16_grf_start; brw_fs.cpp:5996:29: note: ‘simd16_grf_start’ was declared here uint8_t simd8_grf_start, simd16_grf_start; brw_fs.cpp:6094:52: warning: ‘simd16_grf_used’ may be used uninitialized [...] prog_data->reg_blocks_0 = brw_register_blocks(simd16_grf_used); brw_fs.cpp:5997:29: note: ‘simd16_grf_used’ was declared here unsigned simd8_grf_used, simd16_grf_used; (and more) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-19 09:05:18 +02:00
Eric Anholt	a507dcc160	vc4: Size transfer temporary mappings appropriately for full maps of 3D. We don't really support reading/writing of 3D textures since the hardware doesn't do 3D, but we do need to make sure that a pipe_transfer for them has enough space to store the image. This was previously not a problem because the state tracker only mapped a slice at a time until `fb9fe352ea`. Fixes glean glsl1 tests, which all have setup of a 3D texture at the start.	2016-05-18 17:30:07 -07:00
Nanley Chery	7ac08adfb4	anv/device: Fix viewportBoundsRange Align with the spec requirement that the range must be at least [−2 × maxViewportDimensions, 2 × maxViewportDimensions − 1]. Our hardware supports this. Fixes dEQP-VK.api.info.device.properties Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-18 16:01:50 -07:00
Dave Airlie	61b6789252	glsl/linker: attempt to match anonymous structures at link This is my attempt at fixing at least one of the UE4 bugs with GL4.3. If we are doing intrastage matching and hit anonymous structs, then we should do a record comparison instead of using the names. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-19 08:16:50 +10:00
Mark Janes	4dfa89e33c	anv/batch_chain: free pointers for error cases Trivial fix to improperly handled cleanup during VK_ERROR_OUT_OF_HOST_MEMORY. Identified by Coverity: CID 1358908 and 1358909 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-18 15:14:22 -07:00
Wang He	f21b7d1e5c	st/nine: Minor change to support musl libc A few changes to support musl libc as well. In particular fpu_control.h is glibc specific. fenv.h doesn't let us do exactly what we want either, so instead use assembly directly. Signed-off-by: Wang He <xw897002528@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	de39231134	st/nine: Enable D3DPMISCCAPS_PERSTAGECONSTANT Nine already supports the feature. There are no failing WINE tests for per stage constants. Enabling D3DPMISCCAPS_PERSTAGECONSTANT as it fixes https://github.com/iXit/Mesa-3D/issues/205 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	839f417634	st/nine: Turn on thread_submit by default when on different device The last remaining issues with thread_submit have been resolved, thus turn it on when on a different device (the case where it is beneficial). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	9cae3cdc89	st/nine: Fix usage of rasterizer multisample bit. pipe_rasterizer multisample bit should be enabled only when really wanting to do multisampling, thus we should disable when not having msaa render target. This fixes some depth calculation precision issues on radeon. Also disable it when depth and stencil tests are disabled, since in that case multisampling is same as not multisampled. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	f297e7de0f	st/nine: ATOC has effect only with ALPHATESTENABLE ATOC extension does something only when alpha test is enabled. Use a second bit to encode the difference with ATIATOC. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	edc5cdced5	st/nine: Add debug string for ATOC We were missing a debug string for this format. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	4e89dcf0c4	st/nine: Add asserts for output/input packing Nine doesn't support vs output/ps input packing. We haven't found any application requiring that, and implementing it properly is complex. Add asserts for now. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	aeddda0c3a	st/nine: Use correct PIPE_HANDLE_USAGE flag for frontbuffer copy When taking screenshots we do a copy from the frontbuffer to an allocated buffer (which we then copy to a ram buffer). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	ca7c78a88e	st/nine: Fix output shift calculation We were getting it wrong for negative values. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	b8d95d4087	st/nine: Fix CheckDeviceFormat advertising for surfaces Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	6ef231c80f	st/nine: Improve buffer placement Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	7639033973	st/nine: Fix buffer bind flags Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	0f6e31823d	st/nine: Fix buffer locking flags handling Our behaviour was not entirely similar to what the docs and our tests describe. Drop d3dlock_buffer_to_pipe_transfer_usage. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	f45b9894e5	st/nine: Improve logging Add missing DBG calls in dtors. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	f3fa7e3068	st/nine: Use WINE thread for threadpool Use present interface 1.2 function ID3DPresent_CreateThread to create the thread for threadpool. Creating the thread with WINE prevents some rarely occurring crashes. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	72be473ad1	st/nine: Don't present if window is occluded The problem is that if one d3d present call fails, because of our occlusion check in present method, the next presentation call will send the same pixmap to the Xserver again, without waiting for it to be released, which is wrong. Move the present call after occlusion check to return and prevent Xpixmaps errors. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	c673c46ccf	st/nine: Use new function to query for resolution mismatch Any third party app might change the current screen resolution. Poll for resolution mismatch to force a device reset. Required for non ex devices only. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Patrick Rudolph	dae9a91727	st/nine: Implement IPresent version 1.2 Implement presentation interface version 1.2: * ID3DPresent_ResolutionMismatch Poll for resolution mismatch. A third party app might have changed resolution, which requires a device reset. * ID3DPresent_CreateThread Create a thread in WINE to allow nine to use Windows API functions. Required for multi-threaded presentation. In single-threaded presentation mode the calling thread is already known to WINE. * ID3DPresent_WaitForThread Wait for a wine thread to terminate. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	2e149a2bf0	st/nine: Implement BumpEnvMap for ff Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	c4e85202cb	st/nine: Format conversion for volumes in UpdateTexture We were doing the conversion for surfaces, but not yet volumes. Now that volumes can do conversion, use it. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	23e2a235dc	st/nine: Remove one useless function output Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	10e548c0c9	st/nine: Add support for X8L8V8U8 X8L8V8U8 support should be common. Some more recent cards do support this format, but not L6V5U5. Add fallback for this format to have it always supported. L6V5U5 conversion rule apparently differs a bit from the normal spec, and thus the gallium equivalent format leads to slightly wrong colors. Since some recent cards do not support it, do not support it either. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	258ca1823c	st/nine: Add format fallback with conversion to volumes Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	755fbcdf24	st/nine: Add format fallback with conversion to surfaces Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	52cb8e33c3	gallium/util: Implement util_format_translate_3d This is the equivalent of util_format_translate, but for volumes. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	89344a80fc	st/nine: Fix Pointsize in programmable shader Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	ae0fdd8a40	st/nine: Fix ff pointscale computation Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	c4af309973	st/nine: Fix header of GetIndices There is a mistake in the online documentation, the function only has 2 arguments. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	3e9d01ff39	st/nine: Increase minor d3dadapter9drm ABI Version 0.1 allows to assume that the second element of the IDirect3D* structures will be a pointer to the internal nine vtable. This is useful if the gallium nine user wants to wrap some interfaces. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	2d51c817cd	st/nine: Fix leak after ctor failures Previously ctor failures would not unreference the device. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	7fc8391d23	st/nine: Add ColorFill test for compressed textures ColorFill should contain alignment checks for compressed textures. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	d11d913987	st/nine: PositionT and Tessfactor are forbidden as PS input According to wine tests, they are forbidden as PS input, which makes sense. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	44068af92e	st/nine: Fix some shader failures not triggering error Some failures during shader translation would not raise errors before this patch. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	a77d8cd710	st/nine: Forbid POSITION0 for PS3.0 POSITION0 input is forbidden for PS3.0 apparently. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	217d969746	st/nine: Rework UpdateTexture Checks Our code did match the user documentation of the function quite well (except for format check). However the DDI documentation and wine tests show that documentation was not correct. Thus adapt our code to fit the best possible to the -real- spec. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	4c77673de7	st/nine: Use bufs instead of Flags for Clear bufs doesn't contain depthstencil if there is z buffer mismatch. This is the behaviour we want. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	f7c3d27d18	d3dadapter9: Add ddebug, rbug and trace support Add support for ddebug, rbug and trace Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Axel Davy	0ae3c8ece7	radeon: Change AA sample locations for EG+ This sets the AA location to the d3d11 spec. EG/NI 8X MSAA is left as is. Not sure why it was set differently from Cayman, so leave it as is. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	11e4987135	radeonsi: Mixed colorbuffer formats are unsupported Besides depth/stencil, the hardware doesn't support mixed formats. The GL state tracker doesn't make use of them. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	fc3533c088	radeonsi: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	a221f40dbb	r600g: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Axel Davy	7e05e4c388	r600: Change default behaviour for undefined COLOR0 d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 23:37:14 +02:00
Christian Schmidbauer	f5d6ed5702	st/nine: Clean up WINAPI definition As Emil pointed out, only gcc, clang and MSVC compatibility is required. Hence the check for GNUC can be skipped, as __i386__ and __x86_64__ are only defined for gcc/clang, not for MSVC. Remove the #undef which has been there for historic reasons, when wine dlls for nine have been built inside mesa. Instead use #ifndef in order to avoid redefining WINAPI from MSVC's headers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Axel Davy <axel.davy@ens.fr>	2016-05-18 23:37:14 +02:00
Brian Paul	243fd02858	svga: add another debug_printf() in svga_screen_create() Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-18 14:58:35 -06:00
Brian Paul	96909ef128	spirv: add switch case for nir_texop_txf_ms_mcs in vtn_handle_texture() Mark it as unreachable. Silences a compiler warning: spirv/spirv_to_nir.c:1397:4: warning: enumeration value 'nir_texop_txf_ms_mcs' not handled in switch [-Wswitch] switch (instr->op) { ^ Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-05-18 14:57:45 -06:00
Matt Turner	9c290b1e54	Revert "i965/urb: fixes division by zero" This reverts commit `2a8aa1e3de`.	2016-05-18 12:48:50 -07:00
Ardinartsev Nikita	2a8aa1e3de	i965/urb: fixes division by zero Fixes regression introduced by `af5ca43f26` Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419	2016-05-18 11:09:37 -07:00
Matt Turner	caab3cd536	mesa: fclose() filename on error. Pretty useless, as it's in debugging code. Found by Coverity (CID 1257016).	2016-05-18 11:09:37 -07:00
Matt Turner	cbb0e3a7e8	i965/fs: Assert that nir_op_extract_*'s src1 is a constant.	2016-05-18 11:09:37 -07:00
Matt Turner	6a4ff51f7a	glsl: Check that layout is non-null before dereferencing. layout should only be null for structs, but it's checked everywhere else and confuses Coverity (CID 1358495). Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 11:09:37 -07:00
Matt Turner	53f64a8404	egl/dri2: Don't check return result of mtx_unlock(). Coverity (CID 1358496) warns that the cleanup code doesn't unlock the mutex (which is arguably kind of stupid, since the only case that can happen is when mtx_unlock() failed!). But, mtx_unlock() isn't going to fail -- the mutex was locked by this thread just a few lines above it.	2016-05-18 11:09:37 -07:00
Matt Turner	b1e6d069da	spirv: Properly size the src[] array. Operations like nir_op_bitfield_insert have four arguments, and Coverity isn't privy to the fact that 4-argument operations aren't possible here, so it thinks this can lead to memory corruption. Just increase the size of the array to quell any fears.	2016-05-18 11:09:37 -07:00
Matt Turner	0a548eb56f	isl: Mark default cases in switch unreachable. To silence -Wmaybe-uninitialized warnings.	2016-05-18 11:09:37 -07:00
Ian Romanick	7619aed41d	glsl/linker: Ensure the first stage of an SSO pipeline has input locs assigned Previously an SSO pipeline containing only a tessellation control shader and a tessellation evaluation shader would not get locations assigned for the TCS inputs. This would lead to assertion failures in some piglit tests, such as arb_program_interface_query-resource-query. That piglit test still fails on some tessellation related subtests. Specifically, these subtests fail: 'GL_PROGRAM_INPUT(tcs) active resources' expected 2 but got 3 'GL_PROGRAM_INPUT(tcs) max length name' expected 12 but got 16 'GL_PROGRAM_INPUT(tcs,tes) active resources' expected 2 but got 3 'GL_PROGRAM_INPUT(tcs,tes) max length name' expected 12 but got 16 'GL_PROGRAM_OUTPUT(tcs) active resources' expected 15 but got 3 'GL_PROGRAM_OUTPUT(tcs) max length name' expected 23 but got 12 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-18 10:53:50 -07:00
Ian Romanick	79bbff9def	glsl/linker: Don't include interface name for built-in blocks Commit `11096ec` introduced a regression in some piglit tests (e.g., arb_program_interface_query-resource-query). I did not notice this regression because other (unrelated) problems caused failed assertions in those same tests on my system... so they crashed before getting to the new failure. v2: Use is_gl_identifier. Suggested by Tim. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-05-18 10:53:34 -07:00
Ian Romanick	2ef4b5bc93	glsl: Assert that inputs have a location assigned This catches a problem previously undetected until deep in the backend. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	cf9220b11f	glsl/linker: Fix trivial typos in comments Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	d2579728c9	glsl/linker: Fix some formatting to match current coding conventions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	02e4753777	glsl/linker: Silence unused parameter warning The use of the parameter was removed in `d6b92028`. glsl/link_varyings.cpp:1390:39: warning: unused parameter ‘separate_shader’ [-Wunused-parameter] bool separate_shader) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	75c9aa6670	glsl/linker: Silence unused parameter warning The parameter appears to have been unused since the function was added in commit `12ba6cfb`. Remove it. glsl/linker.cpp:2886:60: warning: unused parameter ‘prog’ [-Wunused-parameter] match_explicit_outputs_to_inputs(struct gl_shader_program *prog, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Ian Romanick	f687b8e178	i965: Silence unused parameter warnings The only place that actually used the type parameter was the GS visitor, and it was always passed glsl_type::int. Just remove the parameter. brw_vec4_vs_visitor.cpp:38:61: warning: unused parameter ‘type’ [-Wunused-parameter] const glsl_type *type) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-18 10:53:34 -07:00
Daniel Scharrer	1d628ea09d	mesa: Don't advertise GLES 3.1 without compute support The MaxComputeWorkGroupInvocations constant is used in compute_version_es2() instead of extensions->ARB_compute_shader as ES has lower requirements than desktop GL. Both i965 and gallium set this constant before enabling compute support. Signed-off-by: Daniel Scharrer <daniel@constexpr.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-05-18 18:21:21 +02:00
Rob Clark	5827a1dc4b	mesa/st: don't leak name Pointed out by coverity. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-18 09:20:22 -04:00
Brian Paul	877a8026c7	svga: null out all sampler views if start=num=0 Because the CSO module handles sampler views for fragment shaders differently than vertex/geom shaders, VS/GS shader sampler views aren't explicitly unbound like for FS sampler views. This code checks for the case of start=num=0 and nulls out the sampler views. Fixes an assert regression in piglit's arb_texture_multisample-sample-position test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-17 19:20:36 -06:00
Brian Paul	fe430b0310	st/mesa: remove unused st_context::default_texture The code which used this was removed quite a while ago. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-17 19:20:36 -06:00
Brian Paul	5888c47cc9	cso: remove / add some comments Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-17 19:20:36 -06:00
Eric Anholt	18260d0582	vc4: Add support for vertex color clamping in the rasterizer. This gets us precompile of vertex shaders at the state tracker level as well.	2016-05-17 18:09:58 -07:00
Eric Anholt	474e2bbcc1	vc4: Move tgsi_to_nir to precompile time. Now we have an immutable nir shader in our shader's CSO that we can clone and lower/optimize.	2016-05-17 18:07:39 -07:00
Eric Anholt	734fe41092	vc4: Mark the driver as supporting fragment color clamping in rast. We always clamp fragment colors, since they're always 8-bit unorm, so there's no need to have us compile separate shaders based on GL_ARB_color_buffer_float. This gives us precompilation of fragment programs to the vc4_shader_state_create() level.	2016-05-17 18:07:39 -07:00
Eric Anholt	8835eb689b	vc4: Enable sharing shaders across contexts. This allows the same pipe_shader_state to be referenced from multiple contexts. Since our pipe_shader_state is treated as immutable (other than the variant number) within the driver, this is no problem.	2016-05-17 18:07:39 -07:00
Eric Anholt	62087cb9b8	vc4: Switch to using nir_load_front_face. This will be generated by glsl_to_nir, and it turns out that this is a more code-efficient path than the floating point math, anyway. No change on shader-db, but drops an instruction in piglit's glsl-fs-frontfacing.	2016-05-17 18:07:39 -07:00
Eric Anholt	0700e4c0c7	vc4: Drop the dead export_linkage array. This came from deriving from freedreno.	2016-05-17 18:07:39 -07:00
Eric Anholt	24e7e3d3fc	vc4: Fix a -Wformat-security warning. This is apparently enabled as an error in Android builds, and the compiler can't tell that the return value is safe.	2016-05-17 18:07:39 -07:00
Alex Deucher	86f51d7958	radeonsi: add new polaris11 pci ids Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-17 17:49:50 -04:00
Alex Deucher	768320b497	radeonsi: add new polaris10 pci ids Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-17 17:49:50 -04:00
Kenneth Graunke	dc657a8201	i965: Make brw_reg_from_fs_reg() halve exec_size when compressed. In `a5d7e144ea`, Connor generalized the exec_size halving code to handle more cases. As part of this, he made it not halve anything if the region accessed falls completely in a single register. Unfortunately, it started producing some invalid regions: -add(16) g6<1>F g10<8,8,1>UW -g1<0,1,0>F { align1 compr }; -add(16) g8<1>F g12<8,8,1>UW -g1.1<0,1,0>F { align1 compr }; +add(16) g6<1>F g10<16,16,1>UW -g1<0,1,0>F { align1 compr }; +add(16) g8<1>F g12<16,16,1>UW -g1.1<0,1,0>F { align1 compr }; Here, the UW source region completely fits within a register. However, we have to use instruction compression because the destination region spans two registers. <16,16,1> is invalid because it's compressed. To handle this, skip the "everything fits in one register" case and fall through to the exec_size halving case when compressed. Fixes hundreds of Piglit regressions on GM965. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-17 14:40:37 -07:00
Kenneth Graunke	062ad81669	i965: Move compression decisions before brw_reg_from_fs_reg(). brw_reg_from_fs_reg() needs to know whether the instruction will be compressed or not. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95370 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-17 14:40:31 -07:00
Kenneth Graunke	9a1936d965	i965: Enable ES 3.2 sample shading extensions. This enables: - GL_OES_sample_shading - GL_OES_sample_variables - GL_OES_shader_multisample_interpolation On Gen8, we pass all the CTS tests, and all but 4 of the dEQP-GLES31 tests (dealing with 1x/2x MSAA at half rate sampling). We believe those 4 dEQP-GLES31 tests are incorrect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-17 14:27:29 -07:00
Jordan Justen	1ff212bfd3	anv: Fix warning: unused variable ‘cs_prog_data’ This was introduced in `8a80af2820`. Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-17 14:09:56 -07:00
Mauro Rossi	0e81336550	android: fix building error in libmesa_st_mesa Fixes the following building error due to libmesa_nir dependency: In file included from external/mesa/src/mesa/state_tracker/st_glsl_to_nir.cpp:44:0: external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory #include "nir_opcodes.h" ^ compilation terminated. build/core/binary.mk:706: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o' failed make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_st_mesa_intermediates/state_tracker/st_glsl_to_nir.o] Error 1 make: *** Waiting for unfinished jobs.... Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-17 17:07:28 -04:00
Nicolai Hähnle	941756f092	radeonsi: force level zero on image instructions in non-fragment shaders (v2) Section 8.9 (Texture Functions) of the OpenGL Shading Language 4.5 specification: However, automatic level of detail is computed only for fragment shaders. Other shaders operate as though the base level of detail were computed as zero. and Section 8.9.3 (Texture Gather Functions): When performing a texture gather operation, the minification and magnification filters are ignored, and the rules for LINEAR filtering in the OpenGL Specification are applied to the base level of the texture image to identify the four texels i_0 j_1, i_1 j_1, i_1 j_0, and i_0 j_0. Of course, explicit LOD or derivative variants work in all shader types. This fixes several GL4x-CTS.texture_gather.* tests. v2: TG4 is always level zero (thanks, Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	988fd6c922	radeonsi: emit TXQ in separate functions TXQ is sufficiently different that having it in the same code path as texture sampling/fetching opcodes doesn't make much sense. v2: guard against NULL pointer dereferences Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	d464bfd12a	winsys/amdgpu: cleanup error handling in amdgpu_ctx_create Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:40 -05:00
Nicolai Hähnle	fef08af99c	winsys/amdgpu: avoid ioctl call when fence_wait is called without timeout When user fences are used, we don't need the kernel for polling. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	0558564200	gallium/radeon: add radeon_emitted to check for non-trivial IBs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	5e89b027b9	gallium/radeon: use radeon_emit_array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:39 -05:00
Nicolai Hähnle	c23273532e	gallium/radeon: use radeon_emit Mostly generated using a sed-script, with manual fix-up for multi-line statements. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:38 -05:00
Nicolai Hähnle	4ac555e9e5	st/mesa: fix reversed copyimage canonical format The format_desc swizzle describes where in the array each color channel comes from - but the existing code was written as if each entry in the swizzle described the meaning of an array element. Fixes piglit's arb_copy_image-format-swizzle. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 15:28:38 -05:00
Jordan Justen	6c9f35bb73	Revert "HACK: Don't re-configure L3$ in render stages pre-BDW" This reverts commit `41af9b2e51`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94468 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	8a80af2820	anv: Port L3 cache programming from i965 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	aa41de080d	anv/gen7: Add memory barrier to vkCmdWaitEvents call We also have this barrier call for gen8 vkCmdWaitEvents. We don't implement waiting on events for gen7 yet, but this barrier at least helps to not regress CTS cases when data caching is enabled. Without this, the tests would intermittently report a failure when the data cache was enabled. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	8ee31828c6	anv: Keep track of whether the data cache should be enabled in L3 If images or shader buffers are used, we will enable the data cache in the L3 config. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jordan Justen	ff41738871	genxml/hsw: Add L3 cache control registers These were added to the i965 driver in `5912da45a6`. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-17 13:04:03 -07:00
Jan Vesely	47b390fe45	Treewide: Remove Elements() macro Signed-off-by: Jan Vesely <jano.vesely@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-17 15:28:04 -04:00
Jan Vesely	322cd2457c	r600g,sb: Don't use standard macro name Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-05-17 15:28:03 -04:00
Jason Ekstrand	b6c4d46a58	anv/formats: Add support for VK_FORMAT_B4G4R4A4_UNORM pre-gen8	2016-05-17 12:17:22 -07:00
Jason Ekstrand	45c93384e5	anv: Add a devinfo argument to the get_format functions	2016-05-17 12:17:22 -07:00
Jason Ekstrand	100db3d31c	anv/formats: Set the swizzle to RGB1 when using an RGBA format to fake RGB This way we get correct sampling from RGB formats that are faked as RGBA. This should also cause it to disable rendering and blending on those formats. We should be able to render to them and, on Broadwell and above, we can blend on them with work-arounds. However, we'll add support for that more properly later when it's deemed useful. For now, disabling rendering and blending should be safe.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	ce375fba41	anv/formats: Refactor anv_get_format The new code removes the switch statement and instead handles depth/stencil as up-front special cases. This allows for potentially more complicated color format handling in the future.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	34198d798c	anv: Use 16 bits for the isl_format in anv_format This way the entire anv_format structure fits in 32 bits	2016-05-17 12:17:22 -07:00
Jason Ekstrand	7cae59012d	anv/formats: Use the isl_channel_select enum for the swizzle	2016-05-17 12:17:22 -07:00
Jason Ekstrand	8ed429a4f0	anv/formats: Add an anv_get_format helper This commit removes anv_format_for_vk_format and adds an anv_get_format helper. The anv_get_format helper returns the anv_format by-value. Unlike anv_format_for_vk_format the format returned by anv_get_format is 100% accurate and includes any tweaks needed for tiled vs. linear. anv_get_isl_format is now just a wrapper around anv_get_format that picks off just the isl_format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	13f5cee663	anv/format: Simplify anv_format Now that we have VkFormat introspection and we've removed everything that tried to use anv_format for introspection, we no longer need most of what was in anv_format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	c1c004e5b2	anv/formats: Delete validate_GetPhysicalDeviceFormatProperties All it ever did was some extra logging that was useful when initially bringing up Dota2. We don't need it anymore.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	aad56f3ee7	anv/image: Use aspects for computing full usage	2016-05-17 12:17:22 -07:00
Jason Ekstrand	fbc23d93e0	anv: Remove the anv_format member from anv_image	2016-05-17 12:17:22 -07:00
Jason Ekstrand	be94a23b44	anv/wsi: Use vk_format_info for asserts rather than anv_format	2016-05-17 12:17:22 -07:00
Jason Ekstrand	63dbb2c60a	anv/copy: Use the linear format from the image for the buffer block size Because the buffer is exposed to the user, the block size is defined to always exactly be the size of the actual Vulkan format. This is the same size (it had better be) as the linear image format.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	c87429c5f1	anv/image: Stop using anv_format for image create validation	2016-05-17 12:17:22 -07:00
Jason Ekstrand	990a7420b6	anv/image: Make heavier use of aspects	2016-05-17 12:17:22 -07:00
Jason Ekstrand	369b8bf402	anv/copy: Use the color_surf from the image to get the block size	2016-05-17 12:17:22 -07:00
Jason Ekstrand	9102e88364	anv: Change render_pass_attachment.format to a VkFormat	2016-05-17 12:17:22 -07:00
Jason Ekstrand	ffc502ce0c	anv: Add helpers to provide simple VkFormat introspection As much as I hate adding yet more format introspection, there are times when the VkFormat is sufficient and we don't want to round-trip through isl_format. For these times, the new vk_format_info.c/h files provide some simple driver-agnostic VkFormat introspection. This is intended to be specific to Vulkan but not to any driver whatsoever.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	97ba402cc3	anv/image: Use get_isl_format when creating buffer views	2016-05-17 12:17:22 -07:00
Jason Ekstrand	234ecf26c6	anv/image: Add an aspects field This makes several checks easier and allows us to avoid calling anv_format_for_vk_format in a number of cases.	2016-05-17 12:17:22 -07:00
Jason Ekstrand	1bda8d06e5	anv: Make format_for_descriptor return an isl_format	2016-05-17 12:17:22 -07:00
Jason Ekstrand	263a8cb52d	anv/wayland: Don't allow non-renderable formats	2016-05-17 12:17:22 -07:00
Jason Ekstrand	eb6baa3174	anv/wsi: Make WSI per-physical-device rather than per-instance This better maps to the Vulkan object model and also allows WSI to at least know the hardware generation which is useful for format checks.	2016-05-17 12:17:22 -07:00
Adam Jackson	2ad9d6237a	glapi/gen: Copy some GL 1.0 enum details into ARB_viewport_array Otherwise the instances in the extension XML override the core definitions, and we stop knowing their sizes in indirect_size_get.c Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	f4983b194d	glapi: Define PURE for Sun Studio as well Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	f1dd8dd6b6	glapi/glx: Mark byteswap functions as _X_UNUSED (v2) Squashes the one remaining warning in the xserver build. v2: Also clean up some non-standard whitespace (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	ea08a5bcf6	glapi: Harden GLX request size processing (v2) v2: Use == not is for equality testing (Dylan Baker) Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	88cfc9ddaa	glapi: Add the safe_{add,mul,pad} functions from xserver We're about to update the generator scripts to use these, easier not to vary between client and server. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Adam Jackson	7bc5c7f586	glapi: Fix whitespace droppings when printing the license header Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-05-17 15:04:56 -04:00
Rob Clark	1e93b0caa1	mesa/st: add support for NIR as possible driver IR Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net>	2016-05-17 14:22:46 -04:00
Rob Clark	2bbb140be3	mesa/st: move things around a bit in st_create_fp_variant() Prep work for next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-17 14:22:46 -04:00
Rob Clark	8f9a46dccb	mesa/st: add nir pass for lowering builtin uniforms Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-17 14:22:46 -04:00
Emil Velikov	52addd90d1	scons: gallium: link against nir as needed ... otherwise we'll produce incomplete binaries with the introduction of NIR as an alternative IR in the next commits. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-17 14:22:46 -04:00
Jason Ekstrand	265487aedf	i965/fs: Add an allow_spilling flag to brw_compile_fs This allows us to disable spilling for blorp shaders since blorp state setup doesn't handle spilling. Without this, blorp fails hard if you run with INTEL_DEBUG=spill. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Francisco Jerez <currojerez@riseup.net>	2016-05-17 10:20:11 -07:00
Ilia Mirkin	dd4b44efc0	nvc0/ir: fix shared atomic lowering to preserve shared memory location We were always doing atomics on shared memory location 0 instead of the originally supplied location. Make sure to pass through the original symbol and any indirection. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable@lists.freedesktop.org # note: expect minor conflict	2016-05-17 11:22:01 -04:00
Rob Clark	b65bd3dee5	freedreno/ir3: fix compiler warning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-17 10:05:20 -04:00
Rob Clark	e8beffb1b3	nir/validate: dump annotated shader with error msgs Log all the errors, and at the end dump the shader w/ error annotations to make it easier to see where the problems are. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Rob Clark	54ecfcc162	nir/validate: assert() -> validate_assert() Prep work for next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Rob Clark	a0ef26c1c2	nir/print: add support for print annotations Caller can pass a hashtable mapping NIR object (currently instr or var, but I guess others could be added as needed) to annotation msg to print inline with the shader dump. As the annotation msg is printed, it is removed from the hashtable to give the caller a way to know about any unassociated msgs. This is used in the next patch, for nir_validate to try to associate error msgs to nir_print dump. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-05-17 10:05:20 -04:00
Alejandro Piñeiro	e5e412cd27	i965: Expose OpenGL 4.2 for gen8+ ARB_vertex_attrib_64bit was the only feature missing. v2: we can expose 4.2 instead of 4.1 (Ian Romanick) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Alejandro Piñeiro	f051eae25a	docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+ v2: label as done for i965/gen8+ instead of i965 (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Alejandro Piñeiro	59b5441fd9	i965: Enable ARB_vertex_attrib_64bit for gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	d6281a9d95	i965: take care of doubles when lowering VS inputs Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	7ea09511ca	i965/fs: calculate first non-payload GRF using attrib slots When computing where the first non-payload GRF starts, we can't rely on the number of attributes, as each attribute can be using 1 or 2 slots depending on whether they are a dvec3/4 or other. Instead, we need to use the number of slots used by the attributes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	b7423b485e	i965/vec4: use attribute slots to calculate URB read length Do not use total attributes because a dvec3/dvec4 attribute requires two slots. So rather use total attribute slots. v2: do not use loop to calculate required attribute slots (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:55 +02:00
Juan A. Suarez Romero	b0fb08e179	i965: take care of doubles when remapping VS attributes Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero	80535873bb	nir: add double input bitmap This bitmap tracks which input attributes are double-precision. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:54 +02:00
Juan A. Suarez Romero	ccfe25f758	i965/fs: shuffle 32bits into 64bits for doubles VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. v2: * Extra blank line. Add { } on if body (Ian Romanick) * Use dest directly (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 09:05:47 +02:00
Alejandro Piñeiro	96c276dda9	i965/fs: half exec_size when dealing with 64 bits attributes The HW has a restriction that only vertical stride may cross register boundaries. Until now this was only handled on VGRFs at brw_reg_from_fs_reg, but it is also needed for attributes. v2: * Remove reference to commit id on commit message (Juan Suarez) * Simplify code that compute final exec_size (Ian Romanick) * Use REG_SIZE on that same code (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Alejandro Piñeiro	1ff32ae8b2	i965: passthru formats cannot be used with edge flag enabled Add an assertion to detect this case. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Antia Puentes	8b0a334b5e	i965: Configure how to store 64PASSTHRU vertex components From the Broadwell specification, structure VERTEX_ELEMENT_STATE description: "When SourceElementFormat is set to one of the 64_PASSTHRU formats, 64-bit components are stored in the URB without any conversion. In this case, vertex elements must be written as 128 or 256 bits, with VFCOMP_STORE_0 being used to pad the output as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into the URB, Component 1 must be specified as VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element." Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for dvec3 and dvec4 vertex elements. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Alejandro Piñeiro	71150b73c8	i965: get the proper vertex surface type for doubles on gen8+ This commit adds support for PASSTHRU format when pushing double-precision attributes. Check glarray->Doubles in order to know if we should choose a format that does a conversion to float, or just passthru the 64-bit double. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-17 07:34:40 +02:00
Ilia Mirkin	b1d74e9486	nvc0/ir: make sure out-of-bounds buffer loads/atomics get a 0 result Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-17 01:27:29 -04:00
Timothy Arceri	4fb4fd0b6b	glsl: make reserved_varying_slot() static Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:39 +10:00
Timothy Arceri	1d752823af	glsl: include per-patch varyings when generating reserved slot bitfield Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:27 +10:00
Timothy Arceri	00441829e7	glsl: don't incorrectly eliminate patches with explicit locations These varyings have a separate location domain from per-vertex varyings and need to be handled separately. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:21 +10:00
Timothy Arceri	3f477f0ea5	glsl: remove remaining tabs in link_varyings.cpp Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:16 +10:00
Timothy Arceri	6d5f7557fb	glsl: fix location and component packing validation on patches These varyings have a separate location domain from per-vertex varyings and need to be handled separately. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-17 15:06:12 +10:00
Kenneth Graunke	aae0865dc0	i965: Enable ARB_shader_precision on Gen8+. I recently fixed a bug in the Piglit tests: https://lists.freedesktop.org/archives/piglit/2016-May/019802.html With that patch in place, we pass all the tests. So, turn it on. We could probably expose this earlier than Gen8, but the extension says that OpenGL 4.0 is required, and all of our tests are written against GLSL 4.00 (which is only supported on Gen8+). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 17:52:45 -07:00
Jose Fonseca	cf010de6ee	vl/dri: Move the DRI3 check out of sources include into C. Fixes SCons build. Trivial. Built locally with SCons and autotools.	2016-05-16 21:50:43 +01:00
Leo Liu	5e2072c711	st/vdpau: add dri3 support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	c122c74dca	vl/dri3: implement functions for get and set timestamp Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	9f50a79b8f	vl/dri3: handle PresentCompleteNotify event and get timestamp calculated based on the event's reply Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	e8282178ab	st/va: add dri3 support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	8d7ac0a4e4	vl/dri3: implement DRI3 BufferFromPixmap We also need to render to the front buffer of a temporary X pixmap; this is the case when OpenGL is used as video output for VA-API. The basic implementation is to pass the pixmap ID to the X server, which returns a dma-buf fd; we then get the buffer object through this dma-buf fd. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	858b329c2c	vl/dri3: add support for resizing When the drawable size changes, a PresentConfigureNotify event is emitted; handle the event by re-allocating the resized buffer. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	96580ad593	vl/dri3: implement function for get dirty area This will clear presentation area not covered by video content Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	b0bd908284	vl/dri3: implement function for flush frontbuffer Request drawable content in pixmap by calling DRI3 PresentPixmap, and handle PresentIdleNotify event. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	e1223282db	vl/dri3: add back buffers support This implements DRI3 PixmapFromBuffer. Create buffer objects, and associate it to a dma-buf fd, and then pass this fd with a pixmap ID to X server for creating pixmap object; also add a function for wait events. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	69ba9be4d2	vl/dri3: implement flushing for queued events also place holder for present events handling Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	758b1bbaa7	vl/dri3: register present events Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	672e8d5e7e	vl/dri3: set drawable geometry Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Leo Liu	12e5220e34	vl/dri3: add DRI3 support and implement create and destroy Required functions into place for implementation, create screen with device fd returned from X server, also bail out to DRI2 with certain conditions. v2: -organize the error out path (Axel) -squash previous patch 1 and 2 into one (Emil) Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-05-16 16:28:51 -04:00
Dave Airlie	30e437bd76	mesa/version.c: enable cull distance in version check. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-17 06:08:31 +10:00
Ian Romanick	11096ecc39	glsl/linker: Include the interface name for input and output blocks On my oes_shader_io_blocks branch, this fixes 71 dEQP-GLES31.functional.program_interface_query.* tests. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-05-16 11:18:03 -07:00
Ian Romanick	7c11589eb4	glsl/linker: Use canonical format for ARB_program_interface_query spec quotes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:18:03 -07:00
Mark Janes	fd854c1add	i965: check tcs for NULL dereference Coverity issue 1361544 found an instance where the tcs variable is checked for NULL, but unconditionally dereferenced later in the same function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:11:11 -07:00
Matt Turner	bf91034d44	i965: Mark is_lossless_compressed_aux UNUSED to silence warning. Used only in assert().	2016-05-16 11:08:55 -07:00
Matt Turner	1385018a72	genxml: Use llroundf() and store to appropriate type. Both functions return uint64_t, so I expect the masking/shifting should be done on 64-bit types. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-16 11:06:15 -07:00
Matt Turner	4191551262	nir: Mark nir_start_block()/nir_impl_last_block() with returns_nonnull. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:06:15 -07:00
Matt Turner	377ab2f2d7	util: Add ATTRIBUTE_RETURNS_NONNULL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 11:06:15 -07:00
Jan Vesely	40c6d54e76	clover: grid_offset should be padded with 0 not 1 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 13:58:14 -04:00
Iago Toral Quiroga	71465179fc	i965: Expose OpenGL 4.0 for gen8+ ARB_gpu_shader_fp64 was the only feature missing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:34 +02:00
Iago Toral Quiroga	b1d21e1159	docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	309d285c6b	i965: Enable ARB_gpu_shader_fp64 for gen8+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	58f304defe	i965/tes/scalar: Fix load input for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	61197b8d5d	i965/tcs/scalar: fix store output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	cda3435ea8	i965/tcs/scalar: fix load input for doubles v2: do not write to the original indirect_offset since that is an expression that could be used somewhere else (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	66192b3c16	i965/fs: fix nir_intrinsic_store_output for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	3cce67aff0	i965/fs: fix number of output components for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	0297f1021a	i965/vec4: handle doubles in type_size_vec4() The scalar backend uses this to check URB input sizes. v2: Removed redundant break after return (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	8c6d147373	i965/fs: support doubles with shared variable stores This is pretty much the same we do with SSBOs. v2: do not shuffle in-place, it is not safe since the original 64-bit data could be used after the write, instead use a temporary like we do for SSBO stores (Iago) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	943f9442bf	i965/fs: support doubles with ssbo stores Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	b9aa66aa51	i965/fs: add shuffle_64bit_data_for_32bit_write helper This does the inverse operation of shuffle_32bit_load_result_to_64bit_data and we will use it when we need to write 64-bit data in the layout expected by untyped write messages. v2 (curro): - Use subscript() instead of stride() - Assert on the input types rather than silently retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Drop the temporary vgrf and force_writemask_all. - Make component_i const. - Move to brw_fs_nir.cpp v3 (curro): - Pass dst and src by reference. - Simplify allocation of tmp register. - Move to brw_fs_nir.cpp. - Get rid of the temporary. v3 (Iago): - Check that the src and dst regions do not overlap, since that would typically be a bug in the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	33f7ec18ac	i965/fs: support doubles with SSBO loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	8aa01ac596	i965/fs: support doubles with shared variable loads Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	6eab06b866	i965/fs: Add do_untyped_vector_read helper We are going to need the same logic for anything that reads doubles via untyped messages (CS shared variables and SSBOs). Add a helper function with that logic so that we can reuse it. v2: - Make this a static function instead of a method of fs_visitor (Iago) - We only support types with a size of 4 or 8 (Curro) - Avoid retypes by using a separate vgrf for the packed result (Curro) - Put dst parameter before source parameters (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	b86d4780ed	i965/fs: support doubles with UBO loads UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD instruction, which reads 16 bytes (a vec4) of data from memory. For dvec types this only provides components x and y. Thus, if we are reading more than 2 components we need to issue a second load at offset+16 to read the next 16-byte chunk with components z and w. UBO loads with non-constant offset emit a load for each component in the vector (and rely on CSE to fix redundant loads), so we only need to consider the size of the data type when computing the offset of each element in a vector. v2 (Sam): - Adapt the code to use component() (Curro). v3 (Sam): - Use type_sz(dest.type) in VARYING_PULL_CONSTANT_LOAD() call (Curro). - Add asserts to ensure std140 vector alignment rules are followed (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	58f1804c4f	i965/fs: fix pull constant load component selection for doubles UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a constant offset that is 16-byte aligned. If we need to access an unaligned offset we emit a load with an aligned offset and use the remaining constant offset to select the component into the vec4 result that we are interested in. This component must be computed in units of the type size, since that is what fs_reg::set_smear expects. This patch does this change in the two places where we use this message: In demote_pull_constants when we lower uniform access with constant offset into the pull constant buffer and in UBO loads with constant offset. v2 (Sam): - Fix set_smear() in fs_visitor::lower_constant_loads(), take into account source type instead and remove MAX2 (Curro). - Improve changes to nir_intrinsic_load_ubo case in nir_emit_intrinsic() (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Francisco Jerez	71fd4942d1	i965/fs: Fix and document component(). This fixes a number of bugs of component() by reimplementing it in terms of horiz_offset(): Handling of base registers starting at a non-zero subreg_offset, handling of strided registers and overflow of subreg_offset into reg_offset. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	e209134f71	i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles v2 (Curro): - Assert on scale == 1 when shuffling 64-bit data. - Remove type_slots, use type_sz(vec4_result.type) instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Iago Toral Quiroga	50b7676dc4	i965/fs: add shuffle_32bit_load_result_to_64bit_data helper There will be a few places where we need to shuffle the result of a 32-bit load into valid 64-bit data, so extract this logic into a separate helper that we can reuse. v2 (Curro): - Use subscript() instead of stride() - Assert on the input types rather than retyping. - Use offset() instead of horiz_offset(), drop the multiplier definition. - Don't use force_writemask_all. - Mark component_i as const. - Make the function name lower case. v3 (Curro): - Pass src and dst by reference. - Move to brw_fs_nir.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:33 +02:00
Francisco Jerez	4d9c461e53	i965/fs: Stop using the LOAD_PAYLOAD instruction in lower_simd_width. Instead of using the LOAD_PAYLOAD instruction (emitted through the emit_transpose() helper that is no longer useful and this commit removes) which had to be marked force_writemask_all in some cases, emit a series of moves to apply proper channel enable signals to the destination. Until now lower_simd_width() had mainly been used to lower things that invariably had a basic block-local temporary as destination so it didn't seem like a big deal, but I found it to be the reason for several Piglit regressions in my SIMD32 branch and Igalia discovered the same issue independently while working on FP64 support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	9149fd6817	i965/fs: fix copy/constant propagation regioning checks We were not accounting for subreg_offset in the check for the start of the region. Also, fs_reg::regs_read() already takes the stride into account, so we should not multiply its result by the stride again. This was making copy-propagation fail to copy-propagate cases that would otherwise be safe to copy-propagate. Again, this was observed in fp64 code, since there we use stride > 1 often. v2 (Sam): - Rename function and add comment (Jason, Curro). - Assert that register files and number are the same (Jason). - Fix code to take into account the assumption that src.subreg_offset is strictly less than the reg_offset unit (Curro). - Don't pass the registers by value to the function, use 'const fs_reg &' instead (Curro). - Remove obsolete comment in the commit log (Curro). v3 (Sam): - Remove the assert and put the condition in the return (Curro). - Fix function name (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	789eecdb79	i965/fs: fix copy propagation from load payload We were not considering the case where the load payload is writing to a destination with a reg_offset > 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	cf375a3333	i965/fs: fix copy propagation of partially invalidated entries We were not invalidating entries with a src that reads more than one register when we find writes that overwrite any register read by entry->src after the first. This leads to incorrect copy propagation because we re-use entries from the ACP that have been partially invalidated. Same thing for entries with a dst that writes to more than one register. v2 (Sam): - Improve code by defining regions_overlap() and using it instead of a loop (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Francisco Jerez	ea1ef49a16	i965/fs: Reindent register offset calculation of try_copy_propagate(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Francisco Jerez	0fb19806c0	i965/fs: Simplify and fix register offset calculation of try_copy_propagate(). try_copy_propagate() was special-casing UNIFORM registers (the BAD_FILE, ARF and FIXED_GRF cases are dead, see the assertion at the top of the function) and then failing to take into account the possibility of the instruction reading from a non-zero offset of the destination of the copy. The VGRF/ATTR handling takes it into account correctly, and there is no reason we couldn't use the exact same logic for the UNIFORM file aside from the fact that uniforms represent reg_offset in different units. We can work around that easily by defining an additional constant with the right unit reg_offset is expressed in. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	7aa53cd725	i965/fs: disallow type change in copy-propagation if types have different sizes Because the semantics of source modifiers are type-dependent, the type of the original source of the copy must be kept unmodified while propagating it into some instruction, which implies that we need to have the guarantee that the meaning of the instruction is going to remain the same after we have changed the types. When the size of the new type is different from the size of the old type the new and old instructions cannot possibly be equivalent because the new instruction will be reading more data than the old one was. Prevents that we turn this: load_payload(8) vgrf17:DF, \|vgrf4+0.0\|:DF 1sthalf mov(8) vgrf18:DF, vgrf17:DF 1sthalf load_payload(8) vgrf5:DF, vgrf18:DF, vgrf20:DF NoMask 1sthalf WE_all load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf mov(8) vgrf22:UD, vgrf21:UD 1sthalf into: load_payload(8) vgrf17:DF, \|vgrf4+0.0\|:DF 1sthalf mov(8) vgrf18:DF, \|vgrf4+0.0\|:DF 1sthalf load_payload(8) vgrf5:DF, \|vgrf4+0.0\|:DF, \|vgrf4+2.0\|:DF NoMask 1sthalf WE_all load_payload(8) vgrf21:UD, vgrf5+0.4<2>:UD 1sthalf mov(8) vgrf22:DF, \|vgrf4+0.4\|<2>:DF 1sthalf where the semantics of the last instruction have changed. v2 (Curro): - Update commit log and add comment to explain the problem better. - Simplify the condition. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	ac9b966aac	i965/fs: Fix copy propagation of load payload for double operands Specifically, consider the size of the data type of the operand to compute the number of registers written. v2 (Sam): - Fix line width (Jordan). - Add an assert (Jordan). - Use REG_SIZE in the calculation of regs_written (Curro) v3 (Sam): - Fix assert and calculation of regs_written (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 09:55:32 +02:00
Francisco Jerez	70dc19f9d6	i965/fs: Fix propagation of copies with strided source. This has likely been broken since we started propagating copies not matching the offset of the instruction exactly (`1728e74957`). The copy source stride needs to be taken into account to find out the offset at the origin that corresponds to the offset at the destination of the copy which is being read by the instruction. This has led to program miscompilation on both my SIMD32 branch and Igalia's FP64 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Iago Toral Quiroga	17decd940c	i965/fs: fix subreg_offset overflow in byte_offset() This can happen if the register already has a non-zero subreg_offset when byte_offset() is called. v2 (Sam): - Refactor byte_offset() (Jordan). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-16 09:55:32 +02:00
Kenneth Graunke	2fd79ebe8f	i965: Fix JIP to skip over sibling do...while loops. We've apparently always been botching JIP for sequences such as: do cmp.f0.0 ... (+f0.0) break ... do ... while ... while Because the "do" instruction doesn't actually exist, the inner "while" is at the same depth as the "break". brw_find_next_block_end() thus mistook the inner "while" as the end of the loop containing the "break", and set the "break" to point to the wrong place. Only "while" instructions that jump before our instruction are relevant. We need to ignore the rest, as they're sibling control flow nodes (or children, but this was already handled by the depth == 0 check). See also commit `1ac1581f38`. This prevents channel masks from being screwed up, and fixes GPU hangs(*) in dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.centroid_qualified.multisample_texture_16. The test ended up executing code with no channels enabled, and that code contained FIND_LIVE_CHANNEL, which returned 8 (out of range for a SIMD8 program), which then was used in indirect GRF addressing, which randomly got a boolean value (0xFFFFFFFF), interpreted it as a sample ID, OR'd it into an indirect send message descriptor, which corrupted the message length, sending a pixel interpolator message with mlen 15, which is illegal. Whew :) (*) Technically, the test doesn't GPU hang currently, but only because another bug prevents it from issuing pixel interpolator messages entirely...with that fixed, it hangs. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 00:20:07 -07:00
Kenneth Graunke	2f02fad6b3	i965: Make a "does this while jump before our instruction?" helper. I need to use this in an additional place. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-16 00:19:53 -07:00
Kenneth Graunke	b6f250d7f2	i965: Send the minimal number of STATE_BASE_ADDRESS packets. STATE_BASE_ADDRESS stalls the whole pipeline, and the documentation cautions us to emit it as little as possible for better performance. We recently put some hacks in BLORP to try and avoid emitting it if it was already set correctly. However, this wasn't quite minimal: if BLORP is the first operation (i.e. glClear()), then it would emit it, and subsequent draw calls would emit it again. This caused a small drop in performance in GPUTest Triangle when switching from Meta to BLORP. Unlike most packets, STATE_BASE_ADDRESS isn't influenced by GL state: it needs to be emitted once per batch, before most other commands, or whenever we change the program cache BO. It's also valid in both the 3D and compute pipelines, which makes it even more unique. This patch removes it from the atom mechanism and instead directly calls it as part of every draw, compute dispatch, or BLORP operation. We introduce a new flag indicating that STATE_BASE_ADDRESS has already been emitted this batch, and if so, skip doing it again. When we make a new program cache BO, we simply reset the flag, so the next operation will emit it again. When we flush/reset the batch, we reset the flag. This guarantees that we'll emit STATE_BASE_ADDRESS only when we have to. It's also less code than the old atom mechanism. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:51 -07:00
Kenneth Graunke	97179c606c	i965: Combine Gen4-7 and Gen8+ state base address emitters. We're about to start calling it directly, and this means the callers won't have to think about generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:50 -07:00
Kenneth Graunke	7b70a12e1c	i965: Move Gen4-5 programs to brw_upload_programs() too. This way all the programs are in one place again, and it also should make some future STATE_BASE_ADDRESS related changes possible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:49 -07:00
Kenneth Graunke	b23b099a0b	i965: Mark brw const in brw_state_dirty and callers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-16 00:11:43 -07:00
Kenneth Graunke	8e71ac731b	glsl: Don't do constant propagation in opt_constant_folding. opt_constant_folding is supposed to fold trees of constants into a single constant. Surprisingly, it was also propagating constant values from variables into expression trees - even when the result couldn't be folded together. This is opt_constant_propagation's job. The ir_dereference_variable::constant_expression_value() method returns a clone of var->constant_value. So we would replace the dereference with a constant, propagating it into the tree. Skip over ir_dereference_variable to avoid this surprising behavior. However, add code to explicitly continue doing it in the constant propagation pass, as it's useful to do so. shader-db statistics on Broadwell: total instructions in shared programs: 8905349 -> 8905126 (-0.00%) instructions in affected programs: 30100 -> 29877 (-0.74%) helped: 93 HURT: 20 total cycles in shared programs: 71017030 -> 71015944 (-0.00%) cycles in affected programs: 132456 -> 131370 (-0.82%) helped: 54 HURT: 45 The only hurt programs are by a single instruction, while the helped ones are helped by 1-4 instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:39 -07:00
Kenneth Graunke	db8fcbbaf9	glsl: Avoid excess tree walking when folding ir_dereference_arrays. If an ir_dereference_array has non-constant components, there's no point in trying to evaluate its value (which involves walking down the tree and possibly allocating memory for portions of the subtree which are constant). This also removes convoluted tree walking in opt_constant_folding(), which tries to fold constants while walking up the tree. No need to walk down, then up, then down again. We did this for swizzles and expressions already, but I was lazy back in the day and didn't do this for ir_dereference_array. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:33 -07:00
Kenneth Graunke	329fe93210	glsl: Consolidate duplicate copies of constant folding. We could probably clean this up more (maybe make it a method), but at least there's only one copy of this code now, and that's a start. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:20 -07:00
Kenneth Graunke	3bf27a9a00	glsl: Remove bonus tree walking in opt_constant_folding(). It looks like this was missed when converting opt_constant_folding() from a hierarchical visitor to an rvalue visitor in `6606fde3`. ir_rvalue_visitor already processes values on the way back up the tree, so we will have already visited every child node. There's no point in doing it again. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:10 -07:00
Kenneth Graunke	8e59670bcf	glsl: Make opt_constant_variable() bail in useless cases. The pass ultimately skips over any entries with assignment_count != 1, so there's no need to do further work once we've determined that there are multiple assignments. The constant value could be a large array (i.e. uvec4[327]), at which point skipping the constant_expression_value() call (and the clone() call within) can save us piles of memory. No change in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-15 23:59:05 -07:00
Kenneth Graunke	c907ca6c8d	i965: Flip interpolateAtOffset's y offset when necessary. Fixes 4 dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_offset.no_qualifiers.default_framebuffer - interpolate_at_offset.centroid_qualifier.default_framebuffer - interpolate_at_offset.sample_qualifier.default_framebuffer - interpolate_at_offset.array_element.default_framebuffer Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 23:50:52 -07:00
Kenneth Graunke	6d65b0c6dc	nir: Add a nir->info.uses_interp_var_at_offset flag. I've added this to nir_gather_info(), but also to glsl_to_nir() as a temporary measure, since the i965 GL driver today doesn't use nir_gather_info() yet. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 23:50:28 -07:00
Kenneth Graunke	d4d7e1516b	glsl: Drop bad ASSERT_TRUE in gl_CullDistance link_varyings test. I don't know what the intention was here, but this function returns void. We can't assert anything about its return value. Fixes "make check" failures. v2: Also fix prototype for the function (caught by Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-15 23:49:19 -07:00
Jan Vesely	9525f33164	clover: Handle PIPE_SHADER_IR_NIR in switch Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-15 20:05:10 -04:00
Rob Clark	277818ecfb	freedreno/ir3: small standalone compiler cleanup Don't hard-code the gpu-id anymore. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	f06343d6ea	nir: forward-declare 'struct gl_shader_program' Drop extra #include which is otherwise unneeded (and makes this header difficult to include from outside of src/mesa). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-15 17:25:48 -04:00
Rob Clark	79d6409a14	nir: return progress from lower_idiv With algebraic-opt support for lowering div to shift, the driver would like to be able to run this pass after the main opt-loop, and then conditionally re-run the opt-loop if this pass actually lowered something. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-15 17:25:48 -04:00
Rob Clark	f8840f471d	freedreno/ir3: lower fdiv Not sure how we didn't hit this already, but since we want fdiv converted into mul + rcp, we should set this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	53cde5e295	freedreno/ir3: handle VARYING_SLOT_PNTC In the glsl->tgsi path, this already gets translated to VAR8, which matches up with rasterizer->sprite_coord_enable. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	2f1581059b	freedreno/ir3: disable TGSI specific hacks in nir case When we get NIR directly from the state tracker (vs using tgsi_to_nir) we need to realize this and skip some TGSI specific hacks. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:48 -04:00
Rob Clark	784086f3c1	freedreno/ir3: add support for NIR as preferred IR For now under debug flag, since only suitable for debugging/testing. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-15 17:25:47 -04:00
Rob Clark	8b24f7b440	nir: fix comment typo about f2d/d2f Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-15 17:25:47 -04:00
Ilia Mirkin	be2b13e3bf	nv50/ir: avoid asserts when the state tracker feeds us bogus inputs INTERP is defined (by me) to have an INPUT source. However the state tracker does not always obey this. This happens due to varying packing logic introducing additional mov's which can't always be undone. Instead of just giving up, we instead try harder to find the original input. This won't always be possible, for example with indirect accesses. There's not much we can (easily) do about that though. This fixes the remaining interpolateAt* failures in dEQP: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at* some of which were asserting due to INTERP_* being passed a non-input. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 14:12:56 -04:00
Ilia Mirkin	9323d084ac	nvc0: don't try to go through the push path for indirect draws This fixes dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute These tests were causing a const vbo to be set up, and were small enough draws that the logic was trying to go via the push path (which emits data directly into the cmd stream rather than uploading a user vbo). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 10:48:39 -04:00
Ilia Mirkin	2ef3cdb07e	nvc0/ir: make sure to align the second arg of TXD to 4, as we do for TEX This was handled in handleTEX(), however the way the logic works, those extra arguments aren't added on by then, so it did nothing. Instead we must duplicate that bit here. GK110 appears to complain about MISALIGNED_GPR, however it's reasonable to believe that GK104 has the same requirements. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-15 10:48:39 -04:00
Tobias Klausmann	8c02939794	nv50,nvc0: add support for cull distances Cull distances are just a special case of clip distances as far as the hardware is concerned. Make sure that the relevant "planes" are enabled, and flip the clip mode to cull for those. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: add enables on nvc0, add nv50 support] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-05-15 10:48:39 -04:00
Ilia Mirkin	2ad970ecf4	st/mesa: disable cull distance for now The pass that st/mesa relies on to combine clip and cull distances has been reverted, so we can't expose ARB_cull_distance until that is resolved. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-15 10:48:38 -04:00
Jason Ekstrand	09e041d61d	i965: Use blorp for all clears We used to use a meta path on gen8 but we haven't since `c7cf17ae75`. We might as well delete the meta path since blorp works on all gens. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	1cfb4bc890	i965: Use blorp for all stencil blits We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	64f2907030	i965: Use blorp for all updownsample blits We used to use a meta path because blorp didn't support 16x MSAA. Now it does, so we don't need the meta paths anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	f5febc83a7	i965/blorp: Add support for 16x MSAA Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	a32315bd19	i965: move brw_meta_set_fast_clear_color to brw_meta_util.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	36529f670f	i965: Move brw_meta_get_*_rect to brw_meta_util.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	21034f1b08	i965: Move brw_is_color_fast_clear_compatible to brw_meta_util Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	b05c68fc8a	i965: Move brw_get_rb_for_slice to brw_meta_util Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 14:18:21 -07:00
Jason Ekstrand	672cffee0f	i965/blorp: Get rid of the blorp_prog_data_init() helper The helper was initially created to allow us to set reasonable defaults as we mutated the brw_blorp_prog_data structure in preparation for NIR. Now that everything is going through brw_blorp_compile_nir_shader() which fully fills out the brw_blorp_prog_data structure, we don't need the helper. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:54 -07:00
Jason Ekstrand	c228ea8345	i965/blorp: Delete the old blorp shader emit code Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:54 -07:00
Jason Ekstrand	c18da26abf	i965/blorp: Stop doing f2i(i2f(sample_id)) NIR gets kind of awkward when you have a 3-component vector with two floats and one int. This led to us accidentally going through float for the sample index. It doesn't hurt anything but it also isn't needed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	e503da61c6	i965/blorp: Refactor coordinate munging The original code-flow tried to map original blorp. This puts things more where they belong and simplifies some of the logic. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	8636937dd6	i965/blorp: Add bilinear blending support to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6bd7bd6633	i965/blorp: Add support for averaging resolves to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	c7269c1551	i965/blorp: Add MSAA encode/decode support to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	df8c2936cd	i965/blorp: Add support for W-[de]tiling to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6adb8d6d3a	i965/blorp: Add support for discard-based bounds checks to the NIR path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	4bdace0791	i965/blorp: Add initial support for NIR-based blit shaders Many of the more complex cases still fall back to the old shader builder. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	b0275ad0c9	i965/blorp: Refactor getting the blit kernel into a helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	6df3d75206	i965/blorp: Use NIR for clear shaders Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95373 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	bb45f42f55	i965/blorp: Create the program key in get_clear_kernel There's no reason to be passing a whole struct around just for a single boolean. We can create it later when we actually need to use it as a key. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	c1fe8859d3	i965/blorp: Add a helper for compiling NIR shaders Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:53 -07:00
Jason Ekstrand	353eadb170	blorp: Add initial state setup support for SIMD8 dispatch Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	cd5a2905cf	i965/blorp: Add a param array to prog_data This array allows the push constants to be re-arranged on upload. The actual arrangement will, eventually, come from the back-end compiler. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	c46cbe19f4	i965/blorp: Add a prog_data_init helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-14 13:34:52 -07:00
Jason Ekstrand	50e5e1f747	i965/fs: Implement the new NIR MCS texturing Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:49 -07:00
Jason Ekstrand	f47faa4316	nir: Add texture opcodes and source types for multisample compression Intel hardware does a form of multisample compression that involves an auxiliary surface called the MCS. When an MCS is in use, you have to first sample from the MCS with a special opcode and then pass the result of that operation into the next sample instruction. Normally, we just do this ourselves in the back-end, but we want to expose that functionality to NIR so that we can use MCS values directly in NIR-based blorp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:44 -07:00
Jason Ekstrand	87a41e862b	nir/builder: Add a helper for grabbing multiple channels from an ssa def This is similar to nir_channel except that it lets you grab more than one channel by providing a mask. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:40 -07:00
Jason Ekstrand	fc58cb543f	nir/builder: Generate the alu helpers directly in python There's no reason for having a macro and a python generator. We can easily just do the whole thing in python. This has the advantage that we are no longer defining ALU# macros which conflict with the ones in brw_fs_builder.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:38 -07:00
Jason Ekstrand	a0e6e5f21f	i965/fs: Use MRF0 for the repclear message This is what BLORP does. Making them match cuts down on the noise when looking at AUB diffs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:33 -07:00
Jason Ekstrand	5a68df87da	i965/blorp: Simplify the sample layout calculation Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:30 -07:00
Jason Ekstrand	bee160b31b	i965/fs: Organize prog_data by ksp number rather than SIMD width The hardware packets organize kernel pointers and GRF start by slots that don't map directly to dispatch width. This means that all of the state setup code has to re-arrange the data from prog_data into these slots. This logic has been duplicated 4 times in the GL driver and one more time in the Vulkan driver. Let's just put it all in brw_fs.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:25 -07:00
Jason Ekstrand	7be100ac9a	i965/gen7_wm: Move where we set the fast clear op This better matches gen8 state setup Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:21 -07:00
Jason Ekstrand	1ec466d0ff	i965/fs: Stop setting dispatch_grf_start_reg from the visitor Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:18 -07:00
Jason Ekstrand	082768af30	i965/fs: Clean up the logic in compile_fs a bit Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:13 -07:00
Jason Ekstrand	b0f8768905	i965/state: Clean up WM/PS state to pull more things out of prog_data Now that we have a persample_shading bit in prog_data we can reduce the amount the state setup code needs to be looking at the GL state. In particular, it no longer pulls anything directly out of the gl_fragment_program and no longer depends on NEW_FRAGMENT_PROGRAM. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:10 -07:00
Jason Ekstrand	712a980add	i965/fs: Rework the persample shading key/prog_data bits This commit reworks and simplifies the way we handle persample shading in the shader key and prog_data. The previous approach had three different key bits that had slightly different and hard-to-discern meanings while the new bits are far more clear. This commit changes it to two easily understood bits that communicate everything we need: 1) key->persample_interp: means that the user has requested persample interpolation through the API. This is equivalent to having SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high enough that you actually get multiple per-sample invocations. 2) key->multisample_fbo: means that the shader will be running on an actual multi-sampled framebuffer. This commit also adds a new "persample_dispatch" bit to prog_data which indicates that the shader should be run in persample mode. This way the state setup code doesn't have to look at the fragment program or GL state and can just pull that data out of the prog_data. In theory, this shuffle could mean more recompiles. However, in practice, we were shoving enough state into the key before that we were probably hitting a recompile on every per-sample shader anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:34:05 -07:00
Jason Ekstrand	a2f50d87b6	nir: Add an info bit for uses_sample_qualifier Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-14 13:33:52 -07:00
Kenneth Graunke	59156b2e96	i965: Fix undefined df bits in brw_reg comparisons. Commit `5310bca024` added a new "double df" field to the brw_reg struct, adding an extra 4 bytes of data that isn't usually initialized (or may contain irrelevant garbage if the struct is mutated). This means that it's no longer safe to memcmp(). Instead, add a brw_regs_equal() function which ignores the extra df bits unless they matter. To keep the implementation cheap, we wrap the first set of fields in a union/struct so that we can use a single DWord comparison. v2: Drop unnecessary casts (caught by Francisco Jerez). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-14 00:18:37 -07:00
Dave Airlie	9f8867d877	i965: disable cull distance temporarily. I'll fix this up on Monday, so leave the docs changes in place. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 11:39:34 +10:00
Dave Airlie	7a6d55826e	Revert "glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)" This reverts commit `ad355652c2`. This broke a bunch of clip tests.	2016-05-14 11:39:34 +10:00
Ian Romanick	a608e946b5	docs: Mark GL_OES_shader_io_blocks as started Watch the oes_shader_io_blocks of my fd.o Mesa GIT repo for progress. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-13 17:48:46 -07:00
Kristian Høgsberg Kristensen	4e959cf9f9	docs: update ARB_cull_distance status. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-05-13 16:32:14 -07:00
Kristian Høgsberg Kristensen	c564348a2e	i965: Add support for GL_ARB_cull_distance Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-13 16:28:25 -07:00
Ilia Mirkin	a1c2444792	st/mesa: flip y coordinate of interpolateAtOffset for winsys This fixes a few dEQP tests like dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.no_qualifiers.default_framebuffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-13 19:17:41 -04:00
Ilia Mirkin	0d8e850195	glsl: make sure that textureProj(bias) variants are only exposed in fs Many were already marked as fs_only, but not all. This fixes the remaining ir_txb entries. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-13 19:17:26 -04:00
Ilia Mirkin	37c8f4c609	glsl: be more strict when validating shader inputs interpolateAt* can only take input variables or an element of an input variable array. No structs. Further, GLSL 4.40 relaxes the requirement to allow swizzles, so enable that as well. This fixes the following dEQP tests: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_struct_member dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_struct_member dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_struct_member Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisforbes@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-13 19:17:26 -04:00
Ilia Mirkin	5239f1e0c9	glsl: make sure that interpolateAt arguments are variables In the case of a constant, it might have been propagated through and variable_referenced() returns NULL. Error out in that case. Fixes 3 dEQP tests: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.negative.interpolate_constant dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.interpolate_constant dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.negative.interpolate_constant Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-13 19:17:26 -04:00
Tobias Klausmann	8f45f4f3ca	mesa/st: Add support for GL_ARB_cull_distance (v2) v2: don't bother with cull dist varyings except to assert. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:28:23 +10:00
Tobias Klausmann	2be258ea18	gallium: Add a pipe cap for arb_cull_distance This lets us safely enable or disable the extension as needed Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:28:17 +10:00
Tobias Klausmann	d656736bbf	glsl: Add arb_cull_distance support (v3) v2: make too large array a compile error v3: squash mesa/prog patch to avoid static compiler errors in bisect Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:28:08 +10:00
Tobias Klausmann	ad355652c2	glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4) This will come in handy when we want to lower gl_CullDistance into gl_CullDistanceMESA. [airlied: drop separate APIs for clip/cull - just use single API to call both passes.] v3: reexamine my sanity, this was pretty broken, the new code creates one copy of gl_ClipDistanceMESA, as the clip distance varying and lowers everything into that in two passes, one for clips one for culls. v4: rework using the passes in clip/cull sizes, instead of the array sizes. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:28:07 +10:00
Dave Airlie	dd3390e12f	glsl: rename lower_clip_distance to lower_distance. This just renames the file in anticipation of adding cull lowering, and renames the internals. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-14 08:27:40 +10:00
Tobias Klausmann	eb18fea707	mesa/main: Add support for GL_ARB_cull_distance (v2) airlied: v2: rename LowerClipDistance to LowerCombinedClipCullDistance. I don't think we want any other behaviour with any current hw. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:27:29 +10:00
Tobias Klausmann	f2a2e08e01	glapi: Add GL_ARB_cull_distance Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-14 08:27:10 +10:00
Nanley Chery	6674d018f7	anv/copy: Fix copying Images from Buffers with larger dimensions This function previously assumed that the Buffer and Image had matching dimensions. However, it is possible to copy from a Buffer with larger dimensions than the Image. Modify the copy function to enable this. v2: Use ternary instead of MAX for setting bufferExtent (Jason Ekstrand) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95292 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Matthew Waters <matthew@centricular.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-13 14:08:57 -07:00
Maarten Lankhorst	ff5c312623	.mailmap: Fix my email addresses. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>	2016-05-13 12:28:05 +02:00
Nicolai Hähnle	a694c20ecf	radeonsi/sid_tables: rename reg_table to sid_reg_table This is purely cosmetic, making it easier to assign blame for space used in the binary in case somebody else makes a similar cleanup effort in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	c7f73a70f0	radeonsi/sid_tables: store offset into global fields table instead of pointer This avoids relocations in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	54ab39caaf	radeonsi/sid_tables: store strings by offset instead of by pointer This saves some space and avoids the need for relocations. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:39 -05:00
Nicolai Hähnle	ca8f71f4cb	r600: remove TABLE_SIZE macro Use ARRAY_SIZE instead. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	43ac091e4c	r600: move alu_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	390c740b99	r600: move cf_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:38 -05:00
Nicolai Hähnle	a180e1d22d	r600: move fetch_op_table to .c file So that it gets compiled and emitted only once, saving space in the final binary. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:37 -05:00
Nicolai Hähnle	6d350fb13f	r600: protect r600_isa.h with extern "C" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-05-13 01:03:37 -05:00
Bas Nieuwenhuizen	ac77fb74a0	gallium/ddebug: Implement launch_grid. Does not implement dumping info. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-13 07:43:46 +02:00
Bas Nieuwenhuizen	22b35122fa	gallium/ddebug: Support compute states. v2: Reuse the macro for bind & delete. Note that we may not be able to share the delete long-term as pipe_compute_state contains members not in pipe_shader_state, and we need to distinguish the pointer location if we add that struct to the union. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-13 07:43:37 +02:00
Bas Nieuwenhuizen	5efe477b13	gallium/ddebug: Add passthrough for get_compute_param. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-13 07:39:12 +02:00
Ian Romanick	8f05a0a4c0	nir: Remove empty visit_call_src and visit_load_const_src functions The guts were removed in `dfb3abba`. It has been almost exactly a year, so I dont think we're going to "decide we want [predication] back." Silences several "unused parameter" warnings: nir/nir.c: In function ‘visit_call_src’: nir/nir.c:1052:32: warning: unused parameter ‘instr’ [-Wunused-parameter] visit_call_src(nir_call_instr instr, nir_foreach_src_cb cb, void state) ^ nir/nir.c:1052:58: warning: unused parameter ‘cb’ [-Wunused-parameter] visit_call_src(nir_call_instr instr, nir_foreach_src_cb cb, void state) ^ nir/nir.c:1052:68: warning: unused parameter ‘state’ [-Wunused-parameter] visit_call_src(nir_call_instr instr, nir_foreach_src_cb cb, void state) ^ nir/nir.c: In function ‘visit_load_const_src’: nir/nir.c:1058:44: warning: unused parameter ‘instr’ [-Wunused-parameter] visit_load_const_src(nir_load_const_instr instr, nir_foreach_src_cb cb, ^ nir/nir.c:1058:70: warning: unused parameter ‘cb’ [-Wunused-parameter] visit_load_const_src(nir_load_const_instr instr, nir_foreach_src_cb cb, ^ nir/nir.c:1059:28: warning: unused parameter ‘state’ [-Wunused-parameter] void *state) ^ v2: Add some comments in nir_foreach_src suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Connor Abbott <cwabbott0@gmail.com>	2016-05-12 16:47:14 -07:00
Ian Romanick	098166e1bc	nir: Silence unused parameter warnings These cases had the parameter removed: nir/nir_lower_vec_to_movs.c: In function ‘try_coalesce’: nir/nir_lower_vec_to_movs.c:124:66: warning: unused parameter ‘shader’ [-Wunused-parameter] try_coalesce(nir_alu_instr vec, unsigned start_idx, nir_shader shader) ^ nir/nir_lower_io.c: In function ‘load_op’: nir/nir_lower_io.c:147:32: warning: unused parameter ‘state’ [-Wunused-parameter] load_op(struct lower_io_state state, ^ These cases had the parameter (void) silenced because the parameter was necessary for an interface: nir/glsl_to_nir.cpp:1900:32: warning: unused parameter 'ir' [-Wunused-parameter] nir_visitor::visit(ir_barrier ir) ^ nir/nir.c: In function ‘remove_use_cb’: nir/nir.c:802:35: warning: unused parameter ‘state’ [-Wunused-parameter] remove_use_cb(nir_src src, void state) ^ nir/nir.c: In function ‘remove_def_cb’: nir/nir.c:811:37: warning: unused parameter ‘state’ [-Wunused-parameter] remove_def_cb(nir_dest dest, void state) ^ Number of total warnings in my build reduced from 2543 to 2538 (reduction of 5). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-12 16:46:41 -07:00
Leo Liu	bd9ae72459	vl/dri: fix close fd error out fd should be set to -1 only if it got closed by pipe_loader_release. Signed-off-by: Leo Liu <leo.liu@amd.com>	2016-05-12 18:26:48 -04:00
Samuel Pitoiset	988b09f9ac	nvc0: fix indentation in nvc0_invalidate_resource_storage() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Samuel Pitoiset	abb3401095	nvc0: save some CPU cycles in nvc0_context_unreference_resources() This reduces the number of loop iterations for invalidating buffers and images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Samuel Pitoiset	b8f0b00a9a	nvc0: invalidate texture buffers for compute This is a pretty rare situation, but it can happen. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 21:37:08 +02:00
Tim Rowley	2785f2f2d7	swr: properly expose compressed format support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-12 14:12:18 -05:00
Jason Ekstrand	5186545d66	anv: Don't advertise shaderImageGatherExtended We don't actually support all of the extended gather functionality so we shouldn't be advertising it.	2016-05-12 10:57:00 -07:00
Rob Clark	9d3cc80b75	nir: glsl_get_bit_size() should take glsl_type It's what all the call-sites want, so this gets rid of a bunch of inlined glsl_get_base_type() at the call-sites. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-12 13:39:40 -04:00
Topi Pohjolainen	b19cff1639	i965/gen9: Enable lossless compression I tried first creating the auxiliary buffer the same time with the color buffer. That, however, led me into a situation where we would later create the rest of the mip-levels and the compression would need to be disabled (it is only supported for single level buffers). Here we try to create it on demand just before the hardware starts to render. This is similar to what we do with fast clear buffers, their creation is deferred until the first clear. This setup also gives the opportunity to detect if the miptree represents the temporary texture used internally in the mesa core. This texture is mostly written by cpu and therefore enabling compression for it doesn't make much sense. Note that a heuristic is included. Floating point formats are not enabled yet as they are only seen to hurt performance. Some highlights with window system driver kept fixed to default and only the application driver changing: Manhattan: 8.32152% +/- 0.355881% Offscreen: 9.09713% +/- 0.340763% Glb trex: 8.46231% +/- 0.460624% Offscreen: 9.31872% +/- 0.463743% v2 (Ben): Re-use msaa layout type for single sampled case. v3: Moved the deferred allocation of mcs to brw_try_draw_prims() and brw_blorp_blit_miptrees() instead. v4: (Ken): Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD when allocating mcs. Do not enable for scanout buffers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	cd9e97a020	i965: Set render state for lossless compressed v2: Add support for blorp and removed the support for meta v3 (Ben): Add assertion on compressed non-fast clear - must be partial clear. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	cda8c2a911	i965/wm: Don't sample lossless compressed as multisampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	683dda0083	i965/gen9: Setup MCS for compressed texture surfaces Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	1a05aeeb1c	i965/blorp: Do not resolve lossless compressed blit sources Blorp blits use sampling engine which is capable of resolving on the fly. Buffers are still resolved for blitter engine. Current understanding is that blitter doesn't understand lossless compression. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	01ba26d0b0	i965/blorp: Prepare blits for lossless compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:37 +03:00
Topi Pohjolainen	84066ebd63	i965: Deferred allocation of mcs for lossless compressed Until now mcs was associated to single sampled buffers only for fast clear purposes and it was therefore the responsibility of the clear logic to allocate the aux buffer when needed. Now that normal 3D render or blorp blit may render with mcs enabled also, they need to prepare the mcs just as well. v2: Do not enable for scanout buffers v3 (Ben): - Fix typo in commit message. - Check for gen < 9 and return early in brw_predraw_set_aux_buffers() - Check for gen < 9 and return early in intel_miptree_prepare_mcs() v4: Check for msaa_layout and number of samples to determine if lossless compression is to be used. Otherwise one cannot distinguish between fast clear with and without compression. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:26 +03:00
Topi Pohjolainen	1ca02b6ebb	i965: Add flag telling if miptree is for client consumption Consider later on adding specific disable flags such as MIPTREE_LAYOUT_DISABLE_AUX_MCS = 1 << 3, /* CCS_D */ MIPTREE_LAYOUT_DISABLE_AUX_CCS_E = 1 << 4, MIPTREE_LAYOUT_DISABLE_AUX = MIPTREE_LAYOUT_DISABLE_AUX_MCS | MIPTREE_LAYOUT_DISABLE_AUX_CCS_E, and equivalent boolean/enums into miptree. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	a6e0f1cc7f	i965: Add helper for lossless compression support v2: Check explicitly against base type of GL_FLOAT instead of using _mesa_is_format_integer_color(). Otherwise we miss GL_UNSIGNED_NORMALIZED. v3 (Ben): Also call intel_miptree_supports_non_msrt_fast_clear() in order to really check everything. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	874c5f05db	i965/gen9: Prepare surface state setup for lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). v3 (Ben): Do not set fast clear state in surface state setup. Moved into brw_postdraw_set_buffers_need_resolve() using a separate patch. v4: Support for blorp v5 (Ben): Re-use gen8_get_aux_mode() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	a8544267fd	i965/gen8: Expose auxiliary mode resolver Also use the opportunity to drop the unused surface type argument. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	94926492d8	i965: Relax assertion of halign == 16 for lossless compressed aux Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	ba9f954e60	i965/blorp: Set full resolve for lossless compressed v2 (Ben): Introduce union for fast clear and resolve ops Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:49:22 +03:00
Topi Pohjolainen	58e7392e12	i965/blorp: Do not skip fast color clear with new color This hasn't been visible before. It showed up with lossless compression with: dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgb8 Current fast clear logic kicks color resolves even for gpu sampling. In the test case this results into trashing of the fast color clear state between two subsequent clears, and therefore each clear is performed correctly. With lossless compression the resolves are unnecessary and therefore the clear state indicates that the buffer is already cleared. Without considering if the previous color value was the same as the new, clears that need to be performed are skipped and the buffer ends up holding old pixel values. v2 (Ken): Fix the comparison for gen < 9 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-12 19:48:47 +03:00
Kenneth Graunke	12dcad1b42	i965: Enable scalar GS by default. I'd originally left this off because Orbital Explorer was hanging the GPU, but it seems to be working these days. There have been a bunch of changes since then, so we probably fixed something. On my Broadwell laptop, both Synmark/GSCloth and Orbital Explorer seem to run at approximately the same framerate in either mode. This is despite large reductions in instruction count for Synmark, and large increases for Orbital Explorer. It apparently just doesn't matter. Switching to scalar mode will gain us fp64 support in the next release, as vec4-mode support isn't yet ready. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	607fb0f13d	i965: Reduce the SIMD8 GS push constant threshold from 32 to 24. Three Shadow of Mordor geometry shaders increase by a single instruction, but the number of spills/fills in Orbital Explorer is reduced from 194:1279 -> 82:454. No other programs are affected. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	3aa542c657	i965: Delete bogus assertion in emit_gs_input_load(). This looks like leftover cruft from an earlier attempt at writing point size hacks. Each vertex has its own copy of gl_PointSize, so accessing any vertex other than 0 would cause this to fail. The tests seem to work fine without it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:42 -07:00
Kenneth Graunke	1c41cb58de	i965: Support instanced GS inputs in the scalar backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 01:01:36 -07:00
Kenneth Graunke	5fc3772650	i965: Use an early return for the push case in emit_gs_input_load(). Just trying to keep things from getting too ugly in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-12 00:59:08 -07:00
Kenneth Graunke	e9ca952581	i965: Drop BRW_NEW_BLORP from stipple and line parameter packets. BLORP never touches these, and they're all non-pipelined. Some are fairly large packets as well. I haven't tried to benchmark this; the effect is likely to be small. However, we may as well stop the pointless papercuts; maybe they'll add up someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-12 00:54:37 -07:00
Jakob Sinclair	18f7c88dd6	glsl: fixed uninitialized pointer Class "ir_constant" had a bunch of constructors where the pointer member "array_elements" had not been initialized. This could have led to unsafe code if something had tried to write anything to it. This patch fixes this issue by initializing the pointer to NULL in all the constructors. This issue was discovered by Coverity. CID: 401603, 401604, 401605, 401610 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-12 09:46:36 +02:00
Ilia Mirkin	ba3f0b6d59	nvc0: fix gl_SampleMaskIn computation The SAMPLEMASK semantic should only return the bits set covered by the current invocation. However we were always retrieving the covmask, which returns the covered samples of the whole pixel. When not doing per-sample invocation, this is precisely what we want. However when doing per-sample invocation, we have to select the sampleid'th bit and only return that. Furthermore, this means that we have to have a 1:1 correlation for invocations and samples. This fixes most dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* tests. A few failures remain due to disagreements about nr_samples==1 logic as well as what happens with MSAA x2 RTs when the shading fraction is 0.5. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-11 20:39:27 -04:00
Ilia Mirkin	f5fe903002	nv50/ir: generalize interp fixups to be able to fixup anything Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-11 20:39:26 -04:00
Jason Ekstrand	66a442687f	.mailmap: Update the e-mail addresses for Kristian Høgsberg This changes it to use his personal e-mail and adds his @intel.com address Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-11 12:32:14 -07:00
Jason Ekstrand	7e759fbd60	.mailmap: Use Connor Abbott's personal e-mail	2016-05-11 12:27:15 -07:00
Giuseppe Bilotta	9c3392cb3a	Add .mailmap This adds a first tentative .mailmap file, to canonicalize contributor name/emails in shortlogs and other statistical endeavours. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:21:46 -07:00
Jason Ekstrand	f1dcc7976a	i965: Stop splitting fma() prior to optimization According to the GLSL spec, if the user uses the fma() intrinsic to generate a precise-consumed value, and you have it in your hardware, you shouldn't split it. For a while now, we've been splitting all ffma's up-front and then planned to fuse them later which isn't valid. Correctly handling the GLSL behaviour fixes rendering corruptions in Tomb Raider. The only reason why doing this possibly helped before was for ARB programs which is handled by the previous commit. Shader-db results on Haswell: total instructions in shared programs: 7560300 -> 7561510 (0.02%) instructions in affected programs: 56265 -> 57475 (2.15%) helped: 86 HURT: 291 The only shaders in the database that are affected are from "Shadow of Mordor" which is the first app in our database to use fma(). We could, at some point in the future, split inexact ffma opcodes which would fix the shader-db regressions since Shadow of Mordor doesn't use precise. However, this fixes a bug now and the shader-db impact is fairly small. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Jason Ekstrand	47f01e538a	ptn: Emit mul+add for MAD Unlike fma() in GLSL, MAD in ARB programs is 100% splittable. Just emit the split version and let the optimizer fuse them later. Shader-db results on Haswell: total instructions in shared programs: 7560379 -> 7560300 (-0.00%) instructions in affected programs: 143928 -> 143849 (-0.05%) helped: 443 HURT: 250 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Jason Ekstrand	1b72c31e1f	nir/algebraic: Separate ffma lowering from fusing The i965 driver has its own pass for fusing mul+add combinations that's much smarter than what nir_opt_algebraic can do so we don't want to get the nir_opt_algebraic one just because we didn't set lower_ffma. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 11:44:35 -07:00
Rob Clark	5886d1bad1	anv: fix build break Previous rename of lower-output-to-temps pass predated merging of anv, and apparently vulkan wasn't enabled in my local builds so overlooked this when rebasing. Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 14:03:24 -04:00
Rob Clark	697382eb61	mesa/st: split the type_size calculation into its own file We'll want to re-use this for NIR. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:12 -04:00
Rob Clark	0e5a369879	glsl: export accessor for builtin-uniform descriptors We'll need this for a nir pass to lower builtin-uniform access. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:12 -04:00
Rob Clark	dfbabc6bad	nir/lower-io: add support for lowering inputs Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	595f9d5476	nir/lower-io: split out some helper fxns Prep work to reduce the noise in the next patch. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b085016f94	nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries Since it will gain support to lower inputs, give it a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 12:20:11 -04:00
Rob Clark	47fcef9a20	nir: move callsite of lower_outputs_to_temporaries Going to convert this pass to parameterized lower_io_to_temporaries, and we want the user to be able to specify whether to lower outputs or inputs or both. The restriction of running this pass before validate to avoid output reads no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	5261947260	nir: lower-io-types pass A pass to lower complex (struct/array/mat) inputs/outputs to primitive types. This allows, for example, linking that removes unused components of a larger type which is not indirectly accessed. In the near term, it is needed for gallium (mesa/st) support for NIR, since only used components of a type are assigned VBO slots, and we otherwise have no way to represent that to the driver backend. But it should be useful for doing shader linking in NIR. v2: use glsl_count_attribute_slots() rather than passing a type_size fxn pointer Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b10cc24519	nir: passthrough-edgeflags support Handled by tgsi_emulate for glsl->tgsi case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	3a939d034e	nir: add lowering pass for glBitmap Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	12c18ce476	nir: add lowering pass for glDrawPixels Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	b26645a00f	nir: add lowering pass for y-transform Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-11 12:20:11 -04:00
Rob Clark	e1d80f8603	gallium: add NIR as a possible IR Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:11 -04:00
Rob Clark	425dc4c4b3	gallium: refactor pipe_shader_state to support multiple IR's The goal is to allow the pipe driver to request something other than TGSI, but detect whether what is getting is TGSI vs what it requested. The pipe drivers will always have to support TGSI (and convert that into whatever it is that they prefer), but in some cases we should be able to skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir). I think pipe_compute_state should get similar treatment. Currently, afaict, it has one user and one consumer, which has allowed it to be sloppy wrt. supporting alternative IR's. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-11 12:20:11 -04:00
Rob Clark	4500d17245	freedreno: fix multi-layer transfer_map's The use of transfer_inline_write() in TexSubImage path (see `fb9fe352ea`) exposed a bug for "layer_first" resources (ie. a4xx) not setting correct layer_stride. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-11 12:03:21 -04:00
Juan A. Suarez Romero	9bea018994	glsl: use var with initializer on global var validation Currently, when cross validating global variables, all global variables seen in the shaders that are part of a program are saved in a table. When checking a variable that already exists in the table, we check that both are initialized to the same value. If the already saved variable does not have an initializer, we copy it from the new variable. Unfortunately this is wrong, as we are modifying something that is constant. Also, if this modified variable is used in another program, it will keep the initializer, when it should have none. Instead of copying the initializer, this commit replaces the old variable with the new one. So if we see again the same variable with an initializer, we can compare if both are the same or not. v2: convert tabs to whitespace (Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-11 13:50:04 +02:00
Jordan Justen	2c1c060b03	util/ralloc: Remove double zero'ing of rzalloc buffers Juha-Pekka found this back in May 2015: <1430915727-28677-1-git-send-email-juhapekka.heikkila@gmail.com> From the discussion, obviously it would be preferable to make ralloc_size no longer return zeroed memory, but Juha-Pekka found that it would break Mesa. In <56AF1C57.2030904@gmail.com>, Juha-Pekka mentioned that patches exist to fix i965 when ralloc_size is fixed to not zero memory, but the patches have not made their way to mesa-dev yet. For now, let's stop doing the double zeroing of rzalloc buffers. v2: * Move ralloc_size code to rzalloc_size, and add a comment as suggested by Ken. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 22:54:46 -07:00
Jonathan Gray	e3d43dc5ea	genxml: avoid using a GNU make pattern rule % pattern rules are a GNU extension. Convert the use of one to an inference rule to allow this to build on OpenBSD. v2: inference rules can't have additional prerequisites so add a target rule to still depend on gen_pack_header.py Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-10 20:54:33 -07:00
Roland Scheidegger	430797843a	gallivm: improve dumping of bitcode Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode. Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple modules can be dumped (albeit it might still overwrite previous modules, particularly the modules from draw tend to always have the same name). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-11 04:43:35 +02:00
Vinson Lee	8d639138c7	swr: [rasterizer] Include cmath for std::isnan and std::isinf. This patch fixes this build error. CXX rasterizer/memory/libswrAVX_la-ClearTile.lo In file included from rasterizer/memory/ClearTile.cpp:34:0: ./rasterizer/memory/Convert.h: In function ‘uint16_t Convert32To16Float(float)’: ./rasterizer/memory/Convert.h:170:9: error: ‘__builtin_isnan’ is not a member of ‘std’ if (std::isnan(val)) ^ ./rasterizer/memory/Convert.h:170:9: note: suggested alternative: <built-in>: note: ‘__builtin_isnan’ ./rasterizer/memory/Convert.h:176:14: error: ‘__builtin_isinf_sign’ is not a member of ‘std’ else if (std::isinf(val)) ^ ./rasterizer/memory/Convert.h:176:14: note: suggested alternative: <built-in>: note: ‘__builtin_isinf_sign’ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95180 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-05-10 17:11:05 -07:00
Jason Ekstrand	a5660bf1f8	i965/blorp: Don't blend integer values during MSAA resolves Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 15:32:00 -07:00
Jason Ekstrand	4f4f393bf3	meta/blit: Don't blend integer values during MSAA resolves Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 15:31:50 -07:00
Jason Ekstrand	203c786a73	i965/fs: Default all constants to a location of -1 Otherwise constants which aren't live get an undefined constant location. When we go to set up param and pull_param we end up assigning all unused uniforms to slot 0. This causes the Vulkan driver to segfault because it doesn't have pull_param. This fixes bugs in the Vulkan driver introduced in `c3fab3d000`. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-05-10 15:25:30 -07:00
Dave Airlie	d36d11ad90	st/glsl_to_tgsi: attach image to correct instruction for samples This fixes a crash (but not the test): GL45-CTS.shader_texture_image_samples_tests.functional_test Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:55:09 +10:00
Dave Airlie	07df3b81ff	mesa: move MESA_MAP_NOWAIT_BIT up away from GL_MAP_PERSISTENT_BIT This was colliding badly and making GL45-CTS.buffer_storage.map_persistent_texture fail on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:54:56 +10:00
Dave Airlie	b230d51a18	mesa/meta: check for signed/unsigned int conversion for pbo getteximage When doing GetTexSubImage using a PBO, we should check if it involves a signed/unsigned conversion and bail if it does, just like in the other cases. This fixes: GL33-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo on Haswell at least. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95324 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-11 06:52:20 +10:00
Matt Turner	8bb156a261	i965: Handle BRW_OPCODE_DO on Gen6+ in brw_instruction_name(). This became a problem after the recent disassembler changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 12:12:46 -07:00
Bas Nieuwenhuizen	3d21720d31	radeonsi: Set declared tessellation LDS size to hardware size. The calculated limit gave problems on SI as it was > 32 KiB and the hardware LDS size on SI is only 32 KiB. It isn't correct anyway when processing multiple patches in a threadgroup. As we potentially have any number of patches such that the used LDS is at most the hardware LDS size, and exact size per patch is not known at compile time, this seems like the only valid bound. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-10 20:14:55 +02:00
Rob Clark	8623e599fc	freedreno/ir3: size input/output arrays properly We index into these based on var->data.driver_location, which might have gaps (ie. two inputs, one w/ drvloc 0 and other 2). This shows up in (for example) 'bin/copyteximage 1D', but was only noticed recently due to additional asserts. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-10 13:17:27 -04:00
Ian Romanick	2483a9a08c	ir_to_mesa: Emit smarter ir_binop_logic_or for vertex programs Continue using ADD in the other case because a fragment shader backend could fuse the ADD with a MUL to generate a MAD for ((x && y) \|\| z). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	f7328f9afd	prog: Delete all remains of OPCODE_SNE, OPCODE_SEQ, OPCODE_SGT, and OPCODE_SLE There is nothing left that can generate them. These used to be generated by ir_to_mesa or by the assembler for various NV extensions that have been removed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	fd63e77998	ir_to_mesa: Do not emit OPCODE_SEQ or OPCODE_SNE Nothing that consumes the output of this backend consumes them natively. This is not the way i915 has implemented these instructions, but, as far as I am able to tell, this is the way both the Cg compiler and the HLSL compiler implement these operations. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Ian Romanick	15e6a1a3be	ir_to_mesa: Do not emit OPCODE_SLE or OPCODE_SGT Nothing that consumes the output of this backend consumes them natively. This is the way i915 has implemented these instructions since it began consuming GLSL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-10 09:22:18 -07:00
Samuel Pitoiset	e46ac18ebe	nvc0: enable compute support by default on GK110+ Compute support seems to be pretty stable now, and according to piglit it doesn't seem to break 3D state. As a side effect, this will expose ARB_compute_shader on GK110/GK208. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-10 17:47:01 +02:00
Marek Olšák	2b58bc4461	gallium/radeon: don't flush the GFX IB if DMA doesn't depend on it Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	fb89f06698	radeonsi: consolidate radeon_add_to_buffer_list calls for DMA Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	60946c0d60	gallium/radeon: add a heuristic for better (S)DMA performance Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	bb74152597	gallium/radeon: flush if DMA IB memory usage is too high This prevents IB rejections due to insane memory usage from many consecutive texture uploads. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	70934de00e	radeonsi: add new SDMA texture copy code This implements: - Linear-to-linear partial copies. (unaligned) - Tiled-to-linear and linear-to-tiled partial copies. (unaligned except 1-2 Bpp) - Tiled-to-tiled partial copies aligned to 8x8. v2: Extend the SDMA L2T VM fault workaround to T2L. - Same algorithm, just applied to T2L. (and using a 0-based address and surface.bo_size instead of buf->size) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	a512da36ae	gallium/radeon: fix (S)DMA read-after-write hazards Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	f837c37f02	radeonsi: raise the max size for SDMA buffer copies Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	faa4f0191d	radeonsi: remove SDMA texture copy code Most of this has never worked according to the new test. The new code will be radically different. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	498a40cae8	radeonsi: only expose _init_dma_functions from (S)DMA files just normalizing the interfaces Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	3af28e558f	gallium/radeon: implement randomized SDMA texture copy testing (v2) v2: - adjustments for exercising all important SDMA code paths - decrease the probability of getting huge sizes (faster testing) - increase the probability of getting power-of-two dimensions - change the memory cap to 128MB (faster testing) - better detect which engine has been used Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	f475c9fb07	gallium/radeon: discard CMASK or DCC if overwriting a whole texture by DMA v2: simplify the conditionals Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	2f173b8e13	gallium/radeon: use a common function for DMA blit preparation this is more robust and probably fixes some bugs already Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	2af4b637d8	gallium/radeon: split out code for discarding DCC Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	c85d0c17d9	gallium/radeon: rename r600_texture_disable_cmask -> discard_cmask because it doesn't decompress Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	fb9fe352ea	st/mesa: use transfer_inline_write for memcpy TexSubImage path This allows drivers to use their own fast path for texture uploads. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	871d2aff24	gallium/radeon: fix partial layered transfers of cube (array) textures a staging cube texture with array_size % 6 != 0 doesn't work very well just use 2D_ARRAY or 2D for all staging textures Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	c2377b394b	gallium/radeon: align alignments for better buffer reuse It's for the buffer cache. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	544967faf5	gallium/radeon: use gart_page_size instead of hardcoded 4096 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	bfa8a00920	winsys/radeon: use gart_page_size instead of private size_align Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	9d8c283f28	winsys/amdgpu: move gart_page_size to struct radeon_winsys Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Roland Scheidegger	e4cf8717de	gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir Those aren't really interesting, however outputting them is helpful when trying to feed the IR to llvm llc (or opt) for debugging. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	5c200894c8	gallivm: use InternalLinkage instead of PrivateLinkage for texture functions At least with MCJIT the disassembler will crash otherwise when trying to disassemble such functions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	8b66e2647d	gallivm: disable avx512 features We don't target this yet, and some llvm versions incorrectly enable it based on cpu string, causing crashes. (Albeit this is a losing battle, it is pretty much guaranteed when the next new feature comes along llvm will mistakenly enable it on some future cpu, thus we would have to proactively disable all new features as llvm adds them.) This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-05-10 17:08:16 +02:00
Jose Fonseca	94e8653a3b	Revert "nir: Try to warn when C99 extensions are used in nir headers." This reverts commit `99474dc29b`. -Wpedantic is too verbose, even when applied to just a few includes. We'll just have to deal with the issues as they come. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-10 03:29:24 -07:00
Samuel Iglesias Gonsálvez	4c9006f957	i965/fs: fix MOV_INDIRECT exec_size for doubles In that case, the writes need two times the size of a 32-bit value. We need to adjust the exec_size, so it is not breaking any hardware rule. v2: - Add an assert to verify type size is not less than 4 bytes (Jordan). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	75ada43a3a	i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT v2: - Fix assert's line width (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	03687ab77f	i965/fs: demote_pull_constants() did not take into account double types The constants could be double, and it was allocating size for float types for the destination register of varying pull constant loads. Then the fs_visitor::validate() will complain. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:09 +02:00
Samuel Iglesias Gonsálvez	c3fab3d000	i965/fs: push first double-based uniforms in push constant buffer When there is a mix of definitions of uniforms with 32-bit or 64-bit data type sizes, the driver ends up doing misaligned access to double based variables in the push constant buffer. To fix this, this patch pushes first all the 64-bit variables and then the rest. Then, all the variables would be aligned to its data type size. v2: - Fix typo and improve comment (Jordan). - Use ralloc(NULL,...) instead of rzalloc(mem_ctx,...) (Jordan). - Fix typo (Topi). - Use pointers instead of references in set_push_pull_constant_loc() (Topi). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	193cb67a84	i965/fs: recognize writes with a subreg_offset > 0 as partial Usually, writes to a subreg_offset > 0 would also have a stride > 1 and we would recognize them as partial, however, there is one case where this does not happen, that is when we generate code for 64-bit immediates in gen7, where we produce something like this: mov(8) vgrf10:UD, <low 32-bit> mov(8) vgrf10+0.4:UD, <high 32-bit> and then we use the result with a stride of 0, as in: mov(8) vgrf13:DF, vgrf10<0>:DF Although we could try to avoid this issue by producing different code for this by using writes with a stride of 2, that runs into other problems affecting gen7 and the fact is that any instruction that writes to a subreg_offset > 0 is a partial write so we should really recognize them as such. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	34ed61b334	i965/fs/lower_simd_width: Fix registers written for split instructions When the original instruction had a stride > 1, the combined registers written by the split instructions won't amount to the same register space written by the original instruction because the split instructions will use a stride of 1. The current code assumed otherwise and computed the number of registers written by split instructions as an equal share based on the relation between the lowered width and the original execution size of the instruction. It is only after the split, when we interleave the components of the result from the lowered instructions back into the original dst register, that the original stride takes effect and we write all the registers specified by the original instruction. Just make the number of register written the same as the vgrf space we allocate for the dst of the split instruction. Fixes crashes in fp64 tests produced as a result of assigning incorrectly the number of registers written by split instructions, which led to incorrect validation of the size of the writes against the allocated vgrf space. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	9741cff1ec	i965/fs: rename our lower_d2f pass to lower_d2x Since it no longer handles conversions from double to float but from double to various other 32-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:09 +02:00
Iago Toral Quiroga	efaf62a40a	i965/fs: implement i2d and u2d Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	c63a6f2149	i965/fs: implement d2i and d2u These need the same treatment as d2f, so generalize our d2f lowering to cover these too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	e0c45182e3	i965/fs: implement d2b v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	80f60a4302	i965/fs: implement fsign() for doubles v2 (Sam): - Fix indentation (Kenneth) - Simplify code (Kenneth) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	c9ecd651e6	i965/fs: add null_reg_df Probably not needed since we fix the dst type of comparisons automatically, but for consistency with the rest of null_reg_* functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	e8a8fc9563	i965/fs: We only support 32-bit integer ALU operations for now Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Iago Toral Quiroga	9e5ce151a4	i965/fs: handle fp64 opcodes in brw_do_channel_expressions In the case of the pack opcode we are already doing the lowering in NIR, so no need to do it here. The unpack opcode operates on scalars, so it should not be lowered. In the case of frexp_sig and frexp_exp, they are lowered in lower_instructions, so we don't have to care about them. All the remaining opcodes involve conversions from and to doubles and are business as usual. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	a644b0939d	i965/fs: add support for f2d and d2f Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	9e1b3ea199	i965/fs: add a pass for legalizing d2f We need to do this late, in order to avoid partial writes during the optimization loop. v2: Use subscript() instead of stride(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:08 +02:00
Connor Abbott	2286a74e3b	i965/fs: fix dst width calculation in CSE v2 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:08 +02:00
Connor Abbott	fccd15524f	i965/fs: fix regs_written in LOAD_PAYLOAD for doubles v2: Account for the stride of the dst (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:07 +02:00
Connor Abbott	6b6d68ae07	i965/fs: fix is_copy_payload() for doubles v2 (Sam): - LOAD_PAYLOAD treats each header source as a 32B block regardless of the datatype. Drop the change (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:07 +02:00
Connor Abbott	e83f51d54e	i965/fs: fix compares for doubles The destination has to have the same type as the source, or else the simulator will complain. As a result, we need to emit a CMP that outputs a 64-bit wide result and then do a strided MOV to pick out the low 32 bits of each channel. v2: Use subscript() instead of stride() (Curro) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	a5d7e144ea	i965/fs: extend exec_size halving in the generator The HW has a restriction that only vertical stride may cross register boundaries. Previously, this only mattered for SIMD16 instructions where we needed to use the same regioning parameters as the equivalent SIMD8 instruction but double the exec size. But we need to do the same splitting for 64-bit instructions as well as instructions with a stride of 2 (which effectively consume 64 bits per element). Fix up the code to do the right thing instead of special-casing SIMD16. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	4f3888c1ca	i965/fs: fix assign_constant_locations() for doubles Uniform doubles will read two registers, in which case we need to mark both as being live. v2 (Sam): - Use a formula to get the number of registers read with proper units (Curro). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:07 +02:00
Connor Abbott	cc64c9e441	i965/fs: use byte_offset() in offset() for uniforms This makes things more consistent, and also fixes the offset calculation for double uniforms. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:07 +02:00
Connor Abbott	fe949949a9	i965/fs: handle uniforms in byte_offset() v2: Do it only for uniforms (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	1f51aada3f	i965/fs: fix type_size() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Iago Toral Quiroga	935e0e305d	i965/fs: optimize unpack double When we are actually unpacking from a double that we have previously packed from its 32-bit components we can bypass the pack operation and source from its arguments directly. v2 (Sam): - Fix line overflow (Topi) - Bail if the parent instruction's source is not SSA (Connor) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Iago Toral Quiroga	ba1907f040	i965/fs: optimize pack double When we are actually creating a double using values obtained from a previous unpack operation we can bypass the unpack and source from the original double value directly. v2: - Style changes (Topi) - Bail if parent instruction's src is not SSA (Connor) v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	7782f39e75	i965/fs/nir: translate double pack/unpack v2 (Sam): - Fix line overflow (Topi). v3: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	fd763177c1	i965/fs: add a pass for lowering PACK opcodes v2: Use subscript() instead of stride() (Curro) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:06 +02:00
Connor Abbott	ba582e58cd	i965/fs: add PACK opcode Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Francisco Jerez	cc3bae5cd7	i965/fs: Introduce helper to extract a field from each channel of a register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-05-10 11:25:05 +02:00
Connor Abbott	d17cdacba3	i965/fs: always pass the bitsize to brw_type_for_nir_type() v2 (Sam): - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float() v3 (Sam): - Fix line width (Topi). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	a308bae58f	i965/fs: add support for printing double immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f2e227d5c	i965/fs: don't propagate 64-bit immediates They can only be used with 1-src instructions, which practically (since we should've constant-propagated away all 1-src instructions with 64-bit immediates in NIR) means that they must be kept in separate MOV's and can't be propagated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:05 +02:00
Connor Abbott	0f1690fd95	i965/fs: use the NIR bit size when creating registers v2 (Iago): - Squashed bits from 'support double precission constant operands for the implementation of 64-bit emit_load_const'. - Do not use BRW_REGISTER_TYPE_D for all 32-bit registers since that breaks asserts and functionality for some piglit tests. Just keep 32-bit types untouched and add 64-bit support. - Use DF instead of Q for 64-bit registers. Otherwise the code we generate will use Q sometimes and DF others and we hit unwanted DF/Q conversions, so always use DF. v3 (Sam): - Mark 'reg_type' occurrences as const (Topi). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Palli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Connor Abbott	76de7af8e2	i965: fixup uniform setup for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3210870b34	i965: two-argument instructions can only use 32-bit immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	3d10adf603	i965: fix brw_abs_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:04 +02:00
Iago Toral Quiroga	830d87840c	i965: fix brw_saturate_immediate() for doubles v2 (Sam): - Mark 'size' as const (Topi). - Add comment to explain that we copy 64 bits regardless of the type (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-10 11:25:03 +02:00
Connor Abbott	7bcc4cccad	i965: fix is_zero(), is_one() and is_negative_one() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	2ae409286c	i965: fix brw_negate_immediate() for doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	cbf7c7f099	i965/eu: add support for DF immediates v2 (Sam): - Remove 'however' from the comment (Topi) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	c0a1cd24a8	i965: add support for disassembling DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	bb175db16b	i965: add support for getting/setting DF immediates Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:03 +02:00
Connor Abbott	5310bca024	i965: add brw_imm_df v2 (Iago) - Fixup accessibility in backend_reg Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	9add73f641	i965/eu: Allow 3-src float ops with doubles v2: - set 3src_src_type for BRW_REGISTER_TYPE_DF (Connor) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	367e762a71	i965/disasm: fix disasm of 3-src doubles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	45066a6a59	i965: Tell backend register about double precision type Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	520b3b2fd1	i965: Determine size of double precision float register This is used to determine how many registers an instruction reads and writes as well as for offsetting a register region into a desired component. v2 (Connor): rebase on master Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Topi Pohjolainen	e88cf0f2d2	i965: Lower DFRACEXP/DLDEXP v2 (Connor): rebase on master which moved this to brw_link.cpp v3 (Sam): - Only enable DFREXP_DLDEXP_TO_ARITH in process_glsl_ir(). This is used for doubles. Single floating point op is lowered by NIR. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:02 +02:00
Connor Abbott	30424fd25a	i965: use pack/unpackDouble lowering Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Connor Abbott	bea2f8beb5	i965: use double lowering pass v2: also lower trunc, ceil, floor, fract and roundEven (Iago) v3: also lower mod for doubles (Sam) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	d00a239b28	freedreno/ir3: lower lrp when operating with double operands Lower lrp when operating with double operands because float version of lrp is also lowered. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:01 +02:00
Samuel Iglesias Gonsálvez	93e690830a	i965: enable lrp lowering for doubles Broadwell and previous generations do not support the lrp instruction operating with doubles. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-10 11:25:01 +02:00
Dave Airlie	008feb3687	st/glsl_to_tgsi: brown paper bag for the input offsets fix. Oops, thanks compiler. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:41:21 +10:00
Dave Airlie	4d8a71f7f1	glsl: check geometry output vertices limits. This fixes: GL45-CTS.geometry_shader.limits.max_output_vertices Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:26:03 +10:00
Dave Airlie	13c68e1447	mesa/vbo: fix check for zero aliases with 2/10/10/10 This fixes: GL33-CTS.gtf33.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_attrib Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 14:24:49 +10:00
Eduardo Lima Mitev	60a5d02416	nir/print: Print memory qualifiers in a variable declaration Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:22:05 +02:00
Eduardo Lima Mitev	7f7f58f17f	glsl: Apply memory qualifiers to vars inside named block interfaces This is missing and memory qualifiers are currently being ignored for SSBOs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-10 06:21:55 +02:00
Dave Airlie	f75a26d1ba	st/glsl_to_tgsi: handle offsets from inputs This fixes: GL45-CTS.gpu_shader5.texture_gather_offset_color_repeat Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 13:14:29 +10:00
Rob Clark	aa730aca20	scripts: bump git_reviewer.pl --git-min-percent default Bump up default percentage of commits required to be auto-picked for CC. Seems from a bit of trial-and-error to come up with a more reasonable list of CC's this way. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 19:30:28 -04:00
Kenneth Graunke	e034d80fe1	Revert "Revert "i965: Switch to scalar TCS by default."" This reverts commit `bd326c229c`. Now that we've fixed the GPU hangs, let's turn it back on. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:27 -07:00
Kenneth Graunke	5ce405ba0f	i965: Actually assign binding table offsets for the TCS. As far as I can tell, this was just entirely missing...honestly, I'm not sure how anything worked at all. Caught by noticing GPU hangs in image load store tests with scalar TCS, but probably has broader implications. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-09 16:20:18 -07:00
Kenneth Graunke	e0e7280db0	i965: Clamp "Maximum VP Index" to 1 when gl_ViewportIndex isn't written. fs_visitor::emit_urb_writes skips writing the VUE header for shaders that don't write gl_PointSize, gl_Layer, or gl_ViewportIndex. This leaves their values uninitialized. Kristian's nearby comment says: "But often none of the special varyings that live there are written and in that case we can skip writing to the vue header, provided the corresponding state properly clamps the values further down the pipeline." However, we were clamping gl_ViewportIndex to [0, 15], so we would end up using a random viewport. To fix this, detect when the shader doesn't write gl_ViewportIndex, and clamp it to [0, 0]. The vec4 backend always writes zeros to the VUE header, so it doesn't suffer from this problem. With vec4-style HWord writes, we can write the header and position together in a single message. In the FS world, we would need 4 extra MOVs of 0 and a longer message, or a separate OWord write. It's likely cheaper to just clamp the value. Fixes DiRT Showdown and Bioshock Infinite, which only rendered half of the screen - the lower left of two triangles. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93054 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-09 15:31:27 -07:00
Jordan Justen	e74812dbfe	i965/hsw: Fix brw_store_data_imm* For Gen6 through Haswell dword 1 is MBZ. In gen 8 it becomes part of the 64-bit address. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-09 15:05:08 -07:00
Kenneth Graunke	96d43f2d08	i965: Reimplement ARB_transform_feedback2 on Haswell and later. My old implementation accumulated <start, end> pairs in a buffer, and eventually processed that data on the CPU. This meant flushing the batchbuffer and waiting for it to completely execute before we could map it, resulting in really long stalls. We could also run out of space in the buffer, and have to do this early. Instead, we can use Haswell's MI_MATH command to do the (end - start) subtraction, as well as the multiplication by 2 or 3 to convert from the number of primitives written to the number of vertices written. We still need to CS stall to read the counters, but otherwise everything is completely pipelined - there's no CPU<->GPU synchronization required. It also uses only 80 bytes in the buffer, no matter what. Improves performance in Manhattan on Skylake GT3e at 800x600 by 6.1086% +/- 0.954166% (n=9). At 1920x1080, improves performance by 2.82103% +/- 0.148596% (n=84). v2: Fix number of primitives -> number of vertices calculation for GL_TRIANGLES (I was multiplying by 4 instead of 3.) Caught by Jordan Justen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	fdb6c1887f	i965: Add a brw_load_register_reg64 helper. It appears that we can't do this in a single command (like we do for MI_LOAD_REGISTER_IMM) - the Skylake simulator gets rather grumpy about the command length if I try to combine them. No matter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 15:00:01 -07:00
Kenneth Graunke	4c71c8a74a	i965: Only enable ARB_query_buffer_object for newer kernels on Haswell. On Haswell, we need version 6 of the kernel command parser in order to write the math registers. Our implementation of ARB_query_buffer_object heavily relies on MI_MATH, so we should only advertise it when MI_MATH is available. We also need MI_LOAD_REGISTER_REG, which requires version 7 of the command parser. To make these checks easier, introduce a screen->has_mi_math_and_lrr flag that will be set when both commands are supported. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-09 14:59:58 -07:00
Dave Airlie	2d41eb313f	mesa/objectlabel: don't return info on genned but never bound textures. This fixes some cases in the CTS KHR debug tests where it uses glIsTexture to find an invalid ID and then call GetObjectLabel. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Dave Airlie	bbc6a27590	mesa: don't use genned but unnamed xfb objects. If we try to draw or query an XFB object that hasn't been bound, we shouldn't return any information. This fixes a couple of cases in: GL33-CTS.transform_feedback.api_errors_test The ObjectLabel test is inspired by another test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-10 06:06:09 +10:00
Samuel Pitoiset	eafe3905d9	nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_* We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-09 21:58:56 +02:00
Jordan Justen	2e2aa992ff	mesa/compute: Fix indirect dispatch buffer size check on 32-bit systems `2655265fcb`, but for compute. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-09 11:16:39 -07:00
Rob Clark	57763ee735	freedreno/ir3: fix fallout from new block iterators Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 13:52:29 -04:00
Nicolai Hähnle	fe102f7677	radeonsi: workaround for tessellation on SI We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tessellation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	d8f3e8e626	radeonsi: always allocate export memory for pixel shaders Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	ad1782cfb5	radeonsi: expose performance counters as 64 bit This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Rob Clark	f096096b77	nir/search: fix typo Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 12:46:24 -04:00
Tim Rowley	b65f7ec450	gallium: enable intel jitevents profiling LLVM when configured with "intel jitevents" enabled can inform VTune about dynamic code, so individual shaders are attributed profiling data and the resulting assembly can be examined. Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-05-09 11:25:02 -05:00
Bruce Cherniak	0062c5f09b	swr: Add missing break in query switch statement. Missed a switch break in query stat collection when refactoring queries. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-05-09 11:21:47 -05:00
Rob Clark	f33083a216	freedreno/ir3: allow for additional VS sysval inputs There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components. We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 11:51:59 -04:00
Emil Velikov	a0d9279e3b	docs: add news item and link release notes for 11.1.4/11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:28:20 +01:00
Emil Velikov	0c5752b672	docs: add sha256 checksums for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:08 +01:00
Emil Velikov	f746aa348e	docs: add release notes for 11.2.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:07 +01:00
Emil Velikov	596c881162	docs: add sha256 checksums for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:04 +01:00
Emil Velikov	f93d8a885c	docs: add release notes for 11.1.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-09 14:25:02 +01:00
Jose Fonseca	c521f2d737	scons: Improve Python module dependency discovery. Several NIR scripts were using `from ... import ...` syntax, which wasn't supported. Using Python standard library's modulefinder solves the problem with less effort and hacks. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-09 14:19:24 +01:00
Marek Olšák	172bfdaa9e	r300g: add support for PIPE_FORMAT_x8R8G8B8_* And set endian swap for packed formats the way it should be done in theory. This allows big endian to work again, but it can still be buggy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-09 13:11:40 +02:00
Daniel Stone	e54b2e902a	Revert "i965: Always use Y-tiled buffers on SKL+" This commit broke Weston, Mutter, and xf86-video-modesetting, on KMS. In order to use Y-tiled buffers, the kernel requires the tiling mode to be explicitly named through the I915_FORMAT_MOD_Y_TILED AddFB2 modifier; it disallows any attempt to infer the buffer's tiling mode. As the GBM API does not have a way to extract modifiers for a buffer, this commit broke all users of GBM on SKL+. Revert it for now, until we get a way to extract modifier information from GBM, and also let GBM users inform the implementation that it intends to use the modifiers. This reverts commit `6a0d036483`. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Hans de Goede <hdegoede@redhat.com>	2016-05-09 10:35:55 +01:00
Dave Airlie	920d78a32c	mesa/shader_query: add missing subroutines cases ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types. Fixes: GL43-CTS.program_interface_query.subroutines-vertex GL43-CTS.program_interface_query.subroutines-tess-control GL43-CTS.program_interface_query.subroutines-tess-eval GL43-CTS.program_interface_query.subroutines-geometry GL43-CTS.program_interface_query.subroutines-fragment GL43-CTS.program_interface_query.subroutines-compute Reviewed-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-09 06:30:52 +10:00
Kenneth Graunke	742bc53d04	spirv: Fix structure splitting with per-vertex interface arrays. We want to use interface_type, not vtn_var->type. They're normally equivalent, but for geometry/tessellation per-vertex interface arrays, we need to unwrap a level. Otherwise, we tried to iterate over a structure's members but instead used an array length. If the array length was longer than the number of fields in the structure, we'd crash. Fixes the CreatePipelineGeometryInputBlockPositive layer validation test. v2: Just use glsl_without_array() on the vtn_var type (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Kenneth Graunke	1896682d27	compiler: Add a C wrapper for glsl_type::without_array(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2016-05-07 15:44:41 -07:00
Nicolai Hähnle	b9e6e8e7d4	radeonsi: fix undefined behavior (memcpy arguments must be non-NULL) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	146927ce7b	radeonsi: fix some reported undefined left-shifts One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	60d2fc233b	gallium/radeon: clean left-shift undefined behavior Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	62b7958cd0	gallium: fix various undefined left shifts into sign bit Funnily enough, some of these were turned into a compile-time error by gcc with -fsanitize=undefined ("initializer is not a constant"). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	945c6887ab	compiler/glsl: do not downcast list sentinel This crashes gcc's undefined behaviour sanitizer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:58 -05:00
Nicolai Hähnle	bdad1393a0	mesa/main: fix another undefined left shift Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:45:04 -05:00
Nicolai Hähnle	3e1cf8bf3f	mesa/main: define _NEW_xxx flags as unsigned shifts Since 1 << 31 complains about undefined behaviour; the others are changed only for consistency. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:44:33 -05:00
Bas Nieuwenhuizen	6291f19f71	radeonsi: Compute correct LDS size for fragment shaders. Not sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-06 21:40:17 +02:00
Eric Anholt	a1f698881e	vc4: Add support for loading immediate values in QIR. This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).	2016-05-06 10:25:55 -07:00
Eric Anholt	890dc19eeb	vc4: Make vc4_qpu_validate() produce more verbose failures. Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard to find what's going wrong without getting a dump of the instruction that failed.	2016-05-06 10:25:55 -07:00
Eric Anholt	8e2d0843c0	vc4: Add a small QIR validate pass. This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.	2016-05-06 10:25:55 -07:00
Eric Anholt	daaa9d579d	vc4: Fix the src count on exp2/log2. Found by the upcoming QIR validate pass.	2016-05-06 10:25:55 -07:00
Eric Anholt	d36b28402f	vc4: Reuse QPU disasm's cond flags in QIR. In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.	2016-05-06 10:25:55 -07:00
Eric Anholt	419fee92ee	vc4: When emitting an instruction to an existing temp, mark it non-SSA. Prevents a bug in the later control-flow support series.	2016-05-06 10:25:55 -07:00
Eric Anholt	1387e722cd	vc4: Make sure that we don't overwrite the signal for PROG_END. We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.	2016-05-06 10:25:55 -07:00
Samuel Pitoiset	44de03b0f8	nvc0: unreference images when the context is destroyed Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-06 15:15:32 +02:00
Jose Fonseca	8ae78f7d28	nir: Remove spurious return from void function. Left over from `450c061362`. Trivial. Built locally with clang and gcc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95296	2016-05-06 12:03:34 +01:00
Marek Olšák	901f57dff5	radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4 Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-06 12:56:47 +02:00
Marek Olšák	4489d75a58	r600g: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-06 12:56:47 +02:00
Kenneth Graunke	bd326c229c	Revert "i965: Switch to scalar TCS by default." This reverts commit `b593737ed8`. Apparently it causes GPU hangs on some image load store tests. Let's turn it back off until we figure out why.	2016-05-05 18:03:23 -07:00
Leo Liu	fef0e993a1	st/omx/enc: fix incorrect reference picture order for B frames Stacking frames is for drivers that are capable of dual-instance encoding. Such a feature is not enabled for B frames currently. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-05 19:26:43 -04:00
Jason Ekstrand	7bc987abe0	i965/fs: Move handling of samples_identical into the switch statement This is where we handle texop_texture_samples so it makes things more consistent.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	3ba228f997	i965/fs: Simplify texture destination fixups There are a few different fixups that we have to do for texture destinations that re-arrange channels, fix hardware vs. API mismatches, or just shrink the result to fit in the NIR destination. These were all being done in a somewhat haphazard manner. This commit replaces all of the shuffling with a single LOAD_PAYLOAD operation at the end and makes it much easier to insert fixups between the texture instruction itself and the LOAD_PAYLOAD. Shader-db results on Haswell: total instructions in shared programs: 6227035 -> 6226669 (-0.01%) instructions in affected programs: 19119 -> 18753 (-1.91%) helped: 85 HURT: 0 total cycles in shared programs: 56491626 -> 56476126 (-0.03%) cycles in affected programs: 672420 -> 656920 (-2.31%) helped: 92 HURT: 42	2016-05-05 16:25:21 -07:00
Jason Ekstrand	7de0ae634e	i965/fs: stop including glsl/ir.h in brw_fs.h We are no longer using anything from GLSL IR in the FS backend.	2016-05-05 16:25:21 -07:00
Jason Ekstrand	a815499294	i965/fs: Merge nir_emit_texture and emit_texture The fs_visitor::emit_texture helper originated when we still had both NIR and IR visitors for the FS backend. Since the old visitor was removed, emit_texture serves no real purpose beyond arbitrarily splitting heavily-linked code across two functions.	2016-05-05 16:25:21 -07:00
Connor Abbott	4fab8dd5ea	nir: remove now-unused nir_foreach_block*_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:42 -07:00
Connor Abbott	7c36f9eb52	vc4: fixup for new nir_foreach_block() Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-05 16:19:41 -07:00
Connor Abbott	582815d9ea	ir3: fixup for new nir_foreach_block()	2016-05-05 16:19:41 -07:00
Jason Ekstrand	31fc4a2528	nir/lower_double_ops: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	450c061362	nir/lower_double_pack: fixup for new nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Jason Ekstrand	8c807cc2a6	nir/gather_info: fixup for new foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	331b9f73a2	nir/lower_two_sided_color: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	d40fbbc27e	nir/lower_tex: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Connor Abbott	8a7fe634d2	nir/lower_outputs_to_temporaries: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-05 16:19:41 -07:00
Kenneth Graunke	b593737ed8	i965: Switch to scalar TCS by default. Normally, we expect SIMD8 shaders to be more instructions than SIMD4x2 shaders, as it takes four instructions to operate on a vec4, rather than a single instruction. However, the benefit is that it can process 8 objects per shader thread instead of 2. Surprisingly, the shader-db statistics show an improvement in both instruction and cycle counts: Synmark: -31.25% instructions, -29.27% cycles, 0 hurt. Tessmark: -36.92% instructions, -37.81% cycles, 0 hurt. Unigine Heaven: -3.42% instructions, -17.95% cycles, 0 hurt. Shadow of Mordor: +13.24% instructions (26 with fewer instructions, 45 with more), -5.23% cycles (44 with fewer cycles, 27 with more cycles). Presumably, this is because the SIMD8 URB messages are a much more natural fit than the SIMD4x2 URB messages - there's a ton less header setup. I benchmarked Shadow of Mordor and Unigine Heaven on my Skylake GT3e, and the performance seems to be the same or increase ever so slightly (< 1 FPS difference). So I believe it's strictly superior. There's also a lot more optimization potential we can do in scalar mode. This will also help us finish fp64 support, as scalar support is going to land much sooner than vec4-mode support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	bc0062c54a	nir: Optimize out stores of undefs. There are a couple of cycle count changes in shader-db, but it's basically a wash. However, with the Broadwell scalar TCS backend enabled, many Shadow of Mordor shaders benefit from this patch. Because we don't batch up output writes for TCS, vec4 outputs might not have all components defined. Many output writes have a value of undef, which is useless. With scalar TCS, stats for tessellation shaders on Broadwell: total instructions in shared programs: 1283000 -> 1280444 (-0.20%) instructions in affected programs: 34302 -> 31746 (-7.45%) helped: 71 HURT: 0 total cycles in shared programs: 10798768 -> 10780682 (-0.17%) cycles in affected programs: 158004 -> 139918 (-11.45%) helped: 71 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	c7a8b32700	nir: Replace vecN(undef, undef, ...) with a single undef. shader-db statistics on Broadwell: total instructions in shared programs: 8963409 -> 8962455 (-0.01%) instructions in affected programs: 60858 -> 59904 (-1.57%) helped: 318 HURT: 0 total cycles in shared programs: 71408022 -> 71406276 (-0.00%) cycles in affected programs: 398416 -> 396670 (-0.44%) helped: 199 HURT: 51 GAINED: 1 The only shaders affected were in Dota 2 Reborn. It also sets up for the next optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	49ea7454a1	nir: Rename opt_undef_alu to opt_undef_csel; update comments. This better reflects what it does. I plan to add other ALU optimizations as well, so the old name would be confusing. In preparation for that, also move the file comments about csels above the opt_undef_csel function, and delete the ones about there not being other optimizations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Kenneth Graunke	a808ba5965	i965: Rework passthrough TCS checks. According to Timothy, using program_string_id == 0 to identify the passthrough TCS is going to be problematic for his shader cache work. So, change it to strcmp() the name at visitor creation time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-05 14:24:00 -07:00
Tim Rowley	ff8c0c9a35	swr: [rasterizer core] Faster modulo operator in ProcessVerts Avoid % operator, since we know that curVertex is always incrementing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:11 -05:00
Tim Rowley	2be7c3e780	swr: [rasterizer] Small warning cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:03 -05:00
Tim Rowley	b39c530f88	swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macros Fix static code analysis errors found by coverity on Linux Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:56 -05:00
Tim Rowley	db084f48eb	swr: [rasterizer] Miscellaneous backend changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:48 -05:00
Tim Rowley	3951a2109e	swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT format Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:42 -05:00
Tim Rowley	909aee07f8	swr: [rasterizer jitter] Fix printing bugs for tracing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:29 -05:00
Tim Rowley	bc084e6b3d	swr: [rasterizer memory] Add missing store tiles function Storing color hot tile to 8bit w-major stencil format. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:23 -05:00
Tim Rowley	5332c9d931	swr: [rasterizer jitter] Add asserts for supported formats in fetch shader Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:18 -05:00
Tim Rowley	6e89227054	swr: [rasterizer core] Fix thread allocation Fix windows in 32-bit mode when hyperthreading is disabled on Xeons. Some support for asymmetric processor topologies. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:11 -05:00
Tim Rowley	c2f5d2daa8	swr: [rasterizer core] Fix threadviz support in buckets Need to do lazy eval of the threadviz knob since order of globals is undefined. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:04 -05:00
Tim Rowley	1eb211c4a4	swr: [rasterizer] Whitespace cleanup and misc changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:48:55 -05:00
Nicolai Hähnle	d97e333ea4	radeonsi: mark descriptor loads as using dynamically uniform indices This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-05 12:21:40 -05:00
Matt Turner	f01d92f473	i965/fs: Don't follow pow with an instruction with two dest regs. Beginning with commit `7b208a73`, Unigine Valley began hanging the GPU on Gen >= 8 platforms. Evidently that commit allowed the scheduler to make different choices that somehow finally ran afoul of a hardware bug in which POW and FDIV instructions may not be followed by an instruction with two destination registers (including compressed instructions). I presume the conditions are more complex than that, but the internal hardware bug report (BDWGFX bug_de 1696294) does not contain much more information. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94924 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1] Tested-by: Mark Janes <mark.a.janes@intel.com> [v1] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-05-05 10:18:28 -07:00
Bruce Cherniak	9d86a5eea7	swr: Remove stall waiting for core query counters. When gathering query results, swr_gather_stats was unnecessarily stalling the entire pipeline. Results are now collected asynchronously, with a fence marking completion. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2016-05-05 10:50:09 -05:00
Dave Airlie	76a36ac3ea	mesa/ubo: add missing compute cases for ubo/atomic buffers This fixes: GL43-CTS.compute_shader.resource-ubo Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-05 20:29:02 +10:00
Dave Airlie	2dd3fc3cac	mesa/compute: drop pointless casts. We already are a GLintptr, casting won't help. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-05 20:28:41 +10:00
Thomas Hindoe Paaboel Andersen	76a423efe0	mesa: remove null check before free Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:50:38 +02:00
Thomas Hindoe Paaboel Andersen	3a6763f0a0	freedreno: remove null check before free Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:34:01 +02:00
Thomas Hindoe Paaboel Andersen	8698194313	nir: fix assert for wildcard pairs The assert was null checking dest_arr_parent twice. The intention seems to be to check both dest_ and src_. Added in `d3636da9` Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:33:02 +02:00
Brian Paul	be5010c4b8	glapi: fix parameter type for GetSamplerParameterIuivEXT() in es_EXT.xml The function returns GLuint, not GLfloat values. v2: also fix the OES function Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-04 14:49:39 -06:00
Brian Paul	54d203a319	mesa: include texture format in glGenerateMipmap error message Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-04 14:49:39 -06:00
Brian Paul	a62f031bc3	main: use casts to silence some _mesa_debug() format warnings Silences warnings with 32-bit Linux gcc builds and MinGW which doesn't recognize the ‘t’ conversion character. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-05-04 14:49:39 -06:00
Jordan Justen	51300a0387	docs: Mark GL_ARB_query_buffer_object as done for i965/hsw+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 11:23:17 -07:00
Jordan Justen	f00c399bae	i965: Implement ARB_query_buffer_object for HSW+ v2: * Declare loop index variable at loop site (idr) * Make arrays of MI_MATH instructions 'static const' (idr) * Remove commented debug code (idr) * Updated comment in set_query_availability (Ken) * Replace switch with if/else in hsw_result_to_gpr0 (Ken) * Only divide GL_FRAGMENT_SHADER_INVOCATIONS_ARB by 4 on hsw and gen8 (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-04 11:23:17 -07:00
Jordan Justen	357ff91359	i965/gen6+: Add load register immediate helper functions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	959e1e9e66	i965/hsw+: Add support for copying a register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	aad14a22cb	i965/gen6+: Add support for storing immediate data into a buffer Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	ac0bbf9ef3	i965: Add MI_MATH reg defs for HSW+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	9f581f8f24	i965: Add brw_store_register_mem32 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:17 -07:00
Jordan Justen	c54e5c2fb2	i965: Use offset instead of index in brw_store_register_mem64 This matches the byte based offset of brw_load_register_mem. The function is also moved into intel_batchbuffer.c like brw_load_register_mem. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 11:23:10 -07:00
Jan Vesely	77959ce07b	r600,compute: create vtx buffer for text + rodata Reserve buffer id 2 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-05-04 13:09:18 -04:00
Rob Clark	2e117a7649	freedreno: allow ctx->draw_vbo to fail Pretty much only happens if shader variant compile fails. But in this case, if we haven't emitted cmdstream, we don't want to set needs_flush. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	291ac872a4	freedreno: move shader-stage dirty bits to global dirty flag This was always a bit overly complicated, and had some issues (like ctx->prog.dirty not getting reset at the end of the batch). It also required some special hacks to avoid resetting dirty state on binning pass. So just move it all into ctx->dirty (leaving some free bits for future shader stages), and make FD_DIRTY_PROG just be the union of all FD_SHADER_DIRTY_*. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	a48cccacf3	freedreno/a4xx: fix bogus offset for f32x24s8 stencil restore fixes: $piglit/bin/fbo-clear-formats GL_ARB_depth_buffer_float Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	e7c64041e9	freedreno: add some debug_asserts() to catch insane offsets Ofc won't catch all faults, but at least helpful for catching offsets which are completely bogus. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	1f2bc64f31	freedreno/a4xx: deal with VS which do not write position Fixes $piglit/bin/glsl-1.40-tf-no-position a3xx may need similar? Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	a6ad30202c	freedreno/ir3: remove a couple redundant is_flow()s Now that the opc's encode the instruction category (making them unique) we no longer need to check the category in addition to the opc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	f0a1f3de27	freedreno/ir3: cp small negative integers too Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	1f04d4bf59	freedreno/ir3: fix # of registers The instruction encoding allows for more registers, but at least on a3xx/a4xx they don't actually exist. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	173871dfb9	freedreno/ir3: lower immeds to const Helps reduce register pressure and instruction counts for immediates that would otherwise require a mov into gpr. total instructions in shared programs: 4455332 -> 4369297 (-1.93%) total dwords in shared programs: 8807872 -> 8614432 (-2.20%) total full registers used in shared programs: 263062 -> 250846 (-4.64%) total half registers used in shader programs: 9845 -> 9845 (0.00%) total const registers used in shared programs: 1029735 -> 1466993 (42.46%) half full const instr dwords helped 0 10415 0 17861 5912 hurt 0 1157 21458 947 33 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	b15c7fc268	freedreno/ir3: add ir3_cp_ctx Needed in next commit.. just split out to reduce noise. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:55 -04:00
Rob Clark	b9985e5bde	add REVIEWERS and get_reviewer.pl script Copied from linux kernel (where it is called MAINTAINERS and get_maintainer.pl), with minimal changes to script (to recognize mesa src tree rather than linux kernel src tree, and to avoid accidentally CC'ing Linus Torvalds on mesa patches), and slimmed down MAINTAINER file syntax to recognize that we don't really have subsystem "maintainers" in the same sense as the linux kernel (i.e. no different mailing lists and git trees per subsystem). The main point is to automate slapping on the correct CC's for patches via git's --cc-cmd feature, more than anything else. I didn't attempt to fully populate the REVIEWERS file, by a long shot. This is an opt-in system and anyone else can add their own entries. To utilize: git send-email --cc-cmd ./scripts/get_reviewer.pl ... or to configure it to be the default: git config sendemail.cccmd ./scripts/get_reviewer.pl Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-04 11:25:46 -04:00
Ilia Mirkin	38fcf7cbad	nouveau/video: properly detect the decoder class for availability checks The kernel is now more strict with the class ids it exposes, so we need to check the G98 and MCP89 classes as well as the GT215 class. This effectively caused us to decide there were no decoding capabilities on newer kernel for VP3 chips. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-05-04 10:45:07 -04:00
Kenneth Graunke	0332963d19	i965: Delete stale perf_debug(). MOCS for 3DSTATE_SO_BUFFER has existed for ages.	2016-05-04 02:29:03 -07:00
Kenneth Graunke	3a886721ed	i965: Silence unused variable warning I added this when deleting some unnecessary code in a rebase.	2016-05-04 00:46:31 -07:00
Juan A. Suarez Romero	97989059b9	mesa/main: handle double uniform matrices properly When computing the offset in the uniform storage table, take into account the size multiplier so double precision matrices are handled correctly. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-04 08:08:12 +02:00
Samuel Iglesias Gonsálvez	2ab2d2e588	nir: Separate 32 and 64-bit fmod lowering Split 32-bit and 64-bit fmod lowering as the drivers might need to lower them separately inside NIR depending on the HW support. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 08:07:49 +02:00
Samuel Iglesias Gonsálvez	b902377a56	nir/lower_double_ops: lower mod() There are rounding errors with the division in i965 that affect the mod(x,y) result when x = N * y. Instead of returning '0' it was returning 'y'. This lowering pass fixes those cases. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-04 08:07:49 +02:00
Matt Turner	9f81434c5f	i965: Define GEN_GE/GEN_LE macros in terms of GEN_LT. GEN_LT has a straightforward implementation on which we can build the GEN_GE and GEN_LE macros. Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:01 -07:00
Matt Turner	affaae197f	i965: Add disassembler support for remaining opcodes. For opcodes that changed meaning on different generations, we store a pointer to a secondary table and the table's size in a tagged union in place of the mnemonic and number of sources. Acked-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:00 -07:00
Matt Turner	b89b0a03f2	i965: Make opcode_descs and gen_from_devinfo() static. The previous commit replaced direct uses of opcode_descs with calls to the wrapper function, which should be the only method of accessing opcode_descs's data. As a result gen_from_devinfo() can also be made static. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:34:00 -07:00
Matt Turner	0ff4912cf4	i965: Actually check whether the opcode is supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:59 -07:00
Matt Turner	667408b889	i965: Merge inst_info and opcode_desc tables. I merged opcode_desc into inst_info (instead of the other way around) because inst_info was sorted by opcode number. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:42 -07:00
Matt Turner	d01596613b	i965: Move inst_info from brw_eu_validate.c to brw_eu.c. Drop the uses of 'enum gen' to a plain int, so that we don't have to expose the bitfield definitions and GEN_GE/GEN_LE macros to other users of brw_eu.h. As a result, s/.gen/.gens/ to avoid confusion with devinfo->gen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 22:33:42 -07:00
Francisco Jerez	1530e27534	i965/disasm: Wrap opcode_desc look-up in a function. The function takes a device info struct as argument in addition to the opcode number in order to disambiguate between multiple opcode_desc entries for different instructions with the same opcode number. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] [v2] mattst88: Put brw_opcode_desc() in brw_eu.c instead of moving it there in a later patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2] [v3] mattst88: Return NULL if opcode >= ARRAY_SIZE(opcode_descs) Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-03 22:32:40 -07:00
Francisco Jerez	1cc7573162	i965: Pass devinfo pointer to is_3src() helpers. This is not strictly required for the following changes because none of the three-source opcodes we support at the moment in the compiler back-end has been removed or redefined, but that's likely to change in the future. In any case having hardware instructions specified as a pair of hardware device and opcode number explicitly in all cases will simplify the opcode look-up interface introduced in a subsequent commit, since the opcode number alone is in general ambiguous. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 18:06:21 -07:00
Francisco Jerez	c55dc77ab1	i965: Pass devinfo pointer to brw_instruction_name(). A future series will implement support for an instruction that happens to have the same opcode number as another instruction we support already on a disjoint set of hardware generations. In order to disambiguate which instruction it is brw_instruction_name() will need some way to find out which device we are generating code for. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-03 18:06:21 -07:00
Kenneth Graunke	7d9143ad88	i965: Write a scalar TCS backend that runs in SINGLE_PATCH mode. Unlike most shader stages, the Hull Shader hardware makes us explicitly tell it how many threads to dispatch and manually configure the channel mask. One perk of this is that we have a lot of flexibility - we can run it in either SIMD4x2 or SIMD8 mode. Treating it as SIMD8 means that shaders with 8 or fewer output vertices (which is overwhelmingly the common case) can be handled by a single thread. This has several intriguing properties: - Accessing input arrays with gl_InvocationID as the index is a simple SIMD8 URB read with g1 as the header. No indirect addressing required. - Barriers are no-ops. - We could potentially do output shadowing to combine writes, as the concurrency concerns are gone. (We don't do this yet, though.) v2: Drop first_non_payload_grf change, as it was always adding 0 (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-03 16:28:00 -07:00
Kenneth Graunke	75881bed9e	i965: Rework the TCS passthrough shader to use NIR. I'm about to implement a scalar TCS backend, and I'd rather not duplicate all of this code there. One change is that we now write the tessellation levels from all TCS threads, rather than just the first. This is pretty harmless, and was easier. The IF/ENDIF needed for that are gone; otherwise the generated code is basically identical. I chose to emit load/store intrinsics directly because it was easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-05-03 16:27:52 -07:00
Brian Paul	ef5a31fc06	gallium/util: change assertion to conditional in util_bitmask_destroy() If we fail to create a context in the VMware driver we call this function unconditionally to free a bunch of bit vectors. Instead of asserting on a null pointer, just no-op. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-03 15:40:49 -06:00
Brian Paul	68116dcd5a	cso: null-out previously bound sampler states If, for example, we previously had 2 sampler states bound and now we are binding one, we'd leave the second sampler state unchanged. This change nulls-out the second sampler state in this situation. We're already doing the same thing for sampler views. This silences an occasional warning issued by the VMware driver when the number of sampler views and sampler states disagreed. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-03 15:40:49 -06:00
Brian Paul	05abaa65c7	svga: try to flag surfaces for sampling, in addition to rendering This silences some warnings when we try to sample from surfaces that were created for drawing, such as when blitting from one of the framebuffer surfaces. We were already doing the opposite situation (adding a bind flag for rendering to surfaces declared as texture sources). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	abc6432d54	svga: fix copying non-zero layers of 1D array textures Like cube maps, we need to convert the z information to a layer index. Also rename the _face vars to _face_layer to make things a little more understandable. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	b94f73c150	svga: clean up svga_pipe_blit.c Remove dead code. Fix formatting. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	8842be1132	rbug: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	7f641916bf	freedreno: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-05-03 15:40:48 -06:00
Brian Paul	b91975714d	trace: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	e193c5dd59	ilo: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Brian Paul	951bf8b4a6	i915g: s/Elements/ARRAY_SIZE/ Signed-off-by: Brian Paul <brianp@vmware.com>	2016-05-03 15:40:48 -06:00
Samuel Pitoiset	5658ddc7fe	nvc0: compute a percentage for metric-achieved_occupancy metric-issue_slot_utilization and metric-branch_efficiency are already computed as percentages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	10ec27760a	nvc0: display some performance metrics with a percentage This makes more sense for them. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	64937615a0	nvc0: store the driver query type for performance metrics This will allow using percentages for some metrics, because the Gallium HUD cannot display floating point numbers and prints 0 instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-03 23:18:50 +02:00
Samuel Pitoiset	a9bc3211f5	nvc0: fix exposing of metric-issue_slots for SM21/SM30 This is most likely a copy-paste error from when I reworked this area a few weeks ago. For SM20, metric-issue_slots is equal to inst_issued because there is only one pipeline, so the metric is not exposed there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Karol Herbst <nouveau@karolherbst.de>	2016-05-03 23:18:50 +02:00
Mark Janes	0af8a7d50c	mesa/objectlabel: handle NULL src string This prevents a crash when a NULL src is passed with a non-NULL length. fixes: dEQP-GLES31.functional.debug.object_labels.query_length_only Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95252 Signed-off-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-05-03 14:07:31 -07:00
Dave Airlie	265fe9dce8	glsl: subroutine types cannot be used in constructors. This fixes two of the cases in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-04 06:44:45 +10:00
Dave Airlie	3110a0aa23	glsl: resource is a reserved keyword in GLSL 4.20 as well resource just appears in GLSL 4.20 without any fanfare. Fixes GL43-CTX.CommonBugs.CommonBug_ReservedNames Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-04 06:44:45 +10:00
Jan Vesely	ebbe31d57c	gallium,utils: Fix trivial sign compare warnings Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-03 12:00:09 -04:00
Knut Andre Tidemann	c68a9cdaac	anv: fix hang during generation of dev_icd.json. Fixes: `b370ec7c76` ("anv: tweak the %.json rule") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-03 11:42:47 +01:00
Anuj Phogat	883f3662db	swrast: Add texfetch_funcs entries for astc 3d formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	63432eb370	mesa: Enable translation between astc 3d gl formats and mesa formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	54cac7ad96	mesa: Handle astc 3d formats in _mesa_get_compressed_formats() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	dcfea1d7eb	mesa: Handle astc 3d formats in _mesa_base_tex_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	cf85ef1618	mesa: Account for astc 3d formats in _mesa_is_astc_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	38cd8145a8	mesa: Add a helper function is_astc_3d_format() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	72dfe0242d	mesa: Add the missing defines for GL_OES_texture_compression_astc Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	57451e0fc1	mesa: Align the values of #define's in glheader.h Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	0306110fa9	mesa: Add OES_texture_compression_astc to extension table and gl_extensions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	059f36c671	mesa: Add entries for astc 3d formats initializing struct gl_format_info Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	705216dbed	mesa: Add mesa formats for astc 3d formats Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	24bb6ee8b6	glapi: Update dispatch XML files for OES_texture_compression_astc.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	63a7a9d115	mesa: Account for block depth in _mesa_format_image_size() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	87bf66daa9	mesa: Handle 3d block sizes in _mesa_compute_compressed_pixelstore Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	84a44844f2	mesa: Handle 3d block sizes in teximage error checks Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	ec60b3da69	mesa: Handle 3d block sizes in getteximage error checks Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:18 -07:00
Anuj Phogat	5713461ae7	mesa: Add an assert for BlockDepth in _mesa_get_format_block_size() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Anuj Phogat	9163c37349	mesa: Add a helper function to query 3D block sizes Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Anuj Phogat	6abb1b4984	mesa: Add block depth field in struct gl_format_info This will be later required for 3D ASTC formats. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-03 03:43:17 -07:00
Dave Airlie	c4a0cd4662	mesa/copyimage: make sure number of samples match. This fixes GL43-CTS.copy_image.samples_missmatch which otherwise asserts in the radeonsi driver. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:13:29 +10:00
Dave Airlie	5989a2937f	mesa/objectlabel: don't do memcpy if bufSize is 0 (v2) This prevents GL43-CTS.khr_debug.labels_non_debug from memcpying all over the stack and crashing. v2: actually fix the test. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:12:59 +10:00
Dave Airlie	30823f997b	mesa/textureview: move error checks up higher GL43-CTS.texture_view.errors checks for GL_INVALID_VALUE here but we catch these problems in the dimensionsOK check and return the wrong error value. This fixes: GL43-CTS.texture_view.errors. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-03 20:12:52 +10:00
Marek Olšák	5541e11b9a	gallium/radeon: remove stencil_tile_split from metadata this is a leftover from the days when depth-stencil buffers were allocated by the DDX Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	20a77397fa	gallium/radeon: remove tile_mode_array_valid flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	c8aac4fc0d	winsys/amdgpu: pass PIPE_CONFIG to addrlib on texture import This hasn't been needed, but I think we should set it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	dc970c4f4e	winsys/amdgpu: read NUM_BANKS from buffer metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	02f90cef7d	radeonsi: remove unused tile mode getters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	b9e3e87069	radeonsi: just read tile mode arrays in SDMA setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	0c2cba1ec6	radeonsi: just read tile mode arrays in SI DMA setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	c3ca54aee9	radeonsi: just read tile mode arrays in DB setup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	ef45825708	gallium/radeon: add radeon_surf::macro_tile_index for indexing cik_macrotile_mode_array Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	ed4fd542de	winsys/radeon: drop support for kernels lacking tile mode array queries This will allow us to simplify a lot of code around tiling. Kernel 3.10 is required for SI support. Kernel 3.13 is required for CIK support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	3d956b4bc0	st/mesa: fix blit-based GetTexImage for non-finalized textures This fixes getteximage-depth piglit failures on radeonsi. Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	77af6bcc26	winsys/radeon: count buffer size only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	3e3c43418e	winsys/amdgpu: count buffer size only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	f98ba4123c	winsys/amdgpu: loosen up requirements for how much memory IBs can use ported from winsys/radeon. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	9ec00c23c2	radeonsi: when parsing dmesg, skip empty lines Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Marek Olšák	9983efca76	radeonsi: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-02 22:49:25 +02:00
Samuel Pitoiset	819836d240	nv50,nvc0: re-bind old compute state after reading MP perf counters This might be useful to avoid breaking the current compute state when monitoring MP perf counters because we use a compute kernel to read out those counters. This has been initially suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-05-02 22:30:48 +02:00
Rob Clark	dcf8c4425a	nir: make lower_clamp_color pass work after lower i/o Kinda important to work with tgsi_to_nir, which generates nir which already has i/o lowered. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-02 14:25:38 -04:00
Eric Anholt	226bd92945	vc4: Use NIR lowering for sRGB decode. This should get us the same decode code generated, but with a lot less custom code in the driver.	2016-05-02 11:06:29 -07:00
Eric Anholt	4b326341f3	vc4: Just use NIR lowering for texture projection. This means doing Newton-Raphson on the RCP, but it's probably actually a good thing to be accurate on.	2016-05-02 11:06:29 -07:00
Eric Anholt	2f98bc100d	vc4: Scalarize phi nodes as well. This makes fewer programs with loops assertion fail, replacing them with the rendering failure warning.	2016-05-02 11:06:29 -07:00
Eric Anholt	4a2ad8500d	vc4: Add whitespace after each program stage dump. In particular it's been hard to find the point where we switch from dumping pre-optimization QIR and post-optimization QIR.	2016-05-02 11:06:29 -07:00
Eric Anholt	84322b2f31	vc4: Remove the CSE pass. It's not doing anything according to shader-db now that we're using NIR. It would have had to be reworked significantly anyway, to handle control flow.	2016-05-02 11:06:29 -07:00
Eric Anholt	b145b731ab	vc4: Emit only one FRAG_Z or FRAG_W QIR opcode. We were generating piles of FRAG_W for interpolation, only to CSE them away immediately. Since this is the only thing that CSE is doing for us any more, just avoid making the CSE work necessary.	2016-05-02 11:06:29 -07:00
Eric Anholt	e138716d8d	vc4: Use the NIR cubemap normalization instead of our own. This is one of two uses of the current QIR CSE pass according to shader-db. The NIR pass means that we'll end up doing Newton-Raphson on our RCP, which we weren't doing before, but that's probably actually a good thing.	2016-05-02 11:06:29 -07:00
Eric Anholt	3bee7581e6	vc4: Drop the support for DCE of texture instructions. Now that we're using NIR for our optimization, there's no need for this tricky code.	2016-05-02 11:06:29 -07:00
Nicolai Hähnle	155ce49603	radeonsi: fix PIPE_FORMAT_R11G11B10_FLOAT handling That format has first_non_void < 0. This fixes a regression in piglit arb_shader_image_load_store-semantics that was introduced by commit `76b8c5cc60`, while hopefully still shutting Coverity up (and failing in a more obvious way if a similar error should re-appear). Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-02 11:38:23 -05:00
Nicolai Hähnle	169ace5636	radeonsi: correct NULL-pointer check in si_upload_const_buffer Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-02 11:37:55 -05:00
Dave Airlie	cf6dadb00b	softpipe: bump 3D texture limit to 2048 The GL4.1 spec bumps this to 2048, so we should do so. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-02 07:29:02 +10:00
Dave Airlie	277170eeea	softpipe: allow r32 xchg on shader images. This is part of OES_shader_image_atomic.txt. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-05-02 07:28:58 +10:00
Ilia Mirkin	3950aa47df	softpipe: avoid leaking local_mem on machines alloc failure Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-05-01 11:19:08 -04:00
Ilia Mirkin	ad545d179b	vbo: avoid leaking prim on vbo bind failure Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-05-01 11:19:08 -04:00
Edward O'Callaghan	23cf24e227	mapi/glapi: Fix dup word typo in glapi_getproc.c Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-05-01 16:07:29 +02:00
Emil Velikov	44f921091a	isl: automake: don't explicitly EXTRA_DIST the tests folder The file(s) within are already picked thanks to the build rule of the respective test. No need to have the folder in EXTRA_DIST. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 14:17:30 +01:00
Timothy Arceri	f982e2434b	mesa: add LOCATION_COMPONENT support to GetProgramResourceiv From Section 7.3.1.1 (Naming Active Resources) of the OpenGL 4.5 spec: "For the property LOCATION_COMPONENT, a single integer indicating the first component of the location assigned to an active input or output variable is written to params. For input and output variables with a component specified by a layout qualifier, the specified component is written. For all other input and output variables, the value zero is written." Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-01 23:13:36 +10:00
Timothy Arceri	b1c872a81e	glsl: add component to has_layout() helper I don't think this will do much, as it's a compiler error to use component without location, which is already in the table, but it's good to be consistent. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-05-01 23:13:28 +10:00
Timothy Arceri	589053dac7	glsl: validate linking of intrastage component qualifiers Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-05-01 23:13:22 +10:00
Timothy Arceri	0317dfcd9b	glsl: update explicit location matching to support component qualifier This is needed so we don't optimise away the varying when more than one shares the same location. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:15 +10:00
Timothy Arceri	0d88b15f07	glsl: cross validate varyings with a component qualifier This change checks for component overlap, including handling overlap of locations and components by doubles. Previously there was no validation for assigning explicit locations to a location used by the second half of a double. V3: simplify handling of doubles and fix double component aliasing detection V2: fix component matching for matrices Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:10 +10:00
Timothy Arceri	94438578d2	glsl: validate and store component layout qualifier in GLSL IR We make use of the existing IR field location_frac used for tracking component locations. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:13:05 +10:00
Timothy Arceri	2d9936a686	glsl: allow component qualifier on varying inputs Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-05-01 23:13:00 +10:00
Timothy Arceri	daa8df590b	glsl: parse component layout qualifier Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-01 23:12:52 +10:00
WuZhen	ea4c1afd05	android: enable dlopen() on all architectures Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Jose Fonseca	5649d6ab06	winsys/sw/xlib: use correct free function for xlib_dt->data Analogous to previous commit. Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
WuZhen	4f21f3f2e8	winsys/sw/dri: use correct free function for dri_sw_dt->data align_malloc() is used to allocate dri_sw_dt->data, thus we should not be using FREE() but align_free(). Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> [Emil Velikov: tweak commit summary/shortlog] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-01 12:31:29 +01:00
WuZhen	798f7a8596	tgsi: initialize stack allocated struct Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Emil Velikov	fb653641ea	egl: android: do not feed invalid fourcc/pitch into the dri module Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Rob Herring	34ddef39ce	egl: android: add dma-buf fd support Add support for creating images from Android native buffers with dma-buf fd. As dma-buf support also requires DRI image loader extension, add that as well. This is based on several patches originally written by Varad Gautam. I've collapsed them into logical changes and done a bit of reformatting. Using dma-bufs vs. GEM handles is now a runtime decision similar to the wayland EGL instead of being a compile-time selection. The dma-buf support is also re-written to use the common dri2_create_image_dma_buf function in egl_dri2.c. Cc: Varad Gautam <varadgautam@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:29 +01:00
Rob Herring	81a6fff4c5	egl: android: factor out back buffer handling code In preparation to use the same code for dma-bufs, factor out the code to a separate function. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	dfaccf25f5	egl: android: factor out format conversion code to a function Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	d45884ef05	egl: android: disable __DRI_DRI2_LOADER support on render nodes Use of __DRI_DRI2_LOADER extension is only supported for card nodes. In order to support dmabufs, Android will be moving to using render nodes and we need to disable the DRI2 loader extension. This is based on the Wayland EGL code. Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Rob Herring	dbbf7a8e61	Android: fix build ordering of subdirectories Different versions of make behave differently in whether $(wildcard) sorts the results or not. The Android build now explicitly sorts all-named-subdir-makefiles which breaks the build because src/gallium must be included after src/mesa/drivers/dri. The Android build system doesn't support doing "include $(call all-named-subdir-makefiles,...)" twice, so rework things by generating the included makefile list and including them in 2 steps. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-05-01 12:31:28 +01:00
Jamey Sharp	595d56cc86	glShaderSource must not change compile status. OpenGL 4.5 Core Profile section 7.1, in the documentation for CompileShader, says: "Changing the source code of a shader object with ShaderSource does not change its compile status or the compiled shader code." According to Karol Herbst, the game "Divinity: Original Sin - Enhanced Edition" depends on this odd quirk of the spec. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551 Signed-off-by: Jamey Sharp <jamey@minilop.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-01 18:46:24 +10:00
Emil Velikov	9fa2e57a73	gallium/radeon: nuke the final pre LLVM 3.6 codepath Missed with commit `100796c15c` "gallium/radeon: drop support for LLVM 3.5" v2: s/LLVN/LLVM/ in shortlog (Nicolai) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-01 08:57:32 +01:00
Emil Velikov	7336df06ed	anv: include the files in the tarball Namely the python script, the ICD header and private headers. We could get the system version of the ICD ones, although there is no .pc file to easily locate and/or manage them. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:46 +01:00
Emil Velikov	9e09507516	i965: don't forget to ship brw_nir_trig_workarounds.py Otherwise we won't be able to regenerate the source file(s). Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:46 +01:00
Emil Velikov	1f04caa09c	isl: include all the files in the tarball Add the missing header(s), generation scripts, README ... Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:34 +01:00
Emil Velikov	cee69ccb92	spirv: automake: add missing headers to the tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:06 +01:00
Emil Velikov	dc38e6b169	automake: wire up the intel vulkan driver to make distcheck Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:06 +01:00
Emil Velikov	dfbf1289a4	anv: update .gitignore Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	fcdcb829d8	anv: automake: remove no longer needed include Thanks to last commit we can nuke it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	3285461ceb	anv: automake: tweak anv_entrypoint.[ch] rule Rather than using cat + cpp feed the file(s) directly into the latter. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	bc7802098e	anv: tweak libvulkan_intel.so link libraries i.e. do not use -lfoo directly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	9f235adf99	anv: cosmetic makefile changes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	446234033d	anv: place the builddir includes before the srcdir ones Otherwise we risk picking the possibly outdated file in the source dir over the fresh one in the builddir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	6cb814727d	automake: tweak SUBDIR reorder and comment it It should help people understand the platforms and how they interact with each other (at least from a build POV). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	4fcf0ba113	configure.ac: remove unused HAVE_EGL_PLATFORM_NULL conditional Afaict the last user was based on st/egl. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	9f3588eb37	automake: drop "EGL_" from HAVE_EGL_PLATFORM_WAYLAND Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	5459db91e3	automake: drop "EGL_" from HAVE_EGL_PLATFORM_X11 The variable covers more than just EGL, let's try to untangle the confusion it brings. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:05 +01:00
Emil Velikov	a56009d089	anv: get rid of VULKAN_ENTRYPOINT_CPPFLAGS variable Add the missing include to AM_CPPFLAGS and use it throughout the makefile. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:05 +01:00
Emil Velikov	6dc169e18f	anv: factor out the X11/XCB build Similar to earlier commit - move all the common bits into a single place, thus improving readability and allowing us to see what's missing. Also don't forget to add the missing bits. This commit should allow us to build wayland-only vulkan ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	cbc4837b83	anv: kill off custom define HAVE_WAYLAND_PLATFORM Vulkan API already has an equivalent, so simplify things and just use it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	9bc99f5668	anv: refactor wayland build handling Rather than having things split out in multiple places, consolidate it and add all the missing bits. Also ensure that we use the already built static library libwayland-drm.la. v2 Add missing '\' in the CFLAGS. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-05-01 08:38:04 +01:00
Emil Velikov	3a2d09dd65	automake: include vulkan subdir after wayland-drm We'll reuse the existing wayland-drm static library with next commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:04 +01:00
Emil Velikov	fe918556a2	anv: use a common variable to manage the library dependencies Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	82d0b59f02	anv: use the GENERATED_FILES variable ... rather than duplicating files throughout the sources lists. Splitting things this way has the side effect of making things clearer and easing a potential android build, which automatically adds BUILT_SOURCES to the binary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	3ee7d8b0eb	anv: fold the tests' makefile Recent commit removed the winsys defines from anv_private.h thus breaking the tests. To fix that and avoid it in the future, merge the tests makefile in the libvulkan one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	f3cb0dcae1	anv: build the core vulkan only once Introduce a static library libvulkan_common.la that is used by libvulkan_intel.la and libvulkan_test.la. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	21800d77ff	anv: kill off custom CFLAGS AM_CFLAGS already does all that we need. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	623cb3a598	anv: add missing link against the math library Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	e98cf60446	anv: split sources lists to Makefile.sources Will allow others to reuse the lists (scons/android anyone ?) and makes the file a lot shorter and easier to read. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	0d3e7b17c9	anv: remove custom rule to install the intel_icd.json Autoconf already does the exact same thing as the manually written rule. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94969 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:04 +01:00
Emil Velikov	30e6f68b3b	anv: tweak the LDFLAGS Copy/paste from the rest of mesa, but namely. - The module should be shared only. - We don't need the explicit ".so", as the vulkan loader will retrieve the full filename from the json - No unresolved symbols in the final binary - Use the linker garbage collector to slim down the final binary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:03 +01:00
Emil Velikov	b370ec7c76	anv: tweak the %.json rule It's used only by dev_icd.json so just call it that way. While we're here, manually expand $< (as it might cause issue on some systems) and drop the unneeded install_libdir substitution. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:03 +01:00
Emil Velikov	abd360ab75	anv: add a comment about dev_icd.json Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:38:03 +01:00
Emil Velikov	44978a91ff	genxml: ship all the files needed in the tarball v2: The xml files are not called "gen*_pack.xml" (Jason) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:38:03 +01:00
Emil Velikov	3f23a0f8c1	anv: remove description about GENX_FUNC macro The macro has been gone since commit `1f1cf6fcb0` "anv: Get rid of GENX_FUNC" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-01 08:37:25 +01:00
Emil Velikov	0700cdd5aa	gallium/target-helpers: remove inline_wrapper_sw_helper.h Unused as of commit `dddedbec0e` "{st,targets}/nine: use static/dynamic pipe-loader" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Mark Kettenis	b8e59292e6	egl/x11: resolve "initialization from incompatible pointer type" warning With an earlier commit we moved a few functions and changed the argument type from _EGLDisplay * to struct dri2_egl_display *. The latter is effectively a wrapper around the former, thus functionality was preserved, although GCC rightfully warned us about the misuse. Add a simple wrapper that casts and propagates the correct type. Fixes: `9bbf3737f9` ("egl/x11: authenticate before doing chipset id ioctls") Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Reported-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Chuck Atkins	a92910ae37	glx: Refactor the configure options for glx implementation choice (v3) Instead of cascading support for various different implementations of GLX, all three options are now specified through the --enable-glx option: --enable-glx=dri : Enable the DRI-based GLX --enable-glx=xlib : Enable the classic Xlib-based GLX --enable-glx=gallium-xlib : Enable the gallium Xlib-based GLX --enable-glx[=yes] : Defaults to dri if DRI is enabled, else gallium-xlib if gallium is enabled, else xlib This removes the --enable-xlib-glx option and fixes a bug in which both the classic xlib-glx and gallium xlib-glx implementations were getting built causing different versioned and conflicting libGL libraries to be installed. v2: Changes from various review feedback from Emil: a) Fixed typos b) Corrected help docs for new option c) Added appropriate a-b and r-b tags in commit msg d) Fixed various GLX related dependency checks. v3: Rebased to current master and added changelog in commit msg Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94086 Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-05-01 08:37:25 +01:00
Thomas Hindoe Paaboel Andersen	cbcd7b60f5	nir/lower_double_ops: fix indentation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:32 -07:00
Thomas Hindoe Paaboel Andersen	21424e019d	nir/opt_dead_cf: fix indentation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:29 -07:00
Thomas Hindoe Paaboel Andersen	6935726197	nir/opt_dead_cf: correction of side effect check Parentheses are needed here as ! takes precedence over &. The check had the opposite effect from what was intended. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-30 12:16:22 -07:00
Rob Clark	663c0e5155	freedreno/ir3: use pipe_debug_callback for shader-db traces For multi-threaded shader-db support. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:20 -04:00
Rob Clark	2578e3edcb	freedreno/a4xx: add debug callback to emit Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	51f20dd279	freedreno/a3xx: add debug callback to emit Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	41d288c306	freedreno: wire up core pipe_debug_callback Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	e04db879f8	freedreno/ir3: handle color clamp variant ourselves Now that there is a pass to do this in NIR, let's just use that and manage the variants ourselves, rather than letting the state tracker do it. This way, mesa/st will precompile shaders without requiring ST_DEBUG=precompile (which requires a debug build). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Rob Clark	64abf6d404	nir: clamp-color-output support Handled by tgsi_emulate for glsl->tgsi case. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-30 14:56:19 -04:00
Rob Clark	482cdc4c92	freedreno: fix indentation Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-30 14:56:19 -04:00
Marek Olšák	53435514c1	radeonsi: fix synchronization of shader images This fixes the winsys->cs_is_buffer_referenced query, which is used for synchronization before buffers are mapped. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-30 19:36:16 +02:00
Samuel Pitoiset	8f2238ccba	st/glsl_to_tgsi: fix potential crash when allocating temporaries When index - t->temps_size is greater than 4096, allocating space for temporaries on demand will crash. This can happen when a game uses a lot of temporaries, like the recently released Tomb Raider. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-30 17:41:32 +02:00
Kenneth Graunke	750c38fad1	glsl: Lower vector_extracts to swizzles after lower_vector_derefs. lower_vector_derefs can produce new vector_extract operations. Neither i965 nor st_glsl_to_tgsi can handle them, so we'd best convert them to swizzles. Together with the previous patch, this fixes assertion failures in GLideN64, as well as a new Piglit test which reproduces the issue: spec/glsl-1.10/compiler/vector-dereference-in-dereference.frag Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-29 16:03:36 -07:00
Kenneth Graunke	1cd600dbb9	glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor. The old visitor missed some cases. For example, it wouldn't handle an ir_dereference_array with a vector_extract as the index. Rather than trying to add the missing cases, just rewrite it as an ir_rvalue_visitor. This makes it easy to replace any expression, and is much less code. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-29 16:03:29 -07:00
Thomas Faller	d53cf1ea4c	mesa: simplify _mesa_Lightfv Signed-off-by: Thomas Faller <tfaller1@gmx.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-29 11:08:01 -06:00
Nicolai Hähnle	aa6f88f891	gallium/radeon: fix crash in r600_set_streamout_targets Protect against dereferencing a gap in the targets array. This was triggered by a test in the Khronos CTS. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-29 11:55:06 -05:00
Nicolai Hähnle	98c348d26b	st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor In optimized builds, visit(ir_expression *) experiences inlining with gcc that leads the function to have a roughly 32KB stack frame. This is a problem given that the function is called recursively. In non-optimized builds, the stack frame is much smaller, hence one gets crashes that happen only in optimized builds. Arguably there is a compiler bug or at least severe misfeature here. In any case, the easy thing to do for now seems to be moving the bulk of the non-recursive code into a separate function. This is sufficient to convince my version of gcc not to blow up the stack frame of the recursive part. Just to be sure, add the gcc-specific noinline attribute to prevent this bug from recurring if inliner heuristics change. v2: put ATTRIBUTE_NOINLINE into macros.h Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95133 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95026 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92850 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-29 11:52:59 -05:00
Nicolai Hähnle	59af21c3e9	tgsi/text: fix parsing of memory instructions Properly handle Target and Format parameters when present. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:56 -05:00
Nicolai Hähnle	4055babc75	tgsi/text: add str_match_name_from_array Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:53 -05:00
Nicolai Hähnle	a56edbdd8f	tgsi/text: add str_match_format helper function Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:51 -05:00
Nicolai Hähnle	acb65a23a3	tgsi/build: pass Memory.Texture and .Format through tgsi_build_full_instruction Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:32 -05:00
Nicolai Hähnle	318d305f6d	tgsi/dump: signal nospace when the last print exceeded the size Previously, there was a bug where nospace wasn't signalled if it just so happened that the very last print exceeded the available space. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:28 -05:00
Nicolai Hähnle	e08eaa5b72	tgsi/dump: shared dump_ctx initialization Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-29 11:39:21 -05:00
Emil Velikov	4b1ea6910e	st/omx: don't return early in vid_enc_EncodeFrame() An earlier commit plugged a memory leak, although it missed a pair of brackets. Thus we unconditionally returned even in the case of no error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95203 Fixes: `b87856d25d` ("st/omx: Fix resource leak on OMX_ErrorNone") Tested-by: Andy Furniss <adf.lists@gmail.com> Acked-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> --- What an embarrassing bug - missing brackets. Andy, can you confirm that it resolves the issue?	2016-04-29 15:36:18 +01:00
Andres Gomez	c750029b37	glsl: Checks for interpolation into its own function. This generalizes the validation also to be done for variables inside interface blocks, which, for some cases, was missing. For a discussion about the additional validation cases included see https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html and Khronos bug #15671. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com>	2016-04-29 08:03:00 +02:00
Jason Ekstrand	6d4a426745	nir/algebraic: Support lowering for both 64 and 32-bit ldexp Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
Jason Ekstrand	f0af5b87ec	nir/opcodes: Make ldexp take an explicitly 32-bit int There is no sense in having the double version of ldexp take a 64-bit integer. Instead, let's just take a 32-bit int all the time. This also matches what GLSL does where both variants of ldexp take a regular integer for the exponent argument. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
Jason Ekstrand	bee40dd730	nir/opcodes: Simplify the expressions for [un]pack_double The new expressions are more explicit in terms of where the bits go so it's a little easier to tell what's going on. This is the way GLSL specifies things so it's a bit easier to verify too. It also has the benefit that the new expressions easily vectorize so we can constant-fold vector forms of the _split versions correctly. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-28 21:36:52 -07:00
Kenneth Graunke	2655265fcb	mesa: Fix indirect draw buffer size check on 32-bit systems. Fixes dEQP-GLES31.functional subtests: draw_indirect.negative.command_offset_not_in_buffer_signed32_wrap draw_indirect.negative.command_offset_not_in_buffer_unsigned32_wrap These tests use really large values that overflow GLsizeiptr, at which point the buffer size isn't less than "end". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95138 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-04-28 16:31:45 -07:00
Jason Ekstrand	70f89dd75e	nir: Switch the arguments to nir_foreach_def This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/ Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	5015260a05	nir: Switch the arguments to nir_foreach_use and friends This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/ and similar expressions for nir_foreach_use_safe, etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	9464d8c498	nir: Switch the arguments to nir_foreach_function This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	e63766fb4b	nir: Switch the arguments to nir_foreach_parallel_copy_entry This matches the "foreach x in container" pattern found in many other programming languages. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	8564916d01	nir: Switch the arguments to nir_foreach_phi_src This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_phi_src(\([^,]*\),\s*\([^,]*\))/nir_foreach_phi_src(\2, \1)/ and a similar expression for nir_foreach_phi_src_safe. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	707e72f13b	nir: Switch the arguments to nir_foreach_instr This matches the "foreach x in container" pattern found in many other programming languages. Generated by the following regular expression: s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/ and similar expressions for nir_foreach_instr_safe etc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 15:54:48 -07:00
Jason Ekstrand	261d62de33	anv/lower_push_constants: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Jason Ekstrand	bb65764a4a	anv/apply_pipeline_layout: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Jason Ekstrand	621cbc0c14	anv/apply_dynamic_offsets: fixup for nir_foreach_block() Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	7efff10585	i965/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3a8688fb41	nir/algebraic: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1f8c100614	nir/validate: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	a471c161b1	nir/nir_worklist: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	db35177772	nir/remove_dead_variables: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b3aaae398e	nir/split_var_copies: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	9d41a1ffeb	nir/repair_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	480a182ccd	nir/opt_peephole_select: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e5f37701ab	nir/phi_builder: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1ba40d834b	nir/opt_cp: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	8dd7d78925	nir/opt_remove_phis: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1a8c17a59e	nir/opt_undef: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	52affdd2e6	nir/opt_dead_cf: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	ddc6639f85	nir/opt_dce: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3afb3be674	nir/opt_gcm: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	eecf96f530	nir/opt_constant_folding: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	26b4c9ee15	nir/lower_samplers: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	f4ebff89e4	nir/normalize_cubemap_coords: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	492b3554a7	nir/lower_var_copies: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	c1b37c08bf	nir/move_vec_src_uses_to_dest: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	ceed12557d	nir/lower_vars_to_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1557344c81	nir/lower_vec_to_movs: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b1eada04b2	nir/lower_idiv: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	2febb88e6d	nir/lower_to_source_mods: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	c81ca60b41	nir/lower_io: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	7e909972e3	nir/lower_system_values: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	76c74de456	nir/lower_phis_to_scalar: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b89f0bb58c	nir/lower_indirect_derefs: fixup for new foreach_block() v2 (Jason Ekstrand): Use nir_foreach_block_safe Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e3c5bda16a	nir/nir_lower_global_vars: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	480d78f55b	nir/lower_atomics: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	06cf73a7ba	nir/lower_load_const: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	15264133d7	nir/lower_locals_to_regs: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	1c6307aab4	nir/lower_gs_intrinsics: fixup for new foreach_block() v2 (Jason Ekstrand): Use nir_foreach_block_safe Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	3bf3100794	nir/nir: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	686f247b21	nir/lower_clip: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	e36fbcfc3f	nir/lower_alu_to_scalar: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	4179a56f42	nir/liveness: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	34af78edb3	nir/inline_functions: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	b23e59e172	nir/from_ssa: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Connor Abbott	d6a6c729ca	nir/dominance: fixup for new foreach_block() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 15:52:17 -07:00
Samuel Pitoiset	9f92a8f00a	nvc0: stick compute kernel arguments into uniform_bo Having one buffer object for input kernel arguments coming from clover and another one for OpenGL user uniforms is unnecessary. Using the uniform_bo object for both GL/CL uniforms avoids declaring a new BO. This only affects compute programs but it should not hurt anything because the states are dirtied and data will get reuploaded. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-29 00:44:08 +02:00
Tim Rowley	124a5d4ca0	swr: remove duplicated constant update code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-28 16:16:46 -05:00
Marek Olšák	1a8c2ccb24	gallium/radeon: add the size only once in r600_context_add_resource_size Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 21:06:31 +02:00
Bas Nieuwenhuizen	8e43bc0eb6	winsys/radeon: enlarge buffer_indices_hashlist Enlarge the buffer hashlist to prevent large numbers of misses due to adding more buffers than can be cached in the hashlist. Ported from winsys/amdgpu: `6373845d98` Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 21:06:31 +02:00
Marek Olšák	92f6af2c4a	gallium/radeon: drop support for LINEAR_GENERAL layout Unused. All texture imports use LINEAR_ALIGNED regardless of what the DDX does. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-04-28 20:16:56 +02:00
Marek Olšák	f564b61d33	radeonsi: rework clear_buffer flags Changes: - don't flush DB for fast color clears - don't flush any caches for initial clears - remove the flag from si_copy_buffer, always assume shader coherency Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 20:16:56 +02:00
Jason Ekstrand	d273ce5259	anv/dynamic_offsets: Fix the order of arguments to nir_build_imm	2016-04-28 11:05:56 -07:00
Jason Ekstrand	6028a67641	anv: Fix a build error caused by recent fp64 NIR changes	2016-04-28 10:13:42 -07:00
Jose Fonseca	99474dc29b	nir: Try to warn when C99 extensions are used in nir headers. Ideally we'd have nir.h being included with -Wpedantic too, but it fails with: src/compiler/nir/nir.h:754:20: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic] nir_alu_src src[]; ^ In file included from src/compiler/nir/glsl_to_nir.cpp:42:0: src/compiler/nir/nir.h:919:16: warning: ISO C++ forbids zero-size array ‘src’ [-Wpedantic] nir_src src[]; Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:13 +01:00
Jose Fonseca	e7438009af	nir: Remove spurious ; after nir_builder functions. Makes -pedantic happy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:12 +01:00
Jose Fonseca	caa5937ebb	nir: Remove spurious ; after namespace. Makes -pedantic happy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-04-28 16:48:12 +01:00
Jose Fonseca	f7854d8227	nir: Avoid C99 field initializers. As they are not standard C++ and are not supported by MSVC C++ compiler. Just have nir_imm_double match nir_imm_float above. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2016-04-28 16:48:12 +01:00
Brian Paul	a609da60c0	gallium/util: s/Elements/ARRAY_SIZE/ Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 09:04:24 -06:00
Brian Paul	f365488eaa	mesa: improve comment on _mesa_check_disallowed_mapping(), return bool The old comment was a bit terse. Also, change the function return type to bool. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-28 09:04:17 -06:00
Marek Olšák	7e7710a068	radeonsi: remove needless cache flushes at the end of CP DMA operations not needed AFAIK Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 12:46:47 +02:00
Marek Olšák	7d49b459b6	radeonsi: remove flushes at the beginning and end of IBs done by the kernel Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-28 12:46:47 +02:00
Samuel Iglesias Gonsálvez	db07b46f2c	nir: Add lrp lowering for doubles in opt_algebraic Some hardware (i965 on Broadwell generation, for example) does not natively support executing the lrp instruction with double arguments. Add 'lower_flrp64' flag to lower this instruction in that case. v2: - Rename lower_flrp_double to lower_flrp64 (Jason) - Fix typo (Jason) - Adapt the code to define bit_size information in the opcodes. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
Samuel Iglesias Gonsálvez	443600d51e	nir: rename lower_flrp to lower_flrp32 A later patch will add lower_flrp64 option to NIR. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	072613b3f3	nir/lower_double_ops: lower round_even() At least i965 hardware does not have native support for round_even() on doubles. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	bf91df7f7f	nir/lower_double_ops: lower fract() At least i965 hardware does not have native support for fract() on doubles. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:40 +02:00
Iago Toral Quiroga	126a1ac03f	nir/lower_double_ops: lower ceil() At least i965 hardware does not have native support for ceil on doubles. v2 (Sam): - Improve the lowering pass to remove one bcsel (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 12:01:36 +02:00
Iago Toral Quiroga	29541ec531	nir/lower_double_ops: lower floor() At least i965 hardware does not have native support for floor on doubles. v2 (Sam): - Improve the lowering pass to remove one bcsel (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:35 +02:00
Iago Toral Quiroga	5fab3d178b	nir/lower_double_ops: lower trunc() At least i965 hardware does not have native support for truncating doubles. v2: - Simplified the implementation significantly. - Fixed the else branch, that was not doing what we wanted. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Connor Abbott	2ea3649c63	nir: add a pass to lower some double operations v2: Move to compiler/nir (Iago) v3: Use nir_imm_int() to load the constants (Sam) v4 (Sam): - Undo line-wrap (Jason). - Fix comment (Jason). - Improve generated code for get_signed_inf() function (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Connor Abbott	2cf3b28884	nir/builder: add nir_imm_double() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Samuel Iglesias Gonsálvez	3a150683ce	nir/builder: Add bit_size info to nir_build_imm() v2: - Group num_components and bit_size together (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-28 11:58:13 +02:00
Jakob Sinclair	76b8c5cc60	radeonsi: check if value is negative Fixes a Coverity defect by adding checks to see if a value is negative before using it to index an array. By checking the value first it makes the code a bit safer but overall should not have a big impact. CID: 1355598 Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-28 11:33:38 +02:00
Michel Dänzer	860210ccfc	clover: Fix build against clang SVN >= r267772 (Re-pushing previous fix for clang SVN r265359, which was reverted in the meantime) Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-28 12:57:03 +09:00
Lars Hamre	32cb7d61a9	glsl: fix lowering outputs for early/nested returns Return statements in conditional blocks were not having their output varyings lowered correctly. This patch fixes the following piglit tests: /spec/glsl-1.10/execution/vs-float-main-return /spec/glsl-1.10/execution/vs-vec2-main-return /spec/glsl-1.10/execution/vs-vec3-main-return Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-28 11:01:51 +10:00
Connor Abbott	122d27e998	nir: rewrite nir_foreach_block and friends Previously, these were functions which took a callback. This meant that the per-block code had to be in a separate function, and all the data that you wanted to pass in had to be a single void *. They walked the control flow tree recursively, doing a depth-first search, and called the callback in a preorder, matching the order of the original source code. But since each node in the control flow tree has a pointer to its parent, we can implement a "get-next" and "get-previous" method that does the same thing that the recursive function did with no state at all. This lets us rewrite nir_foreach_block() as a simple for loop, which lets us greatly simplify its users in some cases. This does require us to rewrite every user, although the transformation from the old nir_foreach_block() to the new nir_foreach_block() is mostly trivial. One subtlety, though, is that the new nir_foreach_block() won't handle the case where the current block is deleted, which the old one could. There's a new nir_foreach_block_safe() which implements the standard trick for solving this. Most users don't modify control flow, though, so they won't need it. Right now, only opt_select_peephole needs it. The old functions are reimplemented in terms of the new macros, although they'll go away after everything is converted. v2: keep an implementation of the old functions around v3 (Jason Ekstrand): A small cosmetic change and a bugfix in the loop handling of nir_cf_node_cf_tree_last(). v4 (Jason Ekstrand): Use the _safe macro in foreach_block_reverse_call Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 15:05:40 -07:00
Connor Abbott	958300137f	nir/opt_cp: use nir_block_get_following_if() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 15:05:34 -07:00
Jordan Justen	aaaa22c775	vbo: Return INVALID_OPERATION during draw with a mapped buffer Fixes the OpenGLES 3.1 CTS: * ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos Because this is triggering the error message after the normal API validation phase, we don't have the API function name available, and therefore we generate an error message without the draw call name: Mesa: User error: GL_INVALID_OPERATION in draw call (vertex buffers are mapped) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95142 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 14:30:06 -07:00
Nanley Chery	28d0bc72fb	anv/formats: Return proper error code for unsupported formats Fixes some failures in dEQP-VK.api.info.image_format_properties.* and enables the test group to execute without assert failing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 11:28:30 -07:00
Nanley Chery	5f7e8eac42	anv/device: Set the compressed texture feature flags correctly Sampling from an ETC2 texture is supported on Bay Trail and from Gen8 onwards. While ASTC_LDR is supported on Gen9, the logic to handle such formats has not yet been implemented in the driver. Fixes dEQP-VK.api.info.format_properties.compressed_formats. v2: Enable ETC2 for Bay Trail (Kenneth Graunke) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896 Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-27 11:28:30 -07:00
Jason Ekstrand	e0806930ad	nir/algebraic: Add a bit-size validator This commit adds a validator that ensures that all expressions passed through nir_algebraic are 100% non-ambiguous as far as bit-sizes are concerned. This way it's a compile-time error rather than a hard-to-trace C exception some time later. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	8a3e344180	nir/opt_algebraic: Fix some expressions with ambiguous bit sizes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	7e0ee3a38b	nir/search: Respect the bit_size parameter on nir_search_value Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	fcc1c8a437	nir/algebraic: Add a mechanism for specifying the bit size of a value Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	cafb885e45	nir/algebraic: Use "uint" instead of "unsigned" for uint types This is consistent with the rename done for the rest of NIR. Currently, "bool" is the only type specifier used in nir_opt_algebraic.py so this is really a no-op. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Jason Ekstrand	736ee0bef7	nir/algebraic: Do better error reporting of bad expressions Previously, if an exception was encountered anywhere, nir_algebraic would just die in a fire with no indication whatsoever as to where the actual bug is. This commit makes it print out the particular search-and-replace expression that is causing problems along with the exception. Also, it will now report all of the errors it finds and then exit at the end like a standard C compiler would do. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-27 11:21:06 -07:00
Alejandro Piñeiro	b1dcedf393	isl: move -lm at the end of tests_ldadd The test was failing to build with "undefined reference to `roundf'" errors, so Make check on mesa was failing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-27 20:14:56 +02:00
Topi Pohjolainen	aef6a6c382	i965/blorp/gen8: Fix blitting of interleaved msaa surfaces Fixes ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample. Current logic divides given layer of one by number of samples (four) trashing the layer to zero. Layer adjustment is only to be used with non-interleaved msaa surfaces where samples for particular layer are in multiple slices. I copy-pasted a bit of documentation from brw_blorp.c::brw_blorp_compute_tile_offsets(). Also took the opportunity to fix the comment regarding sampling as 2D, cube textures are the only exception. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-27 19:57:40 +03:00
Brian Paul	1d242b6882	llvmpipe: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	23c55e5c23	tgsi: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	419e386571	os: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	d902504a67	hud: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	e522a76226	gallivm: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	489df4a71a	draw: s/Elements/ARRAY_SIZE/ Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Brian Paul	f93802c465	softpipe: s/Elements/ARRAY_SIZE/ Try to standardize on the latter, which is defined in the common util/ directory. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-27 10:23:19 -06:00
Nicolai Hähnle	562c4a17b7	winsys/radeon: remove use_reusable_pool parameter from buffer_create All callers set this parameter to true. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	13acf2b243	gallium/radeon: remove use_reusable_pool parameter from r600_init_resource All callers set it to true. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	c868974396	radeon/video: always use the reusable buffer pool A semantic error was introduced in a past refactoring that caused the bind parameter to be passed into the use_reusable_pool parameter of buffer_create. Since this clearly makes no sense, and there is no clear reason why the cache _shouldn't_ be used, just use the cache always. Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:41 -05:00
Nicolai Hähnle	8c43c06e04	radeonsi: work around an MSAA fast stencil clear problem A piglit test (arb_texture_multisample-stencil-clear) has been sent. This problem was discovered analyzing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	7a215a3e27	radeonsi: expclear must be disabled on first Z/S clear The documentation and the HW team say so. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	01a3bb5d8b	radeonsi: move blend choice out of loop in si_blit_decompress_color It does not depend on the level or layer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	450ff0f0d5	radeonsi: use level mask for early out in si_blit_decompress_color Mostly for consistency with the other decompress functions, but note that in the non-DCC decompress case, the function can now early-out in slightly more (albeit probably rare) cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	0ff05b55c6	radeonsi: si_blit_decompress_depth is only used for staging Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:40 -05:00
Nicolai Hähnle	0b70fc2db4	radeonsi: only decompress the required ZS planes from si_blit This happens to "fix" a rendering bug in KotOR2, because it avoids a still not quite understood bug with MSAA fast stencil clear decompress. For the stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	def53a0b3d	radeonsi: decompress Z & S planes in one pass Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	dc6fc2f390	radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask Avoid dirtying the db_render_state atom when possible. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	d14d6c3f58	radeonsi: use MIN2 instead of expanded ?: operator Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	159f182a57	radeonsi: fix brace style Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Nicolai Hähnle	91fb4bb2e9	gallium/util: add u_bit_consecutive for generating a consecutive range of bits There are some undefined behavior subtleties, so having a function to match the u_bit_scan_consecutive_range makes sense. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-27 11:16:39 -05:00
Tim Rowley	504df3a1d7	swr: s/Elements/ARRAY_SIZE/ Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 11:07:34 -05:00
Nicolai Hähnle	836cab51c8	radeonsi: emit s_waitcnt for shader memory barriers and volatile Turns out that this is needed after all to satisfy some strengthened coherency tests. Depends on support in LLVM, added in r267729. v2: updated to reflect changes to the LLVM intrinsic Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-27 10:54:05 -05:00
Tim Rowley	e7201bd31b	swr: [rasterizer] warning cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:54 -05:00
Tim Rowley	24f23817d2	swr: [rasterizer core] implement legacy depth bias enable Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:45 -05:00
Tim Rowley	fa36f8ec9c	swr: [rasterizer jitter] support for dumping x86 asm Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:32 -05:00
Tim Rowley	a646ffdacf	swr: [rasterizer core] more backend refactoring BackendPixelRate should be easier to read/maintain now hopefully. Small perf bump by moving some of the pfn's to inline functions without template params. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:21 -05:00
Tim Rowley	8e815ff72c	swr: [rasterizer jitter] add mSimdInt1Ty Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:41:12 -05:00
Tim Rowley	4e1e0b3a32	swr: [rasterizer core] backend refactor Lump all template args into a bundle of traits, and add some functionality to the MSAA traits. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-27 10:40:44 -05:00
Brian Paul	43f46caf76	svga: use the SVGA3D_DEVCAP_MAX_FRAGMENT_SHADER_INSTRUCTIONS query Instead of a hard-coded 512. The query typically returns 65536 now. Fall back to 512 if the query fails as we do for vertex shaders (which should never happen). Note that we don't actually enforce this limit in our shaders but it gets reported via the glGetProgramivARB(GL_MAX_PROGRAM_INSTRUCTIONS_ARB) query. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-27 08:43:33 -06:00
Hans de Goede	b5e7907f30	nouveau: codegen: LOAD: Take src swizzle into account The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Hans de Goede	90f45357ab	nouveau: codegen: LOAD: Do not call fetchSrc(1) if the address is immediate "off" later gets set to NULL when the address is immediate, so move the fetchSrc(1) call to the non-immediate branch of the if-else. This brings handleLOAD's offset handling in line with how it is done in handleSTORE. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Hans de Goede	1958397a58	nouveau: codegen: LOAD: Always use component 0 when getting the address LOAD loads up to 4 components from the specified resource, starting at the passed-in x value of the 2nd source operand; the y, z and w components of the address should not be used. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-27 16:11:48 +02:00
Stefan Dirsch	7d25ed7036	dri3: Check for dummyContext to see if the glx_context is valid According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext() always returns a valid pointer. If no context is made current, it will contain dummyContext. Thus a test for NULL will always fail. https://lists.freedesktop.org/archives/mesa-dev/2016-April/113962.html Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Egbert Eich <eich@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-27 13:03:34 +01:00
Egbert Eich	4d9b518ad2	dri2: Check for dummyContext to see if the glx_context is valid According to the comments in src/glx/glxcurrent.c __glXGetCurrentContext() always returns a valid pointer. If no context is made current, it will contain dummyContext. Thus a test for NULL will always fail. https://bugzilla.opensuse.org/show_bug.cgi?id=962609 Tested-by: Olaf Hering <ohering@suse.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-27 13:03:11 +01:00
Timothy Arceri	6d1a59d15b	glsl: move uniform block validation to link_uniform_blocks.cpp Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-27 16:17:47 +10:00
Kenneth Graunke	73ada723f0	docs: Mention that {ARB,OES}_texture_stencil8 is supported on i965/gen8+ Thanks to Thomas Helland for reminding me to do this.	2016-04-26 21:32:35 -07:00
Kenneth Graunke	fd9a7d8f30	i965: Enable ARB_texture_stencil8 and OES_texture_stencil8 on Gen8+. Stencil texturing is required by ES 3.1. Apparently we never actually turned it on. Do that now. Also turn on the desktop extension. Fixes nine dEQP-GLES31.functional tests: stencil_texturing.format.stencil_index8_2d texture.border_clamp.formats.stencil_index8.nearest_size_pot texture.border_clamp.formats.stencil_index8.nearest_size_npot texture.border_clamp.formats.stencil_index8.gather_size_pot texture.border_clamp.formats.stencil_index8.gather_size_npot texture.border_clamp.unused_channels.stencil_index8 state_query.internal_format.renderbuffer.stencil_index8_samples state_query.internal_format.texture_2d_multisample.stencil_index8_samples state_query.internal_format.texture_2d_multisample_array.stencil_index8_samples Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	12c43a355c	mesa: Try to fix CopyTex[Sub]Image of stencil textures. ES prohibits this, but GL appears to allow it. We at least need this much, or else we'll crash as there's no source to read from. This fixed crashes in the ES tests before I realized I needed to prohibit stencil instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	027c6c1222	mesa: Disallow CopyTexSubImage on stencil formats in ES. Fixes - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8 - ES31-CTS.gtf.GL31Tests.texture_stencil8.texture_stencil8_multisample Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	1e44599a43	i965: Fix MapTextureImage for multi-slice/level stencil buffers. We called intel_miptree_get_image_offset() to get the image offsets for the current level/slice, but then proceeded to ignore the results and clobber level/slice 0 every time. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94713 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-04-26 21:32:35 -07:00
Kenneth Graunke	361a24e140	i965: Move TCS output indirect_offset.file check out a level. I want to add another condition. Moving the indirect_offset.file check out a level should make this a little easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:59:56 -07:00
Kenneth Graunke	13195f7ef8	i965/fs: Reduce the response length of sampler messages on Skylake. Often, we don't need a full 4 channels worth of data from the sampler. For example, depth comparisons and red textures only return one value. To handle this, the sampler message header contains a mask which can be used to disable channels, and reduce the message length (in SIMD16 mode on all hardware, and SIMD8 mode on Broadwell and later). We've never used it before, since it required setting up a message header. This meant trading a smaller response length for a larger message length and additional MOVs to set it up. However, Skylake introduces a terrific new feature: for headerless messages, you can simply reduce the response length, and it makes the implicit header contain an appropriate mask. So to read only RG, you would simply set the message length to 2 or 4 (SIMD8/16). This means we can finally take advantage of this at no cost. total instructions in shared programs: 9091831 -> 9073067 (-0.21%) instructions in affected programs: 191370 -> 172606 (-9.81%) helped: 2609 HURT: 0 total cycles in shared programs: 70868114 -> 68454752 (-3.41%) cycles in affected programs: 35841154 -> 33427792 (-6.73%) helped: 16357 HURT: 8188 total spills in shared programs: 3492 -> 1707 (-51.12%) spills in affected programs: 2749 -> 964 (-64.93%) helped: 74 HURT: 0 total fills in shared programs: 4266 -> 2647 (-37.95%) fills in affected programs: 3029 -> 1410 (-53.45%) helped: 74 HURT: 0 LOST: 1 GAINED: 143 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	d800b7daa5	nir: Add a helper for figuring out what channels of an SSA def are read Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	acc2f1fe36	i965/fs: Use inst->regs_written for rlen for texture instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	c7a09c0571	i965/fs: Properly report regs_written from SAMPLEINFO The previous behavior would only allocate one register and then write four thus potentially stomping three innocent bystanders. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Jason Ekstrand	30b37e4e9b	i965/blorp: Set regs_written on texturing instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-26 19:55:04 -07:00
Kenneth Graunke	0bd956b34b	i965: Don't force a header for texture offsets of 0. Calling textureOffset() with an offset of <0, 0, 0> is equivalent to calling texture(). We don't actually need to set up an offset, which causes a message header to be created. A fairly common pattern is to sample at a point with a bunch of offsets, and average them. It's natural to write all the lookups as textureOffset, but use <0, 0> for the center sample. shader-db results on Skylake: total instructions in shared programs: 9092095 -> 9092087 (-0.00%) instructions in affected programs: 2826 -> 2818 (-0.28%) helped: 12 HURT: 2 total cycles in shared programs: 70870166 -> 70870144 (-0.00%) cycles in affected programs: 15924 -> 15902 (-0.14%) helped: 2 HURT: 0 This also helps prevent code quality regressions in a future patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-26 19:55:04 -07:00
Patrick Rudolph	fb5d38e219	r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier Some apps set NEG and ABS on the source param to test for zero. Use ALU_OP3_CNDE instead of ALU_OP3_CNDGE and unset both modifiers. It also removes the need for a MOV instruction, as ABS isn't supported on op3. Tested on AMD CAYMAN and AMD RV770. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 12:48:50 +10:00
Dave Airlie	7aa3a93656	docs: update softpipe for ARB_compute_shader Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:01:12 +10:00
Dave Airlie	e749c30ceb	softpipe: add support for compute shaders. (v2) This enables ARB_compute_shader on softpipe. I've only tested this with piglit so far, and I hopefully plan on integrating it with my vulkan work. I'll get to testing it with deqp more later. The basic premise is to create up to 1024 restartable TGSI machines, and execute workgroups of those machines. v1.1: free machines. v2: deqp fixes - add samplers support, finish atomic operations, fix load/store writemasks. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:01:03 +10:00
Dave Airlie	f78bcb7638	tgsi/exec: initialise SysSemanticToIndex array to -1 We want to use the SysSemanticToIndex to tell if we've seen the semantics at all. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:46 +10:00
Dave Airlie	fbea4e177f	tgsi/exec: implement restartable machine. This lets us restart the machine at a PC value, and exits the machine when we hit a barrier. Compute shaders will then execute all the threads up to the barrier, then restart the machines after the barrier once all are done. v2: comment the code a bit, change return types. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:44 +10:00
Dave Airlie	8ffa3c58d4	tgsi/exec: make inputs/outputs optional for compute shaders. compute shaders don't need input/outputs so don't bother allocating memory for these. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:41 +10:00
Dave Airlie	16a9dc1e49	tgsi/exec: implement load/store/atomic on MEMORY. This implements basic load/store/atomic ops on MEMORY types for compute shaders. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 09:00:35 +10:00
Dave Airlie	354c5f2d0f	tgsi/exec: split out setting up masks to separate function This is just a cleanup that will make later changes easier to make. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:56:22 +10:00
Dave Airlie	6cf36a7231	tgsi: accept a starting PC value for exec machine. This will be used later to restart barriered execution threads in compute, for now we just want to change the API. Acked-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:56:17 +10:00
Dave Airlie	912ed84f83	tgsi: move to using vector for system values. For compute support some of the system values are .xyz types, so move to using a vector instead of a single channel. [airlied: squash swizzle fix from compute series]. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:26:53 +10:00
Dave Airlie	9013d9267c	tgsi/exec: fix system value handling. a) SysSemanticToIndex needs to be indexed with the semantic name not the decl->Declaration.Semantic. b) doing this in run is too late, as the mappings are all setup prior to run in the execs. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-27 08:25:38 +10:00
Jason Ekstrand	4040fff81d	i965/blorp: Convert state setup to C Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	71775afe6e	i965/blorp: Make state setup C-safe Previously they (very rarely) used C++isms that prevented them from being compiled as C. As of this commit, they can be compiled as either C or C++. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	bed74299c2	i965/blorp: Convert brw_blorp.cpp to a C file Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	0551f3dfa4	i965/blorp: Make all of brw_blorp.h accessible to C Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	b3f08b5424	i965/blorp: Turn brw_blorp_params into a C-style struct Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	33fa12c50f	i965/blorp: Turn coord_transform into a C-style struct Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	b6dd8e42f0	i965/blorp: Turn blorp_surface_info into a C-style struct This commit is mostly mechanical except that it changes where we set the swizzle. Previously, the blorp_surface_info constructor defaulted the swizzle to SWIZZLE_XYZW. Now, we memset to zero and fill out the swizzle when we setup the rest of the struct. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	a543f741bf	i965/blorp: Roll mip_info into surface_info Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	3839936497	i965/blorp: Get rid of the blorp_blit_params class It was really just a wrapper around the function that constructed it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	8096ed7e27	i965/blorp: Remove the hiz params class Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	e35d9407dc	i965/blorp: Remove the clear params classes They didn't really add anything other than a key and extra layers of function calls. This commit just inlines the extra functions and gets rid of the extra classes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	659400cba3	i965/blorp: Remove the arguments to brw_blorp_params() No one was using anything other than the defaults. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Jason Ekstrand	2dda4ff014	i965/blorp: Refactor to get rid of the get_wm_prog virtual function Instead of having a virtual member function for getting the WM/PS kernel, we simply add fields for prog_data and the kernel to brw_blorp_parms and always make sure those get set as part of the different constructors. v2: Use prog_data != NULL to check for a valid program instead of a magic kernel offset value Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-26 14:55:22 -07:00
Tim Rowley	18d1658633	swr: autogenerate swr_context_llvm.h Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-26 16:45:26 -05:00
Laurent Carlier	12cf08fcc3	anv: honor DESTDIR when installing icd file https://bugs.freedesktop.org/show_bug.cgi?id=94969 Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:57:54 -07:00
Juha-Pekka Heikkila	ec5f7fc7bd	i965/meta: initialize values to avoid random behaviour on error path If brw_meta_stencil_blit() errored at the wrong place, 'target' would be uninitialized and cause random behaviour on leaving the function. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:54:29 -07:00
Juha-Pekka Heikkila	51632d6f27	meta: Avoid random memory access on error Initialize drawFb to NULL in _mesa_meta_CopyImageSubData_uncompressed(). If getting readFb fails, an uninitialized drawFb will cause randomness on cleanup. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:54:02 -07:00
Grazvydas Ignotas	cea3a7e615	mesa: add tags file to gitignore For ctags users like me. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:49:27 -07:00
Jakob Sinclair	dda50af9c4	mesa: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	e5d027ec7d	glx: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	ea327dc451	gallium: Remove every double semi-colon Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	de743a07ac	egl: Remove every double semi-colon Removes all accidental semi-colons in egl. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	e129e6eb89	gallium/r600: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	12da8bb5f4	mesa/main: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jakob Sinclair	09e4ac00ac	glsl: removing double semi-colons Trivial change. Removing unnecessary semi-colons from the code. I don't have push access so someone reviewing this can push it. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-26 14:36:29 -07:00
Jose Fonseca	52c7443932	glx: Don't enclose includes inside `extern "C" { }`. Ran `make check` inside src/glx to verify everything compiles and links correctly. https://bugs.freedesktop.org/show_bug.cgi?id=95158 Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 21:28:34 +01:00
Marek Olšák	80e5fb60b4	radeonsi: add RW_BUFFERS only once in si_ce_needed_cs_space Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-26 21:37:07 +02:00
Marek Olšák	2b4b5ebfcf	egl: fix make check broken by interop support	2016-04-26 21:37:07 +02:00
Samuel Pitoiset	e64ee4cf60	docs: mark ARB_compute_shader as done for nvc0 This was merged a few months ago, but this should help https://mesamatrix.net/ to update its list of supported extensions. Please note that compute shaders are not really useful without ARB_image_load_store and only GK104 and GK110 support it for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-26 21:10:10 +02:00
Samuel Pitoiset	5c429f88d9	nvc0: expose GLSL version 420 on GK110 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	a0e777f6a1	nvc0: enable ARB_shader_image_load_store on GK110 This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	2daaa5d657	gk110/ir: add emission for VSHL Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	af5925209d	gk110/ir: add emission for OP_SUEAU, OP_SUBFM and OP_SUCLAMP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1f8900a8e0	gk110/ir: add emission for OP_SULDB and OP_SUSTx Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fddd8523d4	gk110/ir: add emission for OP_MADSP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c2ce22ca46	gk110/ir: add emission for OP_PERMT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	222d1a1bff	nvc0: expose GLSL version 420 on GK104 Other chipsets will be added later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Ilia Mirkin	9e367ed480	nvc0: enable ARB_shader_image_load_store on GK104 This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	0d64d39e81	nvc0: inform users that 3D images are not fully supported 3D images are a bit more complicated to implement and will probably cause a bunch of headaches, and we don't care for now because they do not seem to be really used by apps. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fdbb476829	nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+ The blob sets it to 2048 and using 4096 reports an INVALID_DATA error with RT_ARRAY_MODE when z is 4096. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	6fc6d548ed	nvc0/ir: check that the image format doesn't mismatch This re-uses NVE4_SU_INFO_CALL which is not used anymore because we don't use our lib for format conversions. While we are at it, add a todo for image buffers because there are some robustness-related issues to fix. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	fbeb69757c	nvc0/ir: prevent out of bounds when no images are bound Checking if the image address is not 0 should be enough to prevent read faults. To improve robustness, make sure that the destination value of atomic operations is correctly initialized in case the instruction is not performed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	5ba5714483	nvc0/ir: add indirect support for images on Kepler This fixes arb_shader_image_load_store-indexing and arb_shader_image_load_store-max-images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	8b540db44c	nvc0/ir: fix 1D arrays images for Kepler For 1D arrays, the array index is stored in the Z component. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e478156ed7	nvc0/ir: fix cube images for Kepler Like 2d array images, the z-dimension needs to be clamped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Ilia Mirkin	3ce80f924d	nv50/ir: add support for SULDP -> SULDB conversion This will allow to convert surface formats without adding an extra call to our lib. [hakzsam: make use of this for GK104] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	d64ea4e48e	nv50/ir: make use of OP_SUQ for surfaces query This implements RESQ for surfaces which comes from imageSize() GLSL builtin. As the dimensions are stored in the driver constant buffer, this only has to be lowered with loads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v2)	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	7c47db359e	nv50/ir: add OP_BUFQ for buffers query TGSI RESQ allows both images and buffers but we have to make a distinction between these two types of resources in our lowering pass. Introducing OP_BUFQ which is a fake operand will allow to implement OP_SUQ for surfaces. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e09434047d	nv50/ir: enable early fragment test with explicit user control This feature can be enabled in two ways: as an optimization and by explicit user control (with OpenGL 4.2 or ARB_shader_image_load_store). This makes use of the recent TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL to force early fragment tests when needed. This fixes a bunch of dEQP-GLES31.functional.image_load_store.early_fragment_tests.* tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	08f4faa542	nvc0/ir: fix constraints for OP_SUSTx on Kepler Destination type is actually always 32-bits, so typeSizeof() returns 4 and no sources are condensed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	119d087758	nv50/ir: re-introduce TGSI lowering pass for images This is loosely based on the previous lowering pass written by calim four years ago. I did clean the code and fixed some issues. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	76ea143c38	nv50/ir: add support for TGSI image declarations Old and dead resource code will be removed once images are completely done. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1fb3cd2489	nvc0: add missing glMemoryBarrier bits This fixes a bunch of subtests of arb_shader_image_load_store-host-mem-barrier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	9bc18a48f3	nvc0: enable RGB10_A2UI format on GK104 No clue why this was not enabled by default before, maybe because the SULDP conversion was wrong. Anyway, this helps in fixing all rgb10_a2ui piglit tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	da8171dc75	nvc0: shift address with blocksize for image buffers This fixes a bunch of dEQP image buffers related tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	285f2edd14	nvc0: fix address offset when images have multiple levels This fixes arb_shader_image_load_store-level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	e28f247e24	nvc0: bind images on 3D shaders for Kepler Similar to surfaces validation for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	1eca4c51a2	nvc0: bind images on compute shaders for Kepler Old surfaces validation code will be removed once images are completely done for Fermi/Kepler, that explains why I only disable it for now. This also introduces nvc0_get_surface_dims() which computes correct dimensions regarding the given target. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c6b3c346d1	nvc0: reserve an area for surfaces info in the driver constbuf To process surfaces coordinates from the codegen part, and because some information like the format is not always available (eg. when writeonly is used), we have to stick some surfaces data in the driver constbuf. This is especially true for OpenCL because we don't know the format at shader compile time. This bumps the size of each shader area from 1K to 2K. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	afa04785fa	nvc0: add preliminary support for images This implements set_shader_images() and resource invalidation for images. As OpenGL requires at least 8 images, we are going to expose this minimum value even if this might be raised for Kepler, but this limit is mainly for Fermi because the hardware only accepts 8 images. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	c62b1b92f7	gk110/ir: add emission for (a OP b) OP c This is pretty similar to NVC0 except that offsets have changed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
Samuel Pitoiset	3da8528846	nvc0/ir: fix wrong emission of (a OP b) OP c The third source must be emitted at offset 49 instead of 17 and the not modifier is at 52 instead of 20. If you look a bit above in emitLogicOp() you will see that the dest is emitted at 17 which confirms that src(2) is obviously wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 19:47:49 +02:00
Jose Fonseca	a2fe35bcdf	scons: Support Clang on Windows. - Introduce 'gcc_compat' env flag, for all compilers that define __GNUC__, (which includes Clang when it's not emulating MSVC.) - Clang doesn't support whole program optimization - Disable enumerator value warnings (not sure why Clang warns about them, as my understanding is that MSVC promotes enums to unsigned ints automatically.) This is not enough to build with Clang + AddressSanitizer though. More follow up changes will be required for that. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	dcc3baf733	gallium: Include intrin.h instead of defining ourselves. More portable, particularly when building with Clang, which implements all MSVC intrinsics in its own intrin.h, but doesn't actually support `#pragma intrinsic`. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	9a25c8af1b	scons: Whenever possible decide what to do based on platform and not compiler. Because compilers like GCC and Clang are effectively available everywhere so their presence/absence is seldom conclusive. Furthermore, all compilers we use now have stdint.h. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Jose Fonseca	c068610a7d	scons: Move fallback HAVE_* definitions to headers. These were being defined in SCons, but it's not practical: - we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines across everywhere. - checking compiler version via command line doesn't really work due to Clang essentially being like a chameleon which can fake either GCC or MSVC There's no change for autoconf. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-26 17:17:00 +01:00
Juha-Pekka Heikkila	940da2ce0e	nir: Add missing break into switch in construct_value() There seemed to be missing one break in nested switchcases. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-26 17:45:56 +02:00
Bas Nieuwenhuizen	31631d8515	radeonsi: Fix memory leak in error path. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 15:41:19 +02:00
Oded Gabbay	514c5b5f4b	radeonsi: fix build error because of missing param Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-26 13:48:43 +03:00
Oded Gabbay	965175aba3	r600g: use do_endian_swap in texture swapping function For some texture formats we need to take "do_endian_swap" into account when configuring their swizzling. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	c86c761343	r600g: use do_endian_swap in color swapping functions For some formats we need to take "do_endian_swap" into account when configuring swapping for color buffers. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	686ad477bd	r600g: set endianness of 16/32-bit buffers according to do_endian_swap This patch modifies r600_colorformat_endian_swap(), so for 16-bit and for 32-bit buffers, the endianness configuration will be determined not only by the color/texture format, but also by the do_endian_swap parameter. The only exception is for array formats, which are always set to not do swapping, because for them gallium sets an alias based on the machine's endianness. v4: V_0280A0_COLOR_16_16 and V_0280A0_COLOR_16_16_FLOAT should be set to 8IN16 because the bytes inside need to be swapped even for array formats. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Oded Gabbay	2242dbe11d	r600g/radeonsi: send endian info to format translation functions Because r600 GPUs can't do swap in their DB unit, we need to disable endianness swapping for textures that are handled by DB. There are four format translation functions in r600g driver: - r600_translate_texformat - r600_colorformat_endian_swap - r600_translate_colorformat - r600_translate_colorswap This patch adds a new parameter to those functions, called "do_endian_swap". When running on a big-endian machine, the calling functions will check whether the texture/color is handled by DB - "rtex->is_depth && !rtex->is_flushing_texture" - and if so, they will send FALSE through this parameter. Otherwise, they will send TRUE. The translation functions, in specific cases, will look at this parameter and configure the swapping accordingly. v4: evergreen_init_color_surface_rat() is only used by compute and doesn't handle DB surfaces, so just send hard-coded FALSE to translation functions when called by it. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-26 11:00:16 +03:00
Ilia Mirkin	4965c5bf72	glsl: add ability to use essl 3.20 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-25 23:40:54 -04:00
Ilia Mirkin	fa8c0ccfbc	main: select ES3.2 version when all extensions are available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-25 23:40:34 -04:00
Dave Airlie	e3e6859381	tgsi: pass a shader type to the machine create and clean up. There were definitely bugs here mixing up the PIPE_ and TGSI_ defines, hopefully they didn't cause any problems, since mostly it was special cases for GEOMETRY. This clarifies at shader machine create what type of shader this machine will execute. This is needed also for compute shaders where we don't want to allocate inputs/outputs. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 13:05:32 +10:00
Dave Airlie	a6aae0c24d	gallium/tgsi: move tgsi_exec.h header out of draw_context.h It gets annoying that changing the tgsi exec rebuilds the state tracker unnecessarily. Putting this include into draw_gs.h which uses it causes a lot less rebuilds. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 13:00:57 +10:00
Roland Scheidegger	bd07e20d20	gallivm: make sampling more robust against bogus coordinates Some cases (especially these using fract for coord wrapping) did not handle NaNs (or Infs) correctly - the following code assumed the fract result could not be outside [0,1], but if the input is a NaN (or +-Inf) the fract result was NaN - which then could produce out-of-bound offsets. (Note that the explicit NaN behavior changes for min/max on x86 sse don't result in actual changes in the generated jit code, but may on other architectures. Found by looking through all the wrap functions.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94955 No piglit changes. (v2: fix min/max typo in coord_mirror, add comment) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Tested-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-26 04:55:37 +02:00
Dave Airlie	d8edc3e97c	radeonsi: fix missing include for Elements. Since u_blitter.h no longer defines this. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 09:36:23 +10:00
Samuel Pitoiset	d12c3b02ff	nvc0: bump the amount of shared memory per MP on Maxwell According to the CUDA compute capability version, GM10x can expose 64KB of shared memory while GM20x can use 96KB. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-26 00:32:25 +02:00
Dave Airlie	5b6a1aee46	r600: fix missing include for Elements macro This got removed from u_blitter.h and we were taking it from there, this should just move to ARRAY_SIZE eventually. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-26 08:01:01 +10:00
Samuel Pitoiset	725431a5db	gm107/ir: s/invalid load/invalid store/ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-25 23:55:52 +02:00
Rob Clark	d2fcd0ce38	freedreno/a3xx: remove unused fxn Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 17:10:14 -04:00
Rob Clark	8fe2076243	freedreno/ir3: convert over to ralloc The home-grown heap scheme (which is ultra-simple but probably not good to always allocate and memset such a chunk of memory up front) was a remnant of fdre (where the ir originally came from). But since we have ralloc in mesa, let's just use that instead. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 17:09:09 -04:00
Rob Clark	27cf3b0052	mesa/st: log some additional invalid-fbo cases Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 17:08:22 -04:00
Rob Clark	2c8674f5a9	freedreno: honor handle->offset Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:22 -04:00
Rob Clark	dfd23abdcc	freedreno: disallow cat4 immed src Normally this would never happen (constant-propagation in NIR would eliminate the instruction), except it does happen for 'undef' which we turn into immed 0.0 for bookkeeping purposes. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	76c6cdd36a	freedreno/a4xx: add render-target formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	7add166a5c	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	edcc6ce75d	freedreno: reduce line width for deqp further See a7eb12d0.. but that wasn't restrictive enough. Fixes dEQP-GLES3.functional.rasterization.primitives.line_strip_wide, and similar Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Rob Clark	4610e5ef28	freedreno/ir3: fix sin/cos We seem to need range reduction to get sane results. Fixes glmark2 jellyfish bench, and a whole bunch of dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.* v2: squashed in android build fixes from Rob Herring Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-25 16:16:21 -04:00
Kenneth Graunke	21b4bcdd05	i965: Unroll SIMD16 DDY_FINE on Sandybridge. This fixes 10 dEQP-GLES3 subtests: dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*. Matt noticed that our Piglit tests for this use even numbered registers, while the failing dEQP tests use odd numbered registers. We believe that it works for even numbered registers, but not otherwise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 13:13:00 -07:00
Brian Paul	e915903c10	docs: update the instructions for getting a git account Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 14:10:40 -06:00
Brian Paul	ef3f00edd8	docs: update link to Intel's graphics website Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-25 14:10:40 -06:00
Jordan Justen	50b82ecd77	mesa/gles: Allow format GL_RED to be used with MESA_FORMAT_R_UNORM If the bound framebuffer has a format of MESA_FORMAT_R_UNORM, then IMPLEMENTATION_COLOR_READ_FORMAT will return GL_RED. This change applies to OpenGLES contexts where additional restrictions are placed on the formats that are allowed to be supported. Fixes OpenGLES 3.1 CTS tests: * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16 * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC16Linear * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32F * ES31-CTS.texture_border_clamp.sampling_texture.Texture2DDC32FLinear Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-25 12:09:15 -07:00
Charmaine Lee	c4cb879f00	svga: eliminate unnecessary constant buffer updates Currently if the texture binding is changed, emit_fs_consts() is triggered to update texture scaling factor for rectangle texture or texture buffer size in the constant buffer. But the update is only relevant if the texture binding includes a rectangle texture or a texture buffer. To eliminate the unnecessary constant buffer updates due to other texture binding changes, a new flag SVGA_NEW_TEXTURE_CONSTS will be used to trigger fragment shader constant buffer update when a rectangle texture or a texture buffer is bound. With this patch, the number of constant buffer updates in Lightsmark2008 reduces from hundreds per frame to about 28 per frame. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	686cd3c606	svga: mark the texture dirty for write transfer map only Instead of unconditionally marking the texture subresource dirty at transfer map, we'll set the dirty bit for write transfer only. Tested with lightsmark2008 and glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	676931640f	svga: fix assert with PIPE_QUERY_OCCLUSION_PREDICATE for non-vgpu10 With this patch, when running in hardware version 11, we'll use SVGA3D_QUERYTYPE_OCCLUSION query type for PIPE_QUERY_OCCLUSION_PREDICATE and return TRUE if samples-passed count is greater than 0. Fixes glretrace/solidworks2012_viewport running in hardware version 11. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Charmaine Lee	d7a6c1a476	svga: minimize surface flush Currently, we always do a surface flush when we try to establish a synchronized write transfer map. But if the subresource has not been modified, we can skip the surface flush. In other words, we only need to do a surface flush if the to-be-mapped subresource has been modified in this command buffer. With this patch, lightsmark2008 shows about 15% performance improvement. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Frederic Devernay	23949cdf2c	glapi: fix _glapi_get_proc_address() for mangled function names In the dispatch table, all functions are stored without the "m" prefix. Modify code so that OSMesaGetProcAddress works both with gl and mgl prefixes. Similar to https://lists.freedesktop.org/archives/mesa-dev/2015-September/095251.html Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94994 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-25 12:59:29 -06:00
Brian Paul	63df017fda	util/blitter: use ARRAY_SIZE macro And remove local definition of Elements() macro. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-25 12:59:29 -06:00
Brian Paul	e0184b3995	svga: s/Elements/ARRAY_SIZE/ Standardize on the latter macro rather than a mix of both. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-25 12:59:29 -06:00
Brian Paul	77e4b41671	svga: whitespace and formatting fixes in svga_pipe_rasterizer.c	2016-04-25 12:59:29 -06:00
Brian Paul	25e0d3659f	svga: whitespace and formatting fixes in svga_pipe_depthstencil.c	2016-04-25 12:59:29 -06:00
Brian Paul	595fbc8dee	svga: whitespace and formatting fixes in svga_pipe_sampler.c	2016-04-25 12:59:29 -06:00
Brian Paul	1db8313168	gallium/util: initialize pipe_framebuffer_state to zeros To silence a valgrind uninitialized memory warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-25 12:59:29 -06:00
Brian Paul	1e990978ee	util/cache: add comments, fix formatting	2016-04-25 12:59:29 -06:00
Kenneth Graunke	4e2d22c5a7	i965: Mark URB reads as volatile. They can be affected by URB writes. In the upcoming scalar TCS backend, this prevents read-modify-write cycles from being broken by CSE removing reads. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-25 11:45:15 -07:00
Kenneth Graunke	501bedffa6	i965: Make a few tessellation related functions non-static. Also, move them to brw_shader.cpp so they're in a location for code used by both the vec4 and fs worlds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-25 11:44:48 -07:00
Brian Paul	464d6080c6	svga: separate HUD counters for state objects Count depth/stencil, blend, sampler, etc. state objects separately but just report the sum for the HUD. This change lets us use gdb to see the breakdown of state objects in more detail. Also, count sampler views too. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-25 09:45:16 -06:00
Robert Foss	b87856d25d	st/omx: Fix resource leak on OMX_ErrorNone Avoid leaking buffer allocated for task if an error has occurred. Coverity id: 1213929 Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 15:09:37 +01:00
Jonathan Gray	3c8f9ed9b7	isl: remove ffs function that conflicts with system headers Remove a wrapper around __builtin_ffs that conflicts with system headers on OpenBSD and perhaps elsewhere: isl_priv.h:44: error: conflicting types for 'ffs' v2: include strings.h to ensure prototype is found Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 15:06:46 +01:00
Grazvydas Ignotas	dc732a8ef2	gallium: use unreachable instead of asserts Avoids warnings in release builds. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:34 +02:00
Grazvydas Ignotas	d14778656b	anv: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:31 +02:00
Grazvydas Ignotas	ff48375a16	isl: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:28 +02:00
Grazvydas Ignotas	29d2c0e9e6	spirv: fix warning in release build Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:25 +02:00
Grazvydas Ignotas	cbb0d4ad75	gallium: fix warnings in release build Mark variables MAYBE_UNUSED to avoid unused-but-set-variable warnings in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:21 +02:00
Grazvydas Ignotas	bbeb9ab2f7	glsl: fix warning in release build Mark variable MAYBE_UNUSED to avoid unused-but-set-variable warning in release build. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:16 +02:00
Grazvydas Ignotas	e4fc06a2f8	util: add MAYBE_UNUSED for config dependent variables This is mostly for variables that are only used in asserts and cause unused-but-set-variable warnings in release builds. Could just use UNUSED directly, but MAYBE_UNUSED should be less confusing and is similar to what the Linux kernel has. And yes __attribute__((unused)) can be used on variables on both GCC 4.2 (oldest supported by mesa) and clang 3.0 (just some random old version, not sure what's the minimum for mesa). Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-25 12:23:10 +02:00
Hans de Goede	787a53988c	nouveau: codegen: combineLd/St do not combine indirect loads combineLd/St would combine, i.e. : st u32 # g[$r2+0x0] $r2 st u32 # g[$r2+0x4] $r3 into: st u64 # g[$r2+0x0] $r2d But this is only valid if r2 contains an 8 byte aligned address, which is not guaranteed for compute shaders This commit checks for src0 dim 0 not being indirect when combining loads / stores as combining indirect loads / stores may break alignment rules. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-25 11:45:07 +02:00
Rob Clark	0831eb94b9	freedreno/ir3: relax restriction in grouping Currently we were too restrictive, and would insert an output move in cases like: MOV OUT[0], IN[0].xyzw Loosen the restriction to allow the current instruction to appear in the neighbor list but only at its current position. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	36c9ea6e79	freedreno/ir3: fix small memory leak Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	610837fb98	freedreno/ir3: fix small RA bug Normally the offset in the group would be the same, but not always. For example, in a sam(w) which only writes the 4th component. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	adf795432f	freedreno/a4xx: better workaround for astc+srgb This seems like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this setting up a second texture state and using that to sample alpha separately. This way, srgb->linear conversion happens in hw prior to interpolation. This fixes 546 dEQP tests: dEQP-GLES3.functional.texture.astcsrgb* Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Rob Clark	a148300b13	Revert "freedreno/a4xx: lower srgb in shader for astc textures" Better workaround in the following patch. This reverts commit `899bd63ace`.	2016-04-24 13:40:57 -04:00
Rob Clark	19118e6f47	freedreno/a4xx: blend state no longer depends on fb state Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-24 13:40:57 -04:00
Marek Olšák	c0c6ca40a2	Revert "st/dri: add 32-bit RGBX/RGBA formats" This reverts commit `ccdcf91104`. It breaks most KDE apps, because DRI doesn't support the RGBA component ordering. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95071	2016-04-24 15:16:07 +02:00
Jonathan Gray	147a2d25ad	genxml: use PYTHON3 Allows the build to work when the python3 binary is not "python3". v2: remove x bit from the script at Emil's suggestion Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 16:45:05 -07:00
Nanley Chery	710b1d2e66	i965/tex_image: Flush certain subnormal ASTC channel values When uploading a linear, void-extent, ASTC LDR block on Skylake, we are required to flush to zero the UNORM16 channel values that would be denormalized. This is specifically required for the values: 1, 2, and 3. Fixes the 14 failing tests in: dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.* v2: Split out flushing function (Kristian Høgsberg) v3: Map with READ instead of INVALIDATE (Kenneth Graunke) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 11:35:08 -07:00
Jonathan Gray	e29b3bfd6e	configure.ac: search for and set PYTHON3 src/intel/genxml/gen_pack_header.py requires python3. v2: check for python3.5 as well Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 01:06:20 -07:00
Topi Pohjolainen	f8dd07a2c3	i965/blorp: Enable for buffer resolves Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94181 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	c7cf17ae75	i965/blorp: Enable for normal color clears Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	c4ec0121a8	i965/blorp: Fix clear code for ignoring colormask for XRGB formats on Gen9+ This is equivalent of `73b01e2711` for blorp. v2 (Ken): No need to call _mesa_format_has_color_component() now that the number of components is gotten from _mesa_base_format_component_count(). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	19948f1bf6	mesa/formats: Take luminance into account in component count Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-23 07:29:15 +03:00
Topi Pohjolainen	9e153c0692	i965/blorp: Do not trigger re-emission of base state address In case blorp needs to configure it will be just as if render or compute pipeline had configured it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:28:58 +03:00
Topi Pohjolainen	84db9ca3f7	i965/blorp: Reconfigure base state address only if needed Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	234b5f23f8	i965/blorp: Use BRW_NEW_BLORP instead of trashing all state bits Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Kenneth Graunke	6d5ce1b043	i965: Make all atoms to track BRW_NEW_BLORP by default Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	65a5af6dd0	i965: Introduce state flag for blorp In the past, BLORP has clobbered all BRW_NEW_* state flags, to trigger re-emission of the entire 3D pipeline on the next draw. However, there are some packets BLORP simply leaves alone, so there's no need to re-emit them. Trying to reduce the set of dirty bits flagged after BLORP runs is tricky. Instead, we introduce a BRW_NEW_BLORP flag. This should be set on any atom which emits a packet that BLORP also emits. When BLORP runs, it will flag BRW_NEW_BLORP, causing those packets to get re-emitted. This also makes it easy to avoid re-emitting specific atoms - we can simply drop the BRW_NEW_BLORP flag on those. To start, we assume that all packets need to be re-emitted. This is the safest approach and closest to the existing code's behavior. Many of these are obviously not required, and can be dropped in subsequent patches. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
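The dirty-flag idea in 65a5af6dd0 can be sketched roughly as below. All names here (`SKETCH_NEW_BLORP`, `SKETCH_NEW_VIEWPORT`, `struct sketch_atom`, `sketch_count_reemitted`) are hypothetical stand-ins, not the i965 definitions; the sketch only assumes that each atom carries a mask of the dirty bits it listens to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bits; the real BRW_NEW_* values are not reproduced. */
#define SKETCH_NEW_BLORP    (1u << 0)
#define SKETCH_NEW_VIEWPORT (1u << 1)

struct sketch_atom {
    const char *name;
    uint32_t dirty_bits;   /* bits that force this atom to be re-emitted */
};

/* Count the atoms that must be re-emitted for the given dirty state. */
static size_t
sketch_count_reemitted(const struct sketch_atom *atoms, size_t n,
                       uint32_t dirty)
{
    size_t count = 0;

    assert(atoms != NULL);
    for (size_t i = 0; i < n; i++) {
        if (atoms[i].dirty_bits & dirty)
            count++;
    }
    return count;
}
```

In this model, dropping `SKETCH_NEW_BLORP` from an atom's mask is all it takes to stop that atom from being re-emitted after a BLORP operation, which mirrors the motivation given in the commit message.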
Topi Pohjolainen	0e850452d1	i965/blorp/gen6: Use normal base state address setup This is identical to the blorp version which only differs in case fragment shader isn't used. In that case blorp would reset batch buffer address to zero. This is not really needed, and having blorp to use base state address setup that is compatible with normal upload allows one to skip resetting it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Topi Pohjolainen	ae73e86497	i965: Remove pointers to non-existing atoms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-23 07:09:39 +03:00
Tom Stellard	9f110a9e10	radeonsi: Implement ddx/ddy on VI using ds_bpermute The ds_bpermute instruction allows threads to transfer data directly to or from the vgprs of other threads. These instructions use the LDS hardware to transfer data, but do not read or write LDS memory. DDX BEFORE: v_mbcnt_lo_u32_b32_e64 v2, -1, 0; v_mbcnt_hi_u32_b32_e64 v2, -1, v2; v_lshlrev_b32_e32 v4, 2, v2; v_and_b32_e32 v2, 60, v2; v_lshlrev_b32_e32 v3, 2, v2; s_mov_b32 m0, -1; ds_write_b32 v4, v0; s_waitcnt lgkmcnt(0); v_or_b32_e32 v0, 1, v2; v_lshlrev_b32_e32 v0, 2, v0; ds_read_b32 v1, v3; ds_read_b32 v0, v0; s_waitcnt lgkmcnt(0) (LDS: 1 blocks). DDX AFTER: v_mbcnt_lo_u32_b32_e64 v2, -1, 0; v_mbcnt_hi_u32_b32_e64 v2, -1, v2; v_and_b32_e32 v2, 60, v2; v_lshlrev_b32_e32 v2, 2, v2; ds_bpermute_b32 v3, v2, v0; ds_bpermute_b32 v0, v2, v0 offset:4; s_waitcnt lgkmcnt(0) (LDS: 0 blocks). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:43 +00:00
Tom Stellard	128267d781	radeonsi: Use llvm.amdgcn.mbcnt.* intrinsics instead of llvm.SI.tid We're trying to move to more of the new style intrinsics which include the correct target name, and map directly to ISA instructions. v2: - Only do this with LLVM 3.8 and newer. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:43 +00:00
Tom Stellard	d3427412a3	radeonsi: Set range metadata on calls to llvm.SI.tid The range metadata tells LLVM the range of expected values for this intrinsic, so it can do some additional optimizations on the result. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:48:41 +00:00
Tom Stellard	b31422d970	radeonsi: Create a helper function for computing the thread id Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-22 23:45:34 +00:00
Nanley Chery	86cd9a134f	i965: Disable KHR_texture_compression_astc_hdr on Gen9 Although Gen9 samples from most HDR ASTC surfaces correctly, there currently are no software workarounds to fix the incorrect sampling that occurs in others with certain color endpoint modes. With this change, we are no longer failing the 14 tests from: dEQP-GLES3.functional.texture.compressed.astc.endpoint_value_hdr_cem_15.* Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 16:57:38 -07:00
Tim Rowley	ec089cd987	swr: [rasterizer memory] Constify load tiles Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:49:20 -05:00
Tim Rowley	6facf4b74a	swr: [rasterizer core] CompleteDrawContext changes for gcc Add explicit inline and non-inline versions of CompleteDrawContext to make gcc happy. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:49:04 -05:00
Tim Rowley	0487377dce	swr: [rasterizer] Small cleanups Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:56 -05:00
Tim Rowley	2c4c3c9c71	swr: [rasterizer scripts] Knob scripts tweaks Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:47 -05:00
Tim Rowley	ef293ee9c0	swr: [rasterizer] Interpolation utility functions v2: use _mm_cmpunord_ps for vIsNaN Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:38 -05:00
Tim Rowley	27cc5924ea	swr: [rasterizer core] TemplateArgUnroller Switch boolean template arguments to typename template arguments of type std::integral_constant<bool, VALUE>. This allows the template argument unroller to easily be extended to enums. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:29 -05:00
Tim Rowley	46a448d161	swr: [rasterizer core] Arena: make most allocated blocks the same size Reduces sorting cost Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:20 -05:00
Tim Rowley	794be41f91	swr: [rasterizer core] Fix global arena allocator bug - Plus some minor code refactoring Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:48:11 -05:00
Tim Rowley	e42f00ee39	swr: [rasterizer core] Fix thread binding for 32-bit windows Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:59 -05:00
Tim Rowley	cd21f90ecf	swr: [rasterizer fetch] Add support for fetching non-uniform component formats For example, R10G10B10A2_UNORM. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:48 -05:00
Tim Rowley	244ae7af1b	swr: [rasterizer core] Use CS spill/fill size in core Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:47:02 -05:00
Tim Rowley	ee9621e2f5	swr: fix memory leaks from vs/fs compilation v2: varient -> variant Reviewed-by: George Kyriazis <George.Kyriazis@intel.com>	2016-04-22 18:05:02 -05:00
Tim Rowley	5815c8b3d3	swr: fix clang warnings v2: use alternate logic version in swr_check_render_cond Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-22 18:03:41 -05:00
Rob Clark	e85bef8b12	freedreno/a4xx: fix encoding of blend color state Fixes a whole bunch of dEQP-GLES3.functional.fragment_ops.random.* (now they all pass) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-22 15:00:34 -04:00
Rob Clark	23abc41d2b	freedreno: update generated headers Pull in RB_BLEND_* fixes. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-22 15:00:34 -04:00
Eric Anholt	79b36168e0	vc4: Make sure we recompile when sample_mask changes. Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted (there is also a bug with RCL tiled blits) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 11:27:11 -07:00
Eric Anholt	876c647194	vc4: Fix validation of full res tile offset if used for non-MSAA. There's no reason we couldn't do non-MSAA full resolution tile buffer load/stores, but we would have claimed buffer overflow was being attempted. Nothing does this currently.	2016-04-22 11:27:11 -07:00
Eric Anholt	3fecaf0d0c	vc4: Only do MSAA FB operations if the FB is MSAA. I noticed this as a problem with ET:QW traces emitting coverage code when the framebuffer was supposed to be single sampled.	2016-04-22 11:27:11 -07:00
Eric Anholt	1410403e1e	vc4: Fix tests for format supported with nr_samples == 1. This was a bug from the MSAA enabling. Tests for surfaces with nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly fail out. Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 11:27:11 -07:00
Eric Anholt	6eabdb8959	vc4: Don't try to blit from MSAA surfaces with mismatched width to dst. I had made the previous blit fix non-MSAA only because I was thinking about how the hardware infers stride from the RENDERING_CONFIG packet. However, I'm also inferring the stride for both MSAA src and dst in vc4_render_cl.c from the width argument in the ioctl. Fixes 15 EXT_framebuffer_multisample piglit tests.	2016-04-22 11:27:11 -07:00
Kenneth Graunke	42dea145d9	i965: Disable channel expressions for scalar GS, TCS, TES. On Broadwell, I get the following shader-db statistics: Tessellation Control Shaders: total instructions in shared programs: 57327 -> 57012 (-0.55%) instructions in affected programs: 27334 -> 27019 (-1.15%) helped: 45 HURT: 0 total cycles in shared programs: 265692 -> 255188 (-3.95%) cycles in affected programs: 263122 -> 252618 (-3.99%) helped: 184 HURT: 26 Tessellation Evaluation Shaders: total instructions in shared programs: 23236 -> 23157 (-0.34%) instructions in affected programs: 2791 -> 2712 (-2.83%) helped: 27 HURT: 0 total cycles in shared programs: 151858 -> 149704 (-1.42%) cycles in affected programs: 151858 -> 149704 (-1.42%) helped: 101 HURT: 114 Geometry Shaders: Orbital Explorer goes from 6442 -> 6356 instructions. Two Shadow of Mordor shaders increase by a single instruction. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-22 10:26:30 -07:00
Topi Pohjolainen	1883613a24	i965/blorp: Add support for 2x msaa Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 17:02:29 +03:00
Topi Pohjolainen	125a7fdf32	i965/blorp: Add support for encoding/decoding interleaved 2x msaa Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 17:01:29 +03:00
Samuel Iglesias Gonsálvez	f70cacc4bd	i965: don't lower mod() in glsl ir NIR will lower it in nir_opt_algebraic. No change in shader-db. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 13:44:28 +02:00
Timothy Arceri	72b5d00c9c	glsl: fix cross validation for explicit locations on structs and arrays Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-22 20:59:57 +10:00
Nicolai Hähnle	39e9cf6cb1	radeonsi: implement TGSI_SEMANTIC_HELPER_INVOCATION Depends on LLVM support introduced in r267102. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 23:14:04 -05:00
Ilia Mirkin	2bac561787	swr: ignore generated files in rasterizer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-04-22 00:07:25 -04:00
Ilia Mirkin	88ca4a43a2	nvc0: fix retrieving query results into buffer for timestamps The timestamps are stored in a funny place, and even though they are a 64-bit result, are not stored with is64bit. Account for that when retrieving the query result into a resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-04-22 00:06:49 -04:00
Jason Ekstrand	541e6c0500	i965/surface_state: Use libisl functions for image format lowering This lets us delete some redundant code and keep all of the image_load_store format lowering logic in one place: libisl. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	e53cabe730	i965/fs_surface_builder: Use isl instead of mesa for format info Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	1831fa104c	i965/fs_surface_builder: Add a helper for converting GL to ISL formats Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	24bb75049b	i965/fs_surface_builder: Explicitly handle FORMAT_NONE in num_image_coordinates Previously, we were relying on has_matching_typed_format returning true for MESA_FORMAT_NONE which, in turn, relied on _mesa_get_format_bytes returning 1 for MESA_FORMAT_NONE. When we switch to ISL, this behaviour will no longer be something we can rely on. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	f310c02b94	i965/fs_surface_builder: Take a GL format enum instead of mesa_format Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	2980507a19	isl/format: Add a get_num_channels helper Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	3415cf5f2f	isl/format: Add more isl_format_has_type_channel functions Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	a4c04dd410	isl/format: Break the guts of has_[us]int_channel into a helper Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	ca8c5993bf	anv/image: Use the has_matching_typed_storage_image_format helper from isl Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	65bd8317e2	isl: Add a helper for determining when a typed load/store can be used Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	90576ac963	isl: Take a devinfo in lower_storage_image_format instead of an isl_device We want to call this function from the shader compiler and having a full isl_device available at that point isn't practical. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	37f6f21b1f	isl: Don't use designated initializers in the header C++ doesn't support designated initializers and g++ in particular doesn't handle them when the struct gets complicated, i.e. has a union. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	2785840586	isl: Include c99_compat.h We need the restrict keyword in isl.h Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Jason Ekstrand	ef5dca2034	i965: Add a dependency on libisl To avoid build issues, ensure that you're running `make' at the top level and/or you've executed `make clean' beforehand. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-21 20:44:27 -07:00
Nicolai Hähnle	fe3b1e1448	radeon: handle query buffer allocation and mapping failures Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94984 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:33:12 -05:00
Nicolai Hähnle	b222580578	radeon: wire end_query return value to sw/hw_end Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:33:07 -05:00
Nicolai Hähnle	71f33a6f69	st/mesa: check return value of begin/end_query They can only indicate out of memory conditions, since the other error conditions are caught earlier. v2: fix error message in EndQuery Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-21 22:33:03 -05:00
Nicolai Hähnle	32214e0c68	gallium: add bool return to pipe_context::end_query Even when begin_query succeeds, there can still be failures in query handling. For example for radeon, additional buffers may have to be allocated when queries span multiple command buffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 22:32:50 -05:00
Ben Widawsky	6a0d036483	i965: Always use Y-tiled buffers on SKL+ Starting with Skylake, the display engine is capable of scanning out from Y-tiled buffers. As such, we can and should use Y-tiling for better efficiency. This also has the added benefit of being able to fast clear the winsys buffer. Note that the buffer allocation done for mipmaps will already never allocate an X-tiled buffer for GEN9. This has an almost universal positive impact on benchmarks, some improving by as much as 20%. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 20:14:58 -07:00
Marek Olšák	c3b88cc2c1	softpipe: fix a warning due to an incorrect enum comparison no change in behavior, because both are defined the same Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	c9e5a7df61	gallium: remove helpers converting to/from TGSI_PROCESSOR_* Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	af249a7da9	gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_* Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	fb523cb6ad	gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_* Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-22 01:30:39 +02:00
Marek Olšák	ed23335a31	gallium: use enums in p_shader_tokens.h (v2) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1) v2: name enums	2016-04-22 01:30:36 +02:00
Marek Olšák	0135bd44c2	gallium: use enums in p_defines.h (v2) and remove number assignments which are consecutive Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Acked-by: Jose Fonseca <jfonseca@vmware.com> (v1) v2: name enums	2016-04-22 01:30:34 +02:00
Marek Olšák	8cfc4cf76d	radeonsi: remove the shader parameter from si_set_ring_buffer not used anymore this is a follow-up to the RW buffer cleanup. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-22 01:14:14 +02:00
Marek Olšák	3cbd8cfc7a	radeonsi: decrease GS copy shader user SGPRs to 2 const buffers are no longer used since the clip plane const buffer was moved to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	3acaefb1bb	radeonsi: shorten slot masks to 32 bits Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	0954d5e982	radeonsi: clean up shader resource limit definitions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	3138a28ff2	radeonsi: move default tess level constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:14 +02:00
Marek Olšák	302bec24bd	radeonsi: move sample positions constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	860b658b97	radeonsi: move clip plane constant buffer to RW buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	698821bda3	radeonsi: rework polygon stippling to use constant buffer instead of texture add it to the RW_BUFFERS descriptor array now the slot masks don't have to have 64 bits Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	bb1e647ada	radeonsi: generalize si_set_constant_buffer this will be used in the next commit Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	36261c29cd	radeonsi: make RW buffer descriptor array global, not per shader stage v2: also simplify invalidation of RW buffer bindings (squashed) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Marek Olšák	1378487fb4	radeonsi: rename and rearrange RW buffer slots - use an enum - use a unique slot number regardless of the shader stage (the per-stage slots will go away for RW buffers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-22 01:14:13 +02:00
Roland Scheidegger	4ff8cbb0d8	gallivm: fix bogus argument order to lp_build_sample_mipmap function Screwed up since `0753b135f6`. (Only an issue with different min/mag filters, and then only in some cases, which is probably why it went unnoticed for quite a while. The effect should have simply been nearest mip filter instead of linear, iff min was nearest, mag was linear, and all pixels hit the magnifying path.) Fixes a bunch of dEQP failures. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-21 23:57:24 +02:00
Kenneth Graunke	73b01e2711	i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+. In commit `cda886a485`, Neil made us stop advertising RGBX formats on Gen9+, as the hardware apparently no longer has working fast clear support for those formats. Instead, we just fall back to RGBA formats, and use SCS to override alpha to 1.0. This is fine, but had one unintended side effect: it made us fall back to slow clears when the color mask disables alpha. Normally, we ignore the color mask for non-existent channels. This includes alpha for XRGB formats as writing garbage to the X channel is harmless. But, now that we use RGBA, we think there's a real alpha channel, and can't do the optimization. To hack around this, check if _BaseFormat is GL_RGB and ignore alpha. Improves WebGL Aquarium performance on Skylake GT3e by about 50% by letting it use repclears instead of slow clears. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-21 12:01:49 -07:00
Iago Toral Quiroga	bdaa0e12a2	i965/blorp: Improve precision of blitting coordinates when clipping We do this in two steps: first we clip the dst rect and adjust the src rect accordingly. Then we do it the other way around. In both passes the adjustment part involves multiplying by a scale factor that can lead to a small precision loss. This is breaking a few dEQP tests. Specifically, the problem happens when we need to clip the same coordinate twice. For example, if srcX0 and dstX0 need both to be clipped we want to avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly but then we realize that the resulting dstX0 still needs to be clipped, so we clip dstX0 and adjust srcX0 again. Each of these two passes can lead to precision loss. What we want to do here is detect the rect that leads to the largest clip (accounting for the scale factor involved), clip that rect and adjust the other one. With this we ensure that the adjusted coordinate does not need to be clipped again and we can skip a second pass, improving precision. Fixes the following 4 dEQP tests: dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-04-21 10:43:39 -07:00
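The single-pass clipping strategy described in bdaa0e12a2 can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the blorp code: the helper names are hypothetical, only the low edge of one axis is handled, mirroring is ignored, and `scale` is assumed to mean destination pixels per source pixel.

```c
#include <assert.h>

/* Hypothetical helper: pixels by which x falls below min (0 if in bounds). */
static float
sketch_pixels_clipped_low(float x, float min)
{
    return x < min ? min - x : 0.0f;
}

/* Clip the low edge of a 1D blit in a single pass: measure both clips in
 * destination pixels, apply the larger one to the destination, and move the
 * source by the equivalent amount. */
static void
sketch_clip_low_edge(float *src0, float *dst0,
                     float src_min, float dst_min, float scale)
{
    float src_clip, dst_clip, clip;

    assert(scale > 0.0f);
    src_clip = sketch_pixels_clipped_low(*src0, src_min) * scale;
    dst_clip = sketch_pixels_clipped_low(*dst0, dst_min);
    clip = src_clip > dst_clip ? src_clip : dst_clip;

    *dst0 += clip;
    *src0 += clip / scale;
}
```

Because the larger of the two clips is applied, neither coordinate ends up below its bound afterwards, so no second clip-and-adjust pass is needed on that edge.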
Bas Nieuwenhuizen	38f4cee3ff	radeonsi: Add config parameter to si_shader_apply_scratch_relocs. shader->config is not updated for compute kernels. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-21 19:36:19 +02:00
Matt Turner	1bc983cd64	glsl: Relax GLSL 1.10 float suffix error to a warning. Float suffixes are allowed in all subsequent GLSL specifications, and it's obvious what the user meant if they specify one. Accept it with a warning to avoid breaking applications, like Planeshift (although it looks like between 0.6.1 and 0.6.3 they might have removed the suffixes from their shaders). Reviewed-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:33:08 -07:00
Matt Turner	33565d6764	i965/fs: Readd opt_drop_redundant_mov_to_flags(). This reverts commit `b449366587`. I removed the pass thinking that it was now not useful, but that was not true. I believe I ran shader-db on HSW and saw no results, but HSW does not use the unlit centroid workaround code and as a result does not emit redundant MOV_DISPATCH_TO_FLAGS instructions. On IVB, the shader-db results are: total instructions in shared programs: 6650806 -> 6646303 (-0.07%) instructions in affected programs: 106893 -> 102390 (-4.21%) helped: 793 total cycles in shared programs: 56195538 -> 56103720 (-0.16%) cycles in affected programs: 873048 -> 781230 (-10.52%) helped: 553 HURT: 209 On SNB, the shader-db results are: total instructions in shared programs: 7173074 -> 7168541 (-0.06%) instructions in affected programs: 119757 -> 115224 (-3.79%) helped: 799 total cycles in shared programs: 98128032 -> 98072938 (-0.06%) cycles in affected programs: 1437104 -> 1382010 (-3.83%) helped: 454 HURT: 237 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-21 10:32:40 -07:00
Topi Pohjolainen	0020ca3c92	i965/blorp: Do not emit pma stall on gen9+ This was left out from the original gen8 upload introduction. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 20:18:51 +03:00
Tim Rowley	81c1c481ed	swr: add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT to get_param Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-21 11:32:09 -05:00
Emil Velikov	9dcb3dfb23	i965: automake: remove gratuitous "+" during variable assignment There is no initial assignment, thus appending to it does not work. Fixes: `b27c85c4c0` "i965: add build rule for brw_nir_trig_workarounds.c" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 16:48:34 +01:00
Rob Herring	1ba203a085	gbm: add GBM_FORMAT_XBGR8888 format support Add GBM_FORMAT_XBGR8888/__DRI_IMAGE_FORMAT_XBGR8888 format support which is needed for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:56 +01:00
Rob Herring	ccdcf91104	st/dri: add 32-bit RGBX/RGBA formats Add support for 32-bit RGBX/RGBA formats which are preferred for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:53 +01:00
Rob Herring	3b69076435	dri/common: add MESA_FORMAT_R8G8B8{A8, X8}_UNORM formats as supported configs Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as these are the preferred formats for Android. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-21 14:45:21 +01:00
Rob Herring	b27c85c4c0	i965: add build rule for brw_nir_trig_workarounds.c on Android Commit `bfd17c76c1` ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a generated file brw_nir_trig_workarounds.c which broke the Android build. Add the necessary makefiles to the Android build. Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:43:26 +01:00
Rob Herring	30239ba056	glsl: android: add back missing generated glcpp include path Commit `4db8f15a25` ("glsl: move the android build scripts a level up") dropped a generated include path for glcpp. Add it back adjusting for the new location. Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:43:21 +01:00
Jonathan Gray	28e3ae344b	loader: add a libdrm case for loader_get_device_name_for_fd Use dev_node_from_fd() with HAVE_LIBDRM to provide an implementation of loader_get_device_name_for_fd() for non-linux systems that use libdrm but don't have udev or sysfs. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:41:41 +01:00
Jonathan Gray	5d09394fb1	i965/tiled_memcpy: don't unconditionally use __builtin_bswap32 Use the defines Mesa configure sets to indicate presence of the bswap32 builtins. This lets i965 work on OpenBSD again after the changes that were made in `0a5d8d9af4`. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-21 14:41:41 +01:00
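The approach in 5d09394fb1 (use the builtin only when configure reports it) might look roughly like this. The macro name `HAVE___BUILTIN_BSWAP32` is an assumption about what the configure check defines, and `sketch_bswap32` is a hypothetical name; the portable fallback is plain shift-and-mask code, not taken from the commit.

```c
#include <stdint.h>

/* Byte-swap a 32-bit value, using the compiler builtin only when the build
 * system has indicated that it is available (macro name assumed). */
static inline uint32_t
sketch_bswap32(uint32_t v)
{
#if defined(HAVE___BUILTIN_BSWAP32)
    return __builtin_bswap32(v);
#else
    return (v >> 24) |
           ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) |
           (v << 24);
#endif
}
```

Both branches produce the same result, so callers do not need to know which one was compiled in.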
Jonathan Gray	9bbf3737f9	egl/x11: authenticate before doing chipset id ioctls For systems without udev or sysfs that use drm ioctls in the loader drm authentication must take place earlier or the loader will fail "MESA-LOADER: failed to get param for i915". Patch from Mark Kettenis. Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mark Kettenis <kettenis@openbsd.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> [Emil Velikov: remove gratuitous white-space] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-21 14:40:44 +01:00
Bas Nieuwenhuizen	4abe051a3f	gallium/radeon: Silence possibly uninitialized variable warning. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 13:40:47 +02:00
Bas Nieuwenhuizen	51d1551241	winsys/amdgpu: Silence possibly uninitialized variable warning. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 13:40:42 +02:00
Bas Nieuwenhuizen	4d13c7c879	radeonsi: Enable loading into CE RAM. We need to enable a bit in the CONTEXT_CONTROL packet for the loads to work. v2: Style issues. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-21 12:50:58 +02:00
Bas Nieuwenhuizen	f45f54e14a	radeonsi: Use defines for CONTEXT_CONTROL instead of magic values. v2: Use field names provided by Nicolai. v3: Updated to use CONTEXT_CONTROL prefix. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 12:50:58 +02:00
Thomas Hindoe Paaboel Andersen	d4a21a0de0	winsys/amdgpu: fix preamble IB size The missing break caused the IB size to be overwritten with the size of IB_CONST. This was introduced in: `7201230582` Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-21 12:14:50 +02:00
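The class of bug fixed in d4a21a0de0 (a missing break letting one case overwrite the value set by the previous one) can be reproduced in a few lines. The enum, function names, and sizes below are hypothetical and chosen only for illustration; they are not taken from the amdgpu winsys.

```c
/* Hypothetical IB kinds and sizes, for illustration only. */
enum sketch_ib_type { SKETCH_IB_CONST_PREAMBLE, SKETCH_IB_CONST, SKETCH_IB_MAIN };

static unsigned
sketch_ib_size_buggy(enum sketch_ib_type type)
{
   unsigned size = 0;
   switch (type) {
   case SKETCH_IB_CONST_PREAMBLE:
      size = 4 * 1024;
      /* fall through (this is the bug: the preamble size is overwritten) */
   case SKETCH_IB_CONST:
      size = 128 * 1024;
      break;
   case SKETCH_IB_MAIN:
      size = 256 * 1024;
      break;
   }
   return size;
}

static unsigned
sketch_ib_size_fixed(enum sketch_ib_type type)
{
   unsigned size = 0;
   switch (type) {
   case SKETCH_IB_CONST_PREAMBLE:
      size = 4 * 1024;
      break; /* the added break keeps the preamble size intact */
   case SKETCH_IB_CONST:
      size = 128 * 1024;
      break;
   case SKETCH_IB_MAIN:
      size = 256 * 1024;
      break;
   }
   return size;
}
```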
Topi Pohjolainen	935ce14a44	i965/blorp: Reduce the urb size requirement for vertex buffer Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	26fdb7e51e	i965/blorp: Reduce the size of vertex buffer Previously the vertex buffer consisted of eight floats per vertex of which six were constants. These can be as easily provided by vertex fetcher as it is capable of filling vertex elements with constant one and zero. This reduces the size of the vertex buffer from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	0ae360f098	i965/blorp: Do not trigger urb re-configuration unnecessarily Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	69dfb7b2b7	i965/blorp: Skip re-emitting urb config whenever possible Otherwise clearing with blorp will regress performance in some synthetic test cases. v2: Used vsize >= 2 instead of vsize > 0, and updated the comment. Review by Ken in one of the earlier patches revealed this. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	7644e8ab68	i965/blorp: Prepare to switch from compute pipeline Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	aa322f8ae5	i965/blorp: Skip uploading state/options not needed for clears In case there is no source it means the program does a simple clear or a resolve. In such case there is no need to program sampling state or enable pixel kill in fragment shader. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	87d333f2fe	i965/blorp: Re-introduce clear programs This partially reverts `2f28a0dc23` Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	69c364f2dc	i965/meta: Move check for srgb into is_color_fast_clear_compatible() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	8a696e75d8	i965/meta: Expose check for fast clear compatibility Also add the additional render format check to the same utility. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	a848ad6806	i965/meta: Expose fast clear value setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:03 +03:00
Topi Pohjolainen	fb14a2fc78	i965/meta: Expose non-fast clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	9d79235e4e	i965/meta: Expose resolve clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	2757d723da	i965/meta: Expose fast clear rectangle calculation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	3ef957e783	i965: Declare input to mcs alignment calculation constant Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	c40b1efa70	i965/blorp: Switch the order of render and texture targets On gen8 color resolving won't work anymore if the target isn't the first entry in the binding table. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	0d062d79c3	i965/blorp: Reduce scope for generator and its inputs Generator is only needed for getting the assembly. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	4c3de6b2d6	i965/blorp: Add support for disabling color blending Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	da5a477ce4	i965/blorp: Add support for setting fast clear operation Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	7de72f728b	i965/blorp: Enable blits on gen8 v2 (Ken): Moved switch cases for gen8/9 in texel_fetch() to earlier patch adding gen8/9 sampling support. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	f7ab4e0cc4	i965/blorp: Prepare stencil sampling for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:02 +03:00
Topi Pohjolainen	708453952b	i965/blorp: Add check for supported sample numbers v2 (Ken): Fix the condition on using meta for stencil blits: use_blorp -> !use_blorp Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	9e4d19372b	i965/blorp: Add support for sampling 3D textures This patch adds additional MOV instruction for all blorp programs that use SHADER_OPCODE_TXF. Alternative is to augment blorp program key to tell if z-coordinate is needed, add condition to the blorp blit compiler and to produce a variant with and without the MOV. This seems a little overkill. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	6b33d63d77	i965/blorp: Add support for source swizzle In order to support cases where gen9 uses RGBA format to back client requested RGB, one needs to have means to force alpha channel to one when user requested RGB surface is used as blit source. v2 (Ken): Use helper for constructing the swizzle (this should be changed to use brw_get_texture_swizzle() as a follow-up). Also calculate the swizzle for CopyTexSubImage. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	52e7008a5a	i965/blorp: Pipeline upload support for gen8 v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state Add emission of pma stall. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:20:01 +03:00
Topi Pohjolainen	2fda441371	i965/gen8: Expose pma stall emission Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 10:19:30 +03:00
Topi Pohjolainen	8b2332e3d1	i965: Allow texture surface state setup to be used by blorp Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:42:10 +03:00
Topi Pohjolainen	0ad83d222b	i965/blorp: Prepare sampling for gen9 v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These were wrongly introduced in blit-enabling patch. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:41:40 +03:00
Topi Pohjolainen	328ab6c268	i965/blorp: Prepare render target write for gen8 v2 (Ken): Use payload directly instead of retyping it into vec8. Drop the implied header, it isn't used for gen6+ anyway. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:40:33 +03:00
Topi Pohjolainen	135f00e666	i965/blorp/gen6: Prepare vertex buffer setup logic for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:37:06 +03:00
Topi Pohjolainen	395abb9c3b	i965/blorp/gen7: Expose state setup applicable to gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:53 +03:00
Topi Pohjolainen	ede09e672a	i965/blorp: Use 8k chunk size for urb allocation Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks), which meant VS URB data would start at an offset of 16kB. However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the push constant region. This means that the PS push constant and VS URB data regions overlap, which can lead to corruption. v2 (Ken): Better description of the change, and do not change vs_size from 2 to 1. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:26 +03:00
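The overlap described in ede09e672a comes down to simple arithmetic on 8kB chunks. The sketch below only illustrates that arithmetic; the helper names are hypothetical and it does not claim to show how the driver actually computes the start address.

```c
/* URB allocation granularity named in the commit message: 8kB chunks. */
#define SKETCH_URB_CHUNK_BYTES (8u * 1024u)

/* Nonzero if VS URB data starting at vs_start_chunks would begin inside
 * the push constant region of push_constant_bytes. Hypothetical helper. */
static int
sketch_urb_overlaps(unsigned push_constant_bytes, unsigned vs_start_chunks)
{
   return vs_start_chunks * SKETCH_URB_CHUNK_BYTES < push_constant_bytes;
}

/* First chunk index that lies entirely past the push constant region. */
static unsigned
sketch_urb_first_free_chunk(unsigned push_constant_bytes)
{
   return (push_constant_bytes + SKETCH_URB_CHUNK_BYTES - 1) /
          SKETCH_URB_CHUNK_BYTES;
}
```

With a hardcoded start of 2 chunks (16kB) and a 32kB push constant region the first helper reports an overlap, while a 16kB region does not overlap.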
Topi Pohjolainen	e04b3cdf33	i965/blorp/gen7: Prepare re-using for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:14 +03:00
Topi Pohjolainen	f1ddfa8512	i965/blorp: Let compiler calculate the vertex buffer size Currently the size is sizeof(float) times too large. One reserves GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands for the size of vertex buffer in bytes. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:58 +03:00
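The sizing mistake described in f1ddfa8512 (using a byte count as an element count for a float array) can be sketched as below. SKETCH_VBO_SIZE_BYTES and both functions are hypothetical stand-ins, and the 3 x 8 layout is borrowed from the later commit 26fdb7e51e as an assumption about the data at this point in history.

```c
#include <stddef.h>

/* Hypothetical byte count: 3 vertices * 8 floats * 4 bytes. */
#define SKETCH_VBO_SIZE_BYTES 96

/* Buggy pattern: the byte count is used as the number of floats, so the
 * array ends up sizeof(float) times larger than intended. */
static size_t
sketch_vbo_bytes_oversized(void)
{
   float data[SKETCH_VBO_SIZE_BYTES];
   return sizeof(data);
}

/* Letting the compiler compute the size from the array type instead. */
static size_t
sketch_vbo_bytes_from_type(void)
{
   static const float vertices[3][8] = { { 0.0f } };
   return sizeof(vertices);
}
```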
Topi Pohjolainen	4c526370ca	i965/gen8: Expose state base address setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:45 +03:00
Topi Pohjolainen	9949103756	i965/gen8: Expose surface state helpers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:34 +03:00
Topi Pohjolainen	4f1d9f2879	i965/gen9: Use correct size for DS_STATE Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:32:12 +03:00
Roland Scheidegger	0295db2a8b	glsl: add forgotten textureOffset function for sampler2DArrayShadow This was part of EXT_gpu_shader4 - as such it should have been supported by glsl 130. It was however forgotten, and not added until glsl 430 - with the wrong syntax no less (glsl 430 mentions it was overlooked). glsl 440 (but revision 8 only) fixed this finally for good. At least nvidia supports this with just glsl version 1.30 as well (the spec doesn't explicitly say it should be supported retroactively), so just add this to the other glsl 130 textureOffset functions. Passes a (hacked) piglit tex-miplevel-selection test (2DArrayShadow textureOffset -auto) with llvmpipe. v2: fix up comment (by Ian), add testing to commit message. Reviewed-by: Dave Airlie <airlied@gmail.com>	2016-04-21 02:38:46 +02:00
Kenneth Graunke	d8c8f4203f	i965: Fix interpolateAtSample() on single sampled buffers. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	447d3eec6a	i965: Fix gl_SampleMaskIn[] in per-sample shading mode. The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation. Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8} sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
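The masking step described in 447d3eec6a can be written out as plain bit arithmetic. This is a sketch of the idea only, not the code the i965 backend emits; the function name and parameters are hypothetical.

```c
#include <stdint.h>

/* coverage_mask:      samples covered by the primitive for this pixel.
 * invocation_samples: samples handled by the current fragment shader
 *                     invocation (e.g. 1u << sample_id when each
 *                     invocation processes one sample).
 * per_sample:         nonzero when per-sample shading is active.
 * All names here are hypothetical. */
static uint32_t
sketch_sample_mask_in(uint32_t coverage_mask,
                      uint32_t invocation_samples,
                      int per_sample)
{
   if (!per_sample)
      return coverage_mask;

   return coverage_mask & invocation_samples;
}
```

For a fully covered 4x pixel (coverage 0xF), an invocation shading only sample 2 would see 0x4 rather than 0xF.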
Kenneth Graunke	66a725570c	i965: Only enable oMask output when there's a multisample FBO. The ARB_sample_shading specification says that setting gl_SampleMask bits to 0 means that the corresponding sample "should be considered uncovered for the purposes of multisample fragment operations (Section 4.1.3)." The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment Operations") specifies: "No changes to the fragment alpha or coverage values are made at this step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS is not one." oMask output alters coverage masks and can kill pixels. We need to disable it in the above case, which conveniently corresponds to key->multisample_fbo being false. Khronos bug #12188 also spells this out clearly: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188 Fixes two Piglit tests: tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0 tests/spec/arb_sample_shading/builtin-gl-sample-mask 0 Fixes 21 ES3 conformance tests: ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0 
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7 Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask.discard_half_per_pixel.default_framebuffer sample_mask.discard_half_per_pixel.singlesample_rbo sample_mask.discard_half_per_pixel.singlesample_texture sample_mask.discard_half_per_sample.default_framebuffer sample_mask.discard_half_per_sample.singlesample_rbo sample_mask.discard_half_per_sample.singlesample_texture sample_mask.discard_half_per_two_samples.default_framebuffer sample_mask.discard_half_per_two_samples.singlesample_rbo sample_mask.discard_half_per_two_samples.singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	81407531e0	i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo. I'm going to need a key entry meaning "we have a multisample FBO, and multisampling is enabled" in an upcoming patch. This is basically wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID system value is read. The only use of wm_key->compute_sample_id is in emit_sampleid_setup(), which is only called when handling the SAMPLE_ID system value. So we can just eliminate the check and generalize the field. v2: Also update the Vulkan driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	de0a46a040	i965: Delete now dead persample_2x FS program key flag. This was only used by the old gl_SampleID calculations. The new code doesn't need to handle 2x specially. v2: Delete it from the Vulkan driver, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	57118a19da	i965: Simplify gl_SampleID setup on Gen8+. On Gen7+, the thread payload provides the sample ID - we can read it in two instructions, without any elaborate calculations. We don't even need a state dependency - this will properly produce zero in the non-MSAA case. Unfortunately, we need the state flag anyway, so we may as well continue to use it to produce a single MOV 0 instead of SHR/AND. For some reason, the sample ID field is always zero on Gen7/7.5, so we can't use this yet. However, it works fine on Gen8+. So, land the code and use it where it's working, and leave a TODO for later. v2: Fix register types in the comment (caught by Matt Turner!). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	528255b0b1	i965: Flip key->compute_sample_id check. This just moves the simple case first. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Bas Nieuwenhuizen	43ed1f73f8	st/mesa: Use correct size for compute CAPs. Some CAPs are stored as 64-bit value while Mesa stores the related constant as 32-bit value. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-21 00:27:01 +02:00
Kenneth Graunke	60a17d0718	i965: Properly handle integer types in opt_vector_float(). Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination. This meant that we could mess up sequences of integer loads, such as: mov vgrf6.0.x:D, 0D mov vgrf6.0.y:D, 1D mov vgrf6.0.z:D, 2D mov vgrf6.0.w:D, 3D Here, integer 0/1/2/3 become approximately 0.0f, so we generated: mov vgrf6.0:F, [0F, 0F, 0F, 0F] which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV: mov vgrf6.0:D, [0F, 1F, 2F, 3F] To do this, first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV. Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug. Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell: total instructions in shared programs: 7084195 -> 7082191 (-0.03%) instructions in affected programs: 246027 -> 244023 (-0.81%) helped: 1937 total cycles in shared programs: 65669642 -> 65651968 (-0.03%) cycles in affected programs: 531064 -> 513390 (-3.33%) helped: 1177 v2: Handle the type of zero better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
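The difference between bitcasting and converting that 60a17d0718 relies on can be checked with a short sketch. It assumes 32-bit IEEE 754 floats, and the helper names are hypothetical; it is not the optimizer code itself.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the integer's bits as a float (what treating a D-type
 * immediate as F effectively does). Assumes 32-bit IEEE 754 floats. */
static float
sketch_bitcast_to_float(int32_t i)
{
   float f;
   memcpy(&f, &i, sizeof(f));
   return f;
}

/* Numerically convert the integer to float (what the fix relies on). */
static float
sketch_convert_to_float(int32_t i)
{
   return (float)i;
}
```

Small integers such as 1, 2 and 3 bitcast to denormals that are approximately 0.0f, whereas 0x3f800000 bitcasts to exactly 1.0f, which is why a sequence like MOV 0D, MOV 0x3f800000D is better served by an F-type MOV.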
Kenneth Graunke	1aa28f3509	i965: Make opt_vector_float() only handle non-type-conversion MOVs. We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer). As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	2a25a5142b	i965: Fold vectorize_mov() back into the one caller. After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	9967561158	i965: Rework opt_vector_float() control flow. This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF. v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Jason Ekstrand	50018522d2	anv: s/anv_batch_emit_blk/anv_batch_emit/ Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	0a45395902	anv: Remove the old emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	86c52bc757	anv/gen7_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	744e133431	anv/gen7_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	cae2f14947	anv/device: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	932c353592	anv/state: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	9e9f3f4e71	anv/gen8_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	dba3727bea	anv/genX_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a48f8340d9	anv/gen8_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	8a6ced83e9	anv/cmd_buffer: Use the new emit macro for queries Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	db25e1eec5	anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	deb13870d8	anv/cmd_buffer: Use the new emit macro for compute shader dispatch Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	06fc7fa684	anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a71ded0e18	anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	56453eeaff	anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESS Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	1d4d6852b4	anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	64ad2d3bcd	anv: Add a new block-based batch emit macro This new macro uses a for loop to create an actual code block in which to place the macro setup code. One advantage of this is that you syntactically use braces instead of parentheses. Another is that the code in the block doesn't even get executed if anv_batch_emit_dwords fails. Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
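The for-loop technique described in 64ad2d3bcd can be sketched with a toy batch. Everything prefixed sketch_ is hypothetical and far simpler than the anv code; the point is only how a single-iteration for loop gives the caller a braces-delimited block whose body is skipped when the allocation fails.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy batch with room for 16 dwords (hypothetical). */
struct sketch_batch { uint32_t dwords[16]; size_t used; };
/* Toy two-dword command (hypothetical). */
struct sketch_cmd { uint32_t opcode; uint32_t value; };

/* Reserve n dwords, or return NULL if the batch is full. */
static void *
sketch_emit_dwords(struct sketch_batch *b, size_t n)
{
   if (b->used + n > 16)
      return NULL;
   void *dst = &b->dwords[b->used];
   b->used += n;
   return dst;
}

/* Serialize the command struct into the reserved dwords. */
static void
sketch_pack(void *dst, const struct sketch_cmd *cmd)
{
   uint32_t *dw = dst;
   dw[0] = cmd->opcode;
   dw[1] = cmd->value;
}

/* The loop body runs once if space was reserved and not at all otherwise;
 * the "increment" expression packs the struct and ends the loop. */
#define sketch_emit(batch, name)                                   \
   for (struct sketch_cmd name = { .opcode = 0x42 },               \
        *_dst = sketch_emit_dwords((batch), 2);                    \
        _dst != NULL;                                              \
        sketch_pack(_dst, &name), _dst = NULL)

/* Usage: returns 1 if the block body ran, 0 if the batch was full. */
static int
sketch_example(struct sketch_batch *b, uint32_t v)
{
   int ran = 0;
   sketch_emit(b, cmd) {
      cmd.value = v;
      ran = 1;
   }
   return ran;
}
```

Declaring both the command struct and the destination pointer in the loop initializer keeps them scoped to the block, which is what allows the call site to read like a statement with braces.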
Samuel Pitoiset	d30768025a	gk110/ir: make use of IMUL32I for all immediates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:36 +02:00
Samuel Pitoiset	17a37c78fc	gk110/ir: do not overwrite def value with zero for EXCH ops This is only valid for other atomic operations (including CAS). This fixes an invalid opcode error from dmesg. While we are at it, make sure to initialize global addr to 0 for other atomic operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:33 +02:00
Marcin Ślusarz	3caf2e89aa	anv: fix build without Wayland platform Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 11:12:10 -07:00
Laurent Carlier	6c952d8ac7	anv: fix building on i686 with -mcpu=generic mcpu=generic doesn't enable sse2, and anvil definitely needs it Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:48:11 -07:00
Jason Ekstrand	2ef7aef322	spirv: Trivially handle the NonWriteable decoration Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:33:23 -07:00
Connor Abbott	b6dc940ec2	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 09:47:05 -07:00
Samuel Pitoiset	7143068296	nvc0: avoid tex read fault from compute shaders on GK110 After some investigation, it seems that disabling the UNK02C4 command avoids a read fault with texelFetch() from a compute shader. I have no clue what this method actually does, but this keeps the GPU from hanging with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-20 18:28:47 +02:00
Jason Ekstrand	87a4fb516e	i965/vec4: Always split uniforms in array_access_to_pull_constants Normally, we split uniforms at the end but in Vulkan, we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it. I really don't like this patch not because it doesn't work (it does) but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	b3f43822c7	i965/vec4: Use the correct offset for the swizzle shift in push constants This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	9f16e170fe	i965/vec4: Use nir_intrinsic_base in the load_uniform implementation We shouldn't be reading the const_index directly Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	f63a95080f	anv/apply_dynamic_offsets: Provide a range on the load_uniform Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:14:58 -07:00
Jason Ekstrand	35b758c378	anv/lower_push_constants: Stop treating scalar specially All of the code that did something special based on vec4 vs. scalar is bogus. In the backend, everything is now in units of bytes and the vec4 backend can handle full std140 packing so we don't need to do anything special anymore. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998	2016-04-20 09:14:47 -07:00
Tim Rowley	3bbe8a09ea	swr: fix resource backed constant buffers Code was using an incorrect address for the base pointer. v2: use swr_resource_data() utility function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Tested-by: Markus Wick <markus@selfnet.de>	2016-04-20 09:57:55 -05:00
Hans de Goede	2ac2ecdd6c	nouveau: codegen: Add support for OpenCL global memory buffers Add support for OpenCL global memory buffers, note this has only been tested with regular load and stores and likely needs more work for e.g. atomic ops. Tested with piglit on a gf119 and a gk107: ./piglit run -o shader -t '.arb_shader_storage_buffer_object.' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.arb_compute_shader.' results/shader [20/20] skip: 4, pass: 16 \| Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-20 13:46:03 +02:00
Hans de Goede	61d52a5fb9	nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commit changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL register file. Tested with piglit on a gf119 and a gk107: ./piglit run -o shader -t '.arb_shader_storage_buffer_object.' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.arb_compute_shader.' results/shader [20/20] skip: 4, pass: 16 \| Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-20 13:46:03 +02:00
Jose Fonseca	f02f4d09ce	scons: Build dri_common_interop.c.	2016-04-20 12:41:24 +01:00
Marek Olšák	4fa3d35cc5	st/dri: implement the GL interop DRI extension (v2.2) v2: - set interop_version - simplify the offset_after macro v2.1: - use version numbers, remove offset_after - set "out_driver_data_written" v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too - add whandle.offset to buf_offset - disable the minmax cache for GL_TEXTURE_BUFFER	2016-04-20 12:18:47 +02:00
Marek Olšák	37d3a26bd6	glx: implement GLX part of interop interface (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	b6eda70843	egl: implement EGL part of interop interface (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	5e9ed261ed	dri_interface: add interface for GL interop with other APIs (v2) v2: - use const	2016-04-20 12:18:47 +02:00
Marek Olšák	6eeb729490	include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v4.2) v2: - use "enum" to define stuff v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED v4: - add mesa_glinterop_device_info::interop_version - more comments - remove #define MESA_GLINTEROP_VERSION - use const for "in" v4.1: - use version numbers for structures - add "out_driver_data_written" v4.2: - buf_offset & buf_size affect GL_ARRAY_BUFFER too, this is required for sharing suballocations within a larger buffer	2016-04-20 12:15:41 +02:00
Nicolas Dufresne	8093990ef4	st/dri: Fix RGB565 EGLImage creation When creating EGL images we do a bytes to pixel conversion by dividing by 4 regardless of the pixel format. This does not work for RGB565. In this patch, we avoid useless conversion and use proper API when the conversion cannot be avoided. Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-20 17:55:30 +09:00
Nicolas Dufresne	4463f38766	st/dri: Factor out DRI2 to PIPE_FORMAT conversion This code is already duplicated twice and will be useful again. This will also help when adding formats. Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-20 17:34:03 +09:00
Rob Clark	899bd63ace	freedreno/a4xx: lower srgb in shader for astc textures This seems like a hw bug, and maybe only applies to certain a4xx variants/revisions. But setting the SRGB bit in sampler view state (texconst0) causes invalid alpha for ASTC textures. Work around this by doing the srgb->linear conversion in the shader instead. This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.astcsrgb* (The remaining fails seem to be a bug w/ ASTC + linear filtering, also possibly a420.0 specific.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 17:14:04 -04:00
Rob Clark	eddfc97709	nir/lower-tex: add srgb->linear lowering Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-19 17:13:50 -04:00
Rob Clark	eb00a0fc58	nir/builder: const'ify swiz param No need for it not to be const, and lets caller declare it const if desired. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-19 17:13:36 -04:00
Rob Clark	52ccc6349f	nir/lower-tex: make options a local var Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:12:49 -04:00
Rob Clark	d4ff42bd0a	freedreno: cleanup fd_set_sampler_views The separate FS/VS entrypoints are no longer used since `a3ed98f`. So just inline them. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:11:47 -04:00
Russell King	fadfaa82c6	tgsi/lowering: improved lowering for LRP Provide an improved lowering for LRP, which can be implemented in two MAD instructions with a bit of rearranging of the equation, rather than the literal implementation of two multiplies, an add and a subtract. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	67da7dd98a	tgsi/lowering: improved lowering for XPD Improve XPD lowering to consume fewer instructions by using the MAD instruction to perform the multiply and subtraction together. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	65460cf4c8	tgsi/lowering: add support for lowering TRUNC Add support for lowering TRUNC using the following sequence: FRC tmpA, \|src\| SUB tmpA, \|src\|, tmpA CMP dst, -tmpA, tmpA Note that this is incompatible with FRC lowering. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Russell King	23e870a888	tgsi/lowering: add support for lowering FLR and CEIL Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD instructions for GPUs that support FRC but not FLR or CEIL. Since these use FRC, it is invalid to ask for FLR or CEIL to be lowered along with FRC, so add an assert to catch this invalid configuration. We also need to deal with FLR instructions emitted by the lowering code. Fix these up with the FRC+SUB equivalent when FLR lowering is enabled. Signed-off-by: Russell King <rmk@arm.linux.org.uk> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-19 16:04:44 -04:00
Bas Nieuwenhuizen	464cef5b06	radeonsi: enable TGSI support cap for compute shaders v2: Use chip_class instead of family. v3: Check kernel version for SI. v4: Preemptively allow amdgpu winsys for SI. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	1f32d5d59f	radeonsi: Consider input SGPR count for compute shader SGPR count. si_shader_create corrects the SGPR count with si_fix_num_sgprs. We then recompute the rsrc1 register to use the new SGPR count. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	6c833ba1ab	radeonsi: Add CE synchronization for compute dispatches. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:31:23 +02:00
Bas Nieuwenhuizen	e0b729c544	mesa/st: enable compute shaders if images are also supported v2: Also depend on atomic counters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen	41d79bcbfa	radeonsi: clean up compute flush Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:32 +02:00
Bas Nieuwenhuizen	7a92c08428	radeonsi: do not do two full flushes on every compute dispatch v2: Add more CS_PARTIAL_FLUSH events. Essentially every place with waits on finishing for pixel shaders also has a write after read hazard with compute shaders. Invalidating L2 waits implicitly on pixel and compute shaders, so, we don't need a CS_PARTIAL_FLUSH for switching FBO. v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2. According to Marek the INV_GLOBAL_L2 events don't wait for compute shaders to finish, so wait for them explicitly. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	e764ee13ae	radeonsi: split setting graphics and compute descriptors Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	061ce9399a	radeonsi: split texture decompression for compute shaders Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	e56514f631	radeonsi: update predicate condition for compute dispatches Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	c3083d841e	radeonsi: implement TGSI compute dispatch v2: - Use radeon_set_sh_reg_seq. - Set predicate bit for conditional rendering. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	1349dd16ff	radeonsi: only emit compute shader state when switching shaders v2: - Do check if anything changed earlier - Use emitted_program instead of emitted_bo to prevent shaders with shader->bo = NULL confusing the check - Use radeon_set_sh_reg* Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	ba1f66a73d	radeonsi: rework compute scratch buffer Instead of having a scratch buffer per program, have one per context. Also removed the per kernel wave count calculations, but that only helped if the total number of waves in the dispatch was smaller than sctx->scratch_waves. v2: Fix style issue. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	107f4d3538	radeonsi: do per cs setup for compute shaders once per cs Also removes PKT3_CONTEXT_CONTROL as that is already being done by si_begin_new_cs, when emitting init_config. v2: - Use radeon_set_sh_reg_seq. - Also set COMPUTE_STATIC_THREAD_MGMT_SE2 / SE3 for CIK+ Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	52d3584dec	radeonsi: don't pass scratch buffer to user SGPRs As far as I can see we use relocations for clover too. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	422a19f76f	radeonsi: split input upload off from si_launch_grid Also uses a dynamically allocated buffer using u_upload_alloc. The old buffer per program approach required serializing all dispatches of the same program. v2: - Clarified commit message. - Use radeon_set_sh_reg_seq. - Also upload input buffer for clover kernels, even when input_size is 0, as it contains grid parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	898298efc9	radeonsi: implement TGSI compute shader creation v2: Moved scratch_enabled initialization after compile. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	85fd7817ee	radeonsi: update shader count for compute shaders Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	da88c2a8e8	radeonsi: set maximum work group size based on block size Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	b082147b78	radeonsi: implement shared atomics v2: - Use single region - Use get_memory_ptr Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	8acf3e501b	radeonsi: implement shared memory load/store v2: - Use single region - Combine address calculation Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:31 +02:00
Bas Nieuwenhuizen	84a6761ae3	radeonsi: add shared memory Declares the shared memory as a global variable so that LLVM is aware of it and it does not conflict with passes like AMDGPUPromoteAlloca. v2: - Use ctx->i8. - Dropped null-check for declare_memory_region. - Changed memory region array to single region. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	753a3e472b	radeonsi: lower compute shader arguments Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	008d977d01	radeonsi: Use CE for all descriptors. v2: Load previous list for new CS instead of re-emitting all descriptors. v3: Do radeon_add_to_buffer_list in si_ce_upload. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	0b6c463dac	gallium/util: Add u_bit_scan_consecutive_range64. For use by radeonsi. v2: Make sure that it works for all 64 bits set. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	058b54c624	radeonsi: Replace list_dirty with a mask. We can then upload only the dirty ones with the constant engine. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	aabc7d61d6	radeonsi: Add CE uploader. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	0d7ddd6819	radeonsi: Allocate chunks of CE ram. v2: Use 32 byte alignment. v3: Don't allocate CE space for vertex buffer descriptors. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	86c71ff989	radeonsi: Add CE synchronization. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	fe1ef23b66	radeonsi: Add CE packet definitions. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	8fee75d606	radeonsi: Create CE IB. Based on work by Marek Olšák. v2: Add preamble IB. Leaves the load packet in the space calculation as the radeon winsys might not be able to support a preamble. The added space calculation may look expensive, but is converted to a constant with (at least) -O2 and -O3. v3: - Fix code style. - Remove needed space for vertex buffer descriptors. - Fail when the preamble cannot be created. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	7201230582	winsys/amdgpu: Enlarge const IB size. Necessary to prevent performance regressions due to extra flushing. Probably should enlarge it even further when also updating uniforms through the CE, but this seems large enough for now. v2: Add preamble IB. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Marek Olšák	7997b5f005	winsys/amdgpu: Add support for const IB. v2: Use the correct IB to update request (Bas Nieuwenhuizen) v3: Add preamble IB. (Bas Nieuwenhuizen) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Marek Olšák	e78170f388	winsys/amdgpu: split IB data into a new structure in preparation for CE Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Marek Olšák	f4b77c764a	gallium/radeon: move ring_type into winsyses Not used by drivers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Jose Fonseca	1d2ac7a7ca	llvmpipe: Call LLVMShutdown before exiting. So that LLVM frees its globals. Trivial.	2016-04-19 12:10:09 +01:00
Jose Fonseca	524042fa35	llvmpipe: Avoid LLVMGetGlobalContext in tests. Trivial.	2016-04-19 12:10:02 +01:00
Jose Fonseca	bb9e8c5090	llvmpipe: Skip false exp2 failure in lp_test_arit due to buggy MSVCRT. 64bits MSVCRT's exp2f(-inf) returns -inf instead of 0. Tested with MSVC 2013's CRT. (I haven't tried 2015 yet.) Also this does not happen with MinGW. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:53 +01:00
Jose Fonseca	ee9876be1d	llvmpipe: Test more vector lengths. All powers of two up to the native vector length. There is actually a bug in lp_build_round for v2, whereby it doesn't round to nearest. Fixing is left to the future, but the test is now able to expect it to fail. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:44 +01:00
Jose Fonseca	932b71f17d	gallivm: Avoid llvm::sys::getProcessTriple(). Just use LLVM_HOST_TRIPLE, which is available at least from LLVM 3.3 onwards, and is pretty much what llvm::sys::getProcessTriple() does anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:37 +01:00
Jose Fonseca	b5ca689cee	gallivm: Remove lp_get_module_id. Just keep a copy of the module_name in gallivm. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:26 +01:00
Jose Fonseca	969ba8bfa7	gallivm: Fix MCJIT with LLVM 3.3. One needs to call setJITMemoryManager for LLVM 3.3, instead of setMCJITMemoryManager. This regressed in commits 065256df/75ad4fe7 when trying to make the code build with LLVM 3.6. Tested MCJIT with LLVM 3.3 to 3.6. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:17 +01:00
Jose Fonseca	cf4105740f	gallivm: Make MCJIT a runtime option. On the LLVM versions that support it, so we can easily switch between MCJIT/old-jit for testing. The new option is GALLIVM_MCJIT. Unfortunately setting GALLIVM_MCJIT=1 for LLVM 3.3 or 3.4 causes segfault, both on Linux and Windows. I'm almost certain this used to work, so there probably is a regression somewhere. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:14 +01:00
Jose Fonseca	7d2151b6ea	scons: Show the unit test full path. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:11 +01:00
Jose Fonseca	2211f8d559	gallivm: Use LLVMSetTarget. Instead of LLVM C++ interfaces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:31:00 +01:00
Jose Fonseca	9aa23b11e4	gallivm: Use LLVMPrintValueToString where available. And llvm::raw_string_ostream where not (LLVM 3.3). Thereby eliminating yet another dependency on unstable LLVM interfaces. As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which was disabled, probably due to C++ issues.) Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:28:37 +01:00
Jose Fonseca	f6621cd3be	gallium/tests: Update UTIL_FORMAT_MAX_* defines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-19 11:28:16 +01:00
Jose Fonseca	121a0cedc8	Revert "nv50/ra: `isinf()` is in namespace `std` since C++11." This reverts commit `f525db6358`. It was superseded by commit `649704f1f7`.	2016-04-19 11:22:45 +01:00
Eric Anholt	802b9292aa	vc4: Fix fbo-generatemipmap-formats for NPOT. Single-sampled texture miplevels > 1 are stored in POT-aligned areas, but we only get one value to control the stride of the src and dst for single sampled buffers. A RCL tile blit from level != 1 to level == 0 would therefore load from the wrong stride.	2016-04-18 16:55:36 -07:00
Eric Anholt	2402bb6095	vc4: Remove unused "immediates" field This was for TGSI, which we no longer have to deal with.	2016-04-18 16:48:45 -07:00
Ben Widawsky	2408899cb2	i965: Define miptree map functions static (trivial) They were already declared as such. It was changed here: commit `31f0967fb5` Author: Ian Romanick <ian.d.romanick@intel.com> Date: Wed Sep 2 14:43:18 2015 -0700 i965: Make intel_miptree_map_raw static Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-18 16:12:13 -07:00
Matt Turner	b1d9353cb5	glsl: Properly handle ldexp(0.0f, non-zero-exp).	2016-04-18 15:48:54 -07:00
Dave Airlie	3a26ef23e7	gallivm: convert size query to using a set of parameters. This isn't currently that easy to expand, so fix it up before expanding it later to include dynamic samplers. [airlied: use some local variables (Roland)] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-19 07:33:39 +10:00
Tim Rowley	3227c10270	swr: dereference cbuf/zbuf/views on context destroy Fixes resource memory leaks. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-18 15:52:26 -05:00
Rob Clark	77a9107bf2	freedreno/ir3: fix grouping issue w/ reverse swizzles When we have something like: MOV OUT[n], IN[m].wzyx the existing grouping code was missing a potential conflict. Due to input needing to be sequential scalar regs, we have: IN: x <-> y <-> z <-> w which would be grouped to: OUT: w <-> z2 <-> y2 <-> x (where the 2 denotes a copy/mov) but that can't actually work. We need to realize that x and w are already in the same chain, not just that they aren't both already in new chain being built. With this fixed, we probably no longer need the hack from `f68f6c0`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-18 15:41:32 -04:00
Marek Olšák	ed66c75784	radeonsi: use enums in si_shader.h Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	0c52caf7b7	gallium/radeon: use enums in r600_query.h Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	dd9ca77cb9	radeonsi: always use PFP_SYNC_ME when doing flushes and waits This is typically used by the closed driver before SURFACE_SYNC. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	1db5678688	radeonsi: don't do VS/PS partial flushes if SURFACE_SYNC waits too Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	58494b42b5	radeonsi: add safety assertions for meta cache flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	78f58a4e6f	radeonsi: don't use ACQUIRE_MEM on the graphics ring It's only required on the compute ring. This matches the closed driver. The compute flag is removed to prevent confusion and Bas's compute shader patches remove it in the whole function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	3faecdd4e1	radeonsi: remove TODO and correct a comment in si_emit_cache_flush Yes, that flag is really needed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:25 +02:00
Marek Olšák	28c2573b4f	radeonsi: don't flush CB/DB caches for performance counters I'm not sure about this. This will make the engines go idle, but the caches will be unflushed. This should match app behavior without performance counters, which can be a good thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	97c328b2a3	gallium/radeon: don't flush CB/DB caches for timestamp queries Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	6dc21b1962	gallium/util: fix undefined shift to the last bit in u_bit_scan Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-18 19:51:24 +02:00
Marek Olšák	9434aa8103	gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff The second ffs returns 0, yielding count == -1. v2: change 1 to 1u Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-18 19:51:24 +02:00
Marek Olšák	e50e1f86b0	gallium/radeon: fix Nine with its slightly shifted viewports just need to do the calculation in floating-point and then round things properly Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-04-18 19:51:24 +02:00
Erik Faye-Lund	ee5b35142a	docs: correct name for GL_OES_primitive_bounding_box When this extension was added, an underscore was mistakenly replaced by a space. Let's correct this, so it's a tad easier to grep for this extension. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>	2016-04-18 10:48:57 -07:00
Kenneth Graunke	c092f9b96a	meta: Don't botch color masks when changing drawbuffers. Color clears should respect each drawbuffer's color mask state. Previously, we tried to leave the color mask untouched. However, _mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the color drawbuffers in a different order, so we ended up pairing drawbuffers with the wrong color mask state. The new _mesa_meta_drawbuffers_and_colormask() function does the same job as the old _mesa_meta_drawbuffers_from_bitfield(), but also rearranges the color mask state to match the new drawbuffer configuration. This code was largely ripped off from Gallium's st_Clear code. This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds up to 8 drawbuffers, sets color masks for each, and then calls glClearBufferfv to clear each buffer individually. ClearBuffer causes us to rebind only one drawbuffer, at which point we used ctx->Color.ColorMask[0] (draw buffer 0's state) for everything. We could probably delete _mesa_meta_drawbuffers_from_bitfield(), but I'd rather not think about the i965 fast clear code. Topi is rewriting a bunch of that soon anyway, so let's delete it then. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-18 10:39:31 -07:00
Kenneth Graunke	a33f94ba8c	meta: Don't smash ColorMask when using MESA_META_COLOR_MASK save bit. This allows meta operations to inspect the existing color mask, and then do their own smashing. BlitFramebuffer and Clear already override the color mask, so this was also redundant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-18 10:39:26 -07:00
Eric Anholt	48fe53bbb9	vc4: Add support for rendering to cube map surfaces. We need to fix up the offset to point at the face of the cube. Fixes piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-18 10:10:44 -07:00
Eric Anholt	21a9ed6207	vc4: Don't flush on read-only access of buffers read by the CL. Fixes piglit mixed-immediate-and-vbo, and may significantly improve performance of applications that store a 4-byte IB in the same VBO as vertex data.	2016-04-18 10:10:44 -07:00
Eric Anholt	9e8a8b0c8b	vc4: Sanity check that flushes don't happen between state emit and draw. Catches the cause of failure in arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of failure before, and it probably won't be the last time.	2016-04-18 10:10:44 -07:00
Eric Anholt	56b14adf85	vc4: Sanity check strides for imported BOs. If we're going to sample from or render to them at some particular size, we'd better make sure that they actually are that size. Causes some tests under simulation to generate appropriate error messages instead of failures.	2016-04-18 10:10:44 -07:00
Pierre Moreau	649704f1f7	math: Import isinf and others to global namespace Starting from C++11, several math functions, like isinf, moved into the std namespace. Since cmath undefines those functions before redefining them inside the namespace, and glibc 2.23 defines the C variants as macros, the C variants in global namespace are not accessible any longer. v2: Move the fix outside of Nouveau, as suggested by Jose Fonseca, since anyone might need it when GCC switches to C++14 by default with GCC 6.0. v3: * Put the code directly inside c99_math.h rather than creating a new header file, as asked by Jose Fonseca; * Guard the code behind glibc version checks, as only glibc >= 2.23 defines isinf & co. as functions, as suggested by Jose Fonseca. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-18 11:10:25 +01:00
Oded Gabbay	d3c98c73dc	r600g: Move R600_BIG_ENDIAN to r600_pipe_common.h I need to do this so I could use R600_BIG_ENDIAN in files which include r600_pipe_common.h but not r600_pipe.h Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-18 09:50:08 +03:00
Oded Gabbay	72d0d2ba59	r600g: fix code indentation Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-18 09:50:08 +03:00
Emil Velikov	a998e49259	docs: add news item and link release notes for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-17 23:32:41 +01:00
Emil Velikov	50eeb5fb16	docs: add sha256 checksums for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `596c6504b3`)	2016-04-17 23:32:41 +01:00
Emil Velikov	c1bf47ada2	docs: add release notes for 11.1.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ca2fbf6f8f`)	2016-04-17 23:32:41 +01:00
Roland Scheidegger	d11111a551	gallivm: don't use vector selects with llvm 3.7 llvm 3.7 sometimes simply miscompiles vector selects. See https://bugs.freedesktop.org/show_bug.cgi?id=94972 This was fixed in llvm r249669 (https://llvm.org/bugs/show_bug.cgi?id=24532). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-18 00:23:34 +02:00
Dave Airlie	b3616f1326	nir: only dereference undef after NULL check. (v2) Pointed out by coverity. v2: nuke line, Jason pointed out the constructor does it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-18 07:37:48 +10:00
Emil Velikov	96b4cfe834	docs: update the sha256 checksums for 11.2.1 Turns out the previous tarballs got corrupted during upload which I carelessly forgot to check prior to deleting the local ones. Lesson learned - double check before removing the local ones. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `79b0e13913`)	2016-04-17 19:32:20 +01:00
Emil Velikov	2197581816	docs: add news item and link release notes for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-17 18:36:59 +01:00
Emil Velikov	03a234c1d1	docs: add sha256 checksums for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `c65835d812`)	2016-04-17 18:36:59 +01:00
Emil Velikov	c15f457958	docs: add release notes for 11.2.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `21e6440e82`)	2016-04-17 18:36:59 +01:00
Jason Ekstrand	f30f6e2625	i965/fs: Don't allow OOB array access of images We have had a guard against OOB array access of images on IVB for a long time, but it can actually cause hangs on any GPU generation. This can happen due to getting an untyped SURFACE_STATE for a typed message. We didn't used to hit this with the piglit test on anything other than IVB because the OOB in the test would cause us to go past the top of the pull constant UBO and we would get a surface index of 0 which was always a valid surface. Now that we're pushing small arrays, we can end up grabbing garbage from the GRF and going to some random index which causes a hang. The solution is to just do the bounds check on all hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-04-15 22:47:33 -07:00
Jason Ekstrand	93db828e42	anv/device: Images are only enabled in scalar stages Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-15 16:40:56 -07:00
Marek Olšák	c1a2fe7fd1	gallium/radeon: handle vertex shaders that disable clipping & viewport Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-16 00:21:15 +02:00
Nanley Chery	696d8ff5a1	mesa/texstore: Use Driver.CompressedTexSubImage in the default CompressedTexImage Enable drivers to use their own implementation of this method instead of the mesa default. Since the drivers that currently overwrite dd_function_table::CompressedTexSubImage also overwrite ::CompressedTexImage, there should be no behavioral change. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-15 15:06:27 -07:00
Jason Ekstrand	5ec4ecce44	anv: Advertise vertexPipelineStoresAndAtomics based on scalar stages Previously, we just looked at the hardware generation but this meant that if you did INTEL_DEBUG=vec4 on BDW or SKL, you would have advertised but non-working features.	2016-04-15 14:53:16 -07:00
Jason Ekstrand	0166ad6ced	i965/vec4: Support full std140 layout for push constants Up until now, we have been able to assume that all push constants are vec4-aligned because this is what the GL driver gives us. In Vulkan, we need to be able to support full std140 because we get the layout from the client. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	a112391d52	i965/vec4: Handle MOV_INDIRECT in pack_uniform_registers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	aaac8a1890	i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	61ee5e62a2	i965/vec4: Use can_do_writemask in can_reswizzle Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:38 -07:00
Jason Ekstrand	75b68f9114	i965/vec4: Move can_do_writemask to vec4_instruction Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 14:04:37 -07:00
Chad Versace	4a80890177	util: Fix warning of invalid return value _mesa_libgcrypt_init() returns NULL, but its return type is void. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2016-04-15 15:00:58 -07:00
Jason Ekstrand	cab30cc5f9	Merge branch 'vulkan'	2016-04-15 13:52:34 -07:00
Roland Scheidegger	64d3ae09b7	llvmpipe: (trivial) initialize src1_alpha var to NULL The blend code would do a conditional assignment based on it, causing valgrind to complain. Since that variable was actually unused in this case, this doesn't fix anything but the warning. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-15 22:51:28 +02:00
Jason Ekstrand	d8b85c96d1	Merge remote-tracking branch 'public/master' into vulkan	2016-04-15 13:35:16 -07:00
Jason Ekstrand	1a100d4f28	configure: Add support for the Intel Vulkan driver This adds a --with-vulkan-drivers option with one driver, "intel". In the future, we may add more drivers to this list. v2: Don't enable any drivers by default. This should prevent this patch from breaking anyone's build. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 13:29:29 -07:00
Jason Ekstrand	ce7e82fb6f	i965/surface_formats: Update some formats for more recent gens The surface format table hasn't entirely been kept up-to-date. This commit marks a couple more compressed formats as sampleable on gen8+ and adds the A4B4G4R4 format as renderable on gen9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-15 13:29:29 -07:00
Jason Ekstrand	7dac4a2889	util/list: Add list splicing functions This adds functions for splicing one list into another. These have more-or-less the same API as the kernel list splicing functions. The implementation, however, was stolen from the Wayland list implementation. Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2016-04-15 13:29:09 -07:00
Jason Ekstrand	17a181bfa6	Remove the Intel Vulkan readme	2016-04-15 13:17:08 -07:00
Tim Rowley	082f6d75ae	gallium/swr: confine c++11 flag to swr driver On the philosophy that a driver shouldn't change the compile flags for the entire tree, take the clove approach of moving the c++11 flag to the swr driver directory. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-15 14:43:01 -05:00
Tim Rowley	ee72fec9cf	gallium/swr: allow swr use as a swrast dri driver Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-15 14:21:50 -05:00
Eric Anholt	f6d21bcd6b	vc4: Fix subimage accesses to LT textures. This code started out like the T case, iterating over utile offsets, but I had partially switched it to iterating over pixel offsets. I hadn't caught this before because it's unusual to do piecemeal uploads to small textures. Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache. Also fixes 6 piglit tests related to glTexSubImage() and glGetTexSubImage(). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-15 11:57:17 -07:00
Mark Janes	ade3108bb5	util: Fix race condition on libgcrypt initialization Fixes intermittent Vulkan CTS failures within the test groups: dEQP-VK.api.object_management.multithreaded_per_thread_device dEQP-VK.api.object_management.multithreaded_per_thread_resources dEQP-VK.api.object_management.multithreaded_shared_resources Signed-off-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-15 10:24:40 -07:00
Jason Ekstrand	8403e6de9f	i965: Default to scalar GS	2016-04-15 09:54:42 -07:00
Jason Ekstrand	17d9a2b011	i965/surface_formats: Mark A4B4G4R4_UNORM as SKL+ only This is what is indicated by the bspec.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	d7189bdeee	Revert "i965/fs: Properly write-mask spills" This reverts commit `9c0109a1f6`.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	c3362453f9	Revert "i965/fs: Feel free to spill partial reads/writes" This reverts commit `2434ceabf4`.	2016-04-15 09:53:55 -07:00
Jason Ekstrand	2d5bd66e4f	configure: Add support for detecting valgrind headers We have several places where the Vulkan driver explicitly hooks into valgrind when it's available. We need to be able to detect it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-15 09:41:15 -07:00
Eduardo Lima Mitev	7e4628da48	nir/print: Fix printing variable mode nir_variable_mode is currently a bitflag enum, while nir_print::print_var_decl() assumes it is still a numbered list. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-15 16:41:41 +02:00
John Sheu	f8752e0d95	xlib: remove MESA_GLX_VISUAL_HACK This removes a hack introduced in 1999 in the first version of fakeglx.c, with the comment: /* XXX revisit this after 3.0 is finished. */ Mesa 4.0 was released in 2001. It is now 2016, and Mesa 11.0 was released last year. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:46:00 +02:00
John Sheu	8a9c0f1025	xlib: fix leaks of returned values from XGetVisualInfo Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:45:46 +02:00
John Sheu	781232e0ac	xlib: fix memory leak of and remove vishandle from XMesaVisualInfo The vishandle member of XMesaVisualInfo is used to support the comparison of XVisualInfo instances by pointer value, in find_glx_visual(). The comparison however will always be false, as in every case the comparison is made, the VisualInfo instance being compared to is a new allocation passed in through a GLX API call. In addition, the XVisualInfo instance pointed to by vishandle is itself never freed, causing a memory leak. Since vishandle is essentially useless, we just remove it and thereby also fix the leak. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:45:28 +02:00
John Sheu	fe9d8cd79e	xlib: do not cache return value of glXChooseVisual/glXGetVisualFromFBConfig The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig is being cached in XMesaVisual.vishandle (and unconditionally overwritten on subsequent calls). However, these entry points are specified to return XVisualInfo instances to be owned by the caller and freed with XFree(), so the return values should not be retained. With this change, XMesaVisual.vishandle is essentially unused and will be removed in a subsequent change. v2: update commit message Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-15 07:44:34 +02:00
Jason Ekstrand	76fa7b16f4	Merge remote-tracking branch 'public/master' into vulkan	2016-04-14 18:30:52 -07:00
Jason Ekstrand	547032c56a	main/mtypes: Remove the "set" parameter from gl_uniform_block This is a left-over from the early days of the Vulkan driver	2016-04-14 18:27:09 -07:00
Jason Ekstrand	f0bbb34e49	Revert "i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT" This reverts commit `4115648a6b`. This commit was half-baked and probably never should have been committed. We'll add this back in properly later when we need it.	2016-04-14 18:22:08 -07:00
Jason Ekstrand	eeff133158	i965: Expose the surface format table Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 18:07:48 -07:00
Jason Ekstrand	d7cddbd6d6	nir/lower_io: Add UBOs and SSBOs to get_io_offset_src Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 18:07:40 -07:00
Jason Ekstrand	c825e29a82	nir/intrinsics: Add a vulkan_resource_index intrinsic This is used to facilitate the Vulkan binding model where each resource is described by a (descriptor set, binding, array index) tuple. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-14 17:20:05 -07:00
Jason Ekstrand	1e0012e3e4	nir: Add a descriptor_set field to nir_variable This is needed for supporting the Vulkan binding model Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-14 17:20:05 -07:00
Chad Versace	7a835b3fd9	dri: Fix robust context creation via EGL attribute driCreateContextAttribs() emits an error if bit __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But, EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of robust ES contexts. One requests a robust ES context by setting the EGL_CONTEXT_OPENGL_ROBUST_ACCESS attribute, which Mesa's EGL layer translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 17:38:41 -07:00
Jason Ekstrand	5567ae0547	Merge remote-tracking branch 'public/master' into vulkan	2016-04-14 17:14:28 -07:00
Leo Liu	8f4340c5e6	radeon/uvd: fix tonga feedback buffer size This only applies to tonga Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-14 19:33:44 -04:00
Jason Ekstrand	f1d29099b4	i965: Push everything if pull_param == NULL Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 16:00:18 -07:00
Jason Ekstrand	963513bb24	i965/fs: Push small uniform arrays Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	71f8039f72	i965/fs: Rename demote_pull_constants to lower_constant_loads Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	8e76f664be	i965/vec4: Get rid of the uniform_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	056849772f	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	479e38ad63	i965/fs: Get rid of the param_size array Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	30874216cb	i965/fs: Stop relying on param_size in assign_constant_locations Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	275855f315	i965/fs: Get rid of reladdr We aren't using it anymore. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	3c93cdfaf5	i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	63101177f3	nir: Add another index to load_uniform to specify the range read Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	27bd8ac6f3	i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	889e6054b7	i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr The subnr field is in bytes so we don't need to multiply by type_sz. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	7e08a13009	i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions It should work fine without it and the visitor can set it if it wants. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	40a8fe04dc	i965/fs: Add support for doing MOV_INDIRECT on uniforms Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 15:59:33 -07:00
Jason Ekstrand	48cc8c284a	anv: Install the installable ICD	2016-04-14 15:15:00 -07:00
Jason Ekstrand	e40b867145	anv/intel_icd: Don't provide an absolute path The driver will be installed to $(libdir)/libvulkan_intel.so and just providing a driver name is enough for the loader. This also ensures that multi-arch systems work ok.	2016-04-14 15:15:00 -07:00
Jason Ekstrand	ca16373a2b	configure: Add initial support for enabling Vulkan drivers	2016-04-14 15:15:00 -07:00
Jason Ekstrand	e61c812f76	anv/pipeline: Use the right mask for lower_indirect_derefs	2016-04-14 15:13:29 -07:00
Ben Widawsky	a8975a91cc	i965: Make intel_get_param return an int This will fix the spurious error message: "Failed to query GPU properties." that was unintentionally added in `cc01b63d73`. This patch changes the function to return an int so that the caller is able to do stuff based on the return value. The equivalent of this patch was in the original series that fixed up the warning, but I dropped it at the last moment. It is required to make the desired behavior of not warning when trying to query GPU properties from the kernel unless there is something the user can do about it. v2: Use strerror (Jason) Make EINVAL check similar in all places (Ian) NOTE: Broadwell appears to actually have some issue where the kernel returns ENODEV when it shouldn't be. I will investigate this separately. Reported-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-04-14 15:13:22 -07:00
Brian Paul	aed975d5c5	st/mesa: fix sampler view leak in st_DrawAtlasBitmaps() I neglected to free the sampler view which was created earlier in the function. So for each glCallLists() command that used the bitmap atlas to draw text, we'd leak a sampler view object. Also, check for st_create_texture_sampler_view() failure and record GL_OUT_OF_MEMORY. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-14 15:32:18 -06:00
Nicolai Hähnle	a17911ceb1	gallium/radeon: handle failure when mapping staging buffer Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 16:29:23 -05:00
Nicolai Hähnle	8bd0f0df50	radeonsi: mark ssbo and images descriptor pointers dirty at beginning of CS Without this, we were getting non-deterministic VM faults under high pressure. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-14 16:29:23 -05:00
Jason Ekstrand	cb372b39ea	i965/vec4: Use UD rather than D for uniform indirects Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 14:25:01 -07:00
Jason Ekstrand	240d16ea94	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-14 14:24:57 -07:00
Samuel Pitoiset	bb4cdee9a4	nvc0: do not break the universe on GK110+ I removed that return 0 by mistake. Ooops. Fixes: `6e23fd4` ("nvc0: allow to use compute support on GM200") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-14 21:57:21 +02:00
Samuel Pitoiset	6e23fd420d	nvc0: allow to use compute support on GM200 This works like a charm but please note that NVF0_COMPUTE has to be set because compute support is still not enabled by default on GK110+. This will require more testing to make sure it won't break the 3D state. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-14 21:01:51 +02:00
Jason Ekstrand	34b5db17d9	i965: remove pointless diff with the master branch	2016-04-14 10:39:54 -07:00
Jason Ekstrand	769b5614f8	nir/opt_algebraic: Remove the encoding line This is an unneeded diff between the vulkan and master branches	2016-04-14 10:35:40 -07:00
Jason Ekstrand	c34be07230	spirv: Move to compiler/ While it does rely on NIR, it's not really part of the NIR core. At the moment, it still builds as part of libnir but that can be changed later if desired.	2016-04-14 10:28:47 -07:00
Jason Ekstrand	bfa3a38280	nir: Remove some pointless delta between vulkan and master	2016-04-14 10:24:33 -07:00
Jose Fonseca	ffcc00ce30	scons: Build NIR. Emil Velikov: - Attribute the src/{glsl,compiler}/nir move - Flesh out to separate SConscript Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:59 +01:00
Jose Fonseca	feb6732e80	nir: Use _snprintf on Windows. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	ba0c0e3940	nir: Avoid structure initialization expressions. Not supported by MSVC, and completely unnecessary -- inline functions work just as well. NIR_SRC_INIT/NIR_DEST_INIT could and probably should be replaced by the inline functions. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:37 +01:00
Jose Fonseca	8f96524f13	nir: Remove unistd.h include. It doesn't seem needed, and is not available on MSVC. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:38:31 +01:00
Jose Fonseca	f8e2f1fba5	nir: Avoid empty {} struct initializer. Not supported by MSVC and consistent through NIR. [Emil Velikov: rebase] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-14 16:33:52 +01:00
Emil Velikov	bb949e262c	gallium/swr: fold the almost identical Makefiles Rather than having two almost identical Makefiles, with various VPATH hacks just fold them, using COMMON_* variables and actually getting things buildable/shipable. v2: whitespace fixes, remove Makefile.sources-arch Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-04-14 16:30:57 +01:00
Tim Rowley	aee976703d	install-gallium-links.mk: handle multiple libraries Need to prevent bash from interpreting whitespace between libraries as a command line. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-14 16:30:57 +01:00
Marek Olšák	112291964e	radeonsi: don't overwrite the scratch offset in shader prologs Prologs only look at num_input_sgprs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	ffe44d0283	radeonsi: fold num_user_sgprs where it is possible Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	51c4034f9b	radeonsi: fix SGPRS calculation once more This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS, which bumped NUM_USER_SGPRS and uncovered this bug on SI. If this was fixed in LLVM, these workarounds wouldn't be needed. LLVM would have to look at the calling convention to know how many SGPR inputs are declared, and add VCC and the scratch wave offset (which is enabled even if we spill SGPRs but not VGPRs, oh well). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-14 17:00:14 +02:00
Marek Olšák	aaf5be4a29	radeonsi: disable hw ETC2 on Polaris not supported by hw directly, but it's still fully supported by the driver Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-14 16:58:59 +02:00
Emil Velikov	4358cfc4ad	doxygen: remove git rebase fallouts Should never have been (git) added in the first place. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-14 09:49:09 +01:00
Jose Fonseca	8fcacb4f90	appveyor: Run unit tests. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	50ddf03ada	scons: Add a "check" target to run all unit tests. Except: - u_cache_test -- too long - translate_test -- unreliable (it's probably testing corner cases that translate module doesn't care about.) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	9ae0e8ee3c	test/unit: Make translate_test invoke translate_create by default. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jose Fonseca	f8a51034bd	test/unit: Make pipe_barrier_test actually check correct behavior. So it can run unattended. Also make it silent by default. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-14 07:19:04 +01:00
Jason Ekstrand	12f88ba32a	Merge remote-tracking branch 'public/master' into vulkan	2016-04-13 20:25:39 -07:00
Michel Dänzer	171a570f38	clover: Fix build against LLVM SVN >= r266163 createInternalizePass now takes a callback instead of a StringSet. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-04-14 11:53:41 +09:00
Nanley Chery	79fbec30fc	anv: Remove default scissor and viewport concepts Users should never provide a scissor or viewport count of 0 because they are required to set such state in a graphics pipeline. This behavior was previously only used in Meta, which actually just disables those hardware operations at pipeline creation time. Kristian noticed that the current assignment of viewport count reduces the number of viewport uploads, so it is not removed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:02:38 -07:00
Nanley Chery	1949e502bc	anv: Replace ::disable_scissor with ::use_rectlists Meta currently uses screenspace RECTLIST primitives that lie within the framebuffer rectangle. Since this behavior shouldn't change in the future, disable the scissor operation whenever rectlists are used. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	9f72466e9f	anv: Delete anv_graphics_pipeline_create_info::disable_viewport There are no users of this field. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	cff0f6b027	gen{7,8}_pipeline: Always set ViewportXYClipTestEnable For the following reasons, there is no behavioural change with this commit: the ViewportXYClipTest function of the CLIP stage will continue to be enabled outside of Meta (where disable_viewport is always false), and the CLIP stage is turned off within Meta, so this function will continue to be disabled in that case. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 18:00:41 -07:00
Nanley Chery	992bbed98d	gen{7,8}_pipeline: Apply 3DPRIM_RECTLIST restrictions According to 3D Primitives Overview in the Bspec, when the RECTLIST primitive is in use, the CLIP stage should be disabled or set to have a different Clip Mode, and Viewport Mapping must be disabled: Clipping: Must not require clipping or rely on the CLIP unit’s ClipTest logic to determine if clipping is required. Either the CLIP unit should be DISABLED, or the CLIP unit’s Clip Mode should be set to a value other than CLIPMODE_NORMAL. Viewport Mapping must be DISABLED (as is typical with the use of screen-space coordinates). We swap out ::disable_viewport for ::use_rectlist, because we currently always use the RECTLIST primitive when we disable viewport mapping, and we'll likely continue to use this primitive. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:53:38 -07:00
Nanley Chery	88d1c19c9d	anv_cmd_buffer: Don't make the initial state dirty Avoid excessive state emission. Relevant state for an action command will get set by the user: From Chapter 5. Command Buffers, When a command buffer begins recording, all state in that command buffer is undefined. [...] Whenever the state of a command buffer is undefined, the application must set all relevant state on the command buffer before any state dependent commands such as draws and dispatches are recorded, otherwise the behavior of executing that command buffer is undefined. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:52:24 -07:00
Nanley Chery	9fae6ee026	anv/meta: Don't set the dynamic state for disabled operations CmdSet* functions dirty the CommandBuffer's dynamic state. This causes the new state to be emitted when CmdDraw is called. Since we don't need the state that would be emitted, don't call the CmdSet* functions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:52:20 -07:00
Nanley Chery	76b0ba087c	anv/clear: Disable the scissor operation Since the scissor rectangle always matches that of the framebuffer, this operation isn't needed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-04-13 17:45:18 -07:00
Jason Ekstrand	b63a98b121	nir/dead_variables: Configurably work with any variable mode The old version of the pass only worked on globals and locals and always left inputs, outputs, uniforms, etc. alone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-13 15:45:10 -07:00
Kenneth Graunke	505a8fbdf8	i965: Switch to NIR for ldexp lowering. The old GLSL IR based lowering doesn't quite work right in all cases, and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new approach in NIR passes all the tests. There's not likely to be a ton of advantage to lowering early in GLSL IR anyway, so...switch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:33 -07:00
Jason Ekstrand	4455bfa9a0	nir/algebraic: Add lowering for ldexp The algorithm used is different from both the naive suggestion from the GLSL spec and the one used in GLSL IR today. Unfortunately, the GLSL IR implementation that we have today doesn't handle denormals (for those that care) or the case where the float source is +-inf. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:19 -07:00
Jason Ekstrand	765dd65349	i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:08 -07:00
Jason Ekstrand	745b3d295e	nir: Add more modulus opcodes These are all needed for SPIR-V Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:00 -07:00
Jason Ekstrand	d880c6f9f5	i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-13 15:39:20 -07:00
Jason Ekstrand	dd616cab01	nir/lower_io: Allow for a full bitmask of modes Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:10 -07:00
Jason Ekstrand	2caaf0ac5e	nir/lower_indirect: nir_variable_mode is now a bitfield Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:07 -07:00
Jason Ekstrand	ffa0e12e15	nir: Convert nir_variable_mode to a bitfield There are several passes where we need to specify some set of variable modes that the pass needs to operate on. This lets us easily do that. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:40:12 -07:00
George Kyriazis	f69a61b1aa	gallium/swr: Make flat shading tris work. - Incorporate flatshade flag into the shader generation - Use provoking vertex (vc) in shader when flat shading. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-13 13:46:37 -05:00
Rob Clark	c53a12fedc	Revert "freedreno/a4xx: better occlusion/sample counting" This reverts commit `62fa868728`. dEQP-GLES3.functional.occlusion_query.* was unhappy about that change. Still not really sure what the other slots in the sample results buffer are. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:40 -04:00
Rob Clark	46e9bbc918	freedreno/a4xx: rasterizer_discard support This one is slightly annoying, since trying to write RBRC from draw would clobber values set in the tiling/gmem code. We could do command- stream patching for RBRC, as is done on a3xx. Although since it seems to be a rarely used feature, it is easier just to do RMW to set/clear the bit. Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles and related tests. a3xx still needs the same feature, although there it probably makes more sense to take advantage of the existing cmdstream patching which is required for RBRC for other reasons. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:21 -04:00
Rob Clark	216225ce57	freedreno/ir3: fix array textures on a4xx Seems like a4xx needs offset added to array index for all arrays, whereas a3xx only for cubemap arrays. Fixes a whole swath of dEQP fails (roughly sampler2darray). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:14 -04:00
Rob Clark	7e93b26b5d	freedreno: fix stream-out offset handling for lines/tris We need to increment offset by # of vertices, not by # of prims. Fixes a bunch of dEQP fails involving prims other than points. For example, dEQP-GLES3.functional.transform_feedback.position.lines_separate Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:02 -04:00
Rob Clark	6ca6e80f61	freedreno: fix handling for stream-out offsets If changed && append, we shouldn't be resetting the internal offset back to zero. This fixes issues w/ sequences like: glBeginTransformFeedback() glDraw() glPauseTransformFeedback() glDraw() glResumeTransformFeedback() glDraw() glEndTransformFeedback() Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3 and related tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:54 -04:00
Rob Clark	0a4b0fc315	freedreno: fix prims-emitted query This should only count when TF is not paused. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:47 -04:00
Rob Clark	a7eb12d089	freedreno: fix max-line-width dEQP noticed that we were advertising completely bogus values. The actual maximum is 127.0f. But we have to use an artificially low maximum to work around a bug in the dEQP test, which gets confused when the max line width is too large and lines start going off-screen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:31 -04:00
Rob Clark	6bf462a1ab	freedreno: add flag to enable dEQP hacks Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:24 -04:00
Rob Clark	f68f6c0246	freedreno/ir3: hack to avoid getting stuck in a loop There are still some edge cases which result in a neighbor-loop. Which needs to be fixed, but this hack at least makes deqp tests finish. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:13 -04:00
Rob Clark	dd70945e09	freedreno/ir3: use (ss) instead of (sy) for ldlv Fixes a bunch of flat-varying fail on a4xx (where we need to use ldlv to read the un-interpolated varying). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:05 -04:00
Rob Clark	b35ad6e701	freedreno/ir3: cleanup double cmps.s from frontend Since we cannot mov into a predicate register, the frontend uses a 'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this since it has no way to know that the source cond instruction (ie. for a kill, br, etc) will only be used to write the predicate reg. Detect this, and re-write the instruction writing p0.x to skip the original cmps.[sfu]. (It is done like this, rather than re-writing the dest of the first cmps.[sfu] in case the first cmps.[sfu] actually has other users.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:14:41 -04:00
Matt Turner	9bac27dbf9	glsl: Rename "vertex_input_slots" -> "is_vertex_input" vertex_input_slots would be an appropriate name for an integer, but not a bool. Also remove a cond ? true : false from a count_attribute_slots() call site, noticed during the rename. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 11:00:21 -07:00
Jose Fonseca	9586468c03	gallivm: Workaround LLVM PR 27332. The credit for finding and isolating this bug goes to Vinson and Roland. The buggy LLVM versions were found by doing opt -instcombine llvm-pr27332.ll > /dev/null where llvm-pr27332.ll is the IR from https://llvm.org/bugs/show_bug.cgi?id=27332#c3 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 16:42:55 +01:00
Marek Olšák	dd0a296895	gallium/radeon: move a comment to the correct place trivial	2016-04-13 17:31:03 +02:00
Nicolai Hähnle	9e9a2bb44a	radeonsi: gate PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT by LLVM version Otherwise we incorrectly claim ARB_ssbo support even with older LLVM versions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94917 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-13 10:06:22 -05:00
Elie TOURNIER	f04565c876	doxygen: Generate Doxygen for NIR Now, one can do the following to generate and read the nir Doxygen: cd $MESA_TOP/doxygen make firefox nir/index.html Update v2: Correct TAGFILES in nir.doxy Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> [Emil Velikov] v3: Rebase. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:33 +01:00
Elie TOURNIER	3157df58d0	doxygen: update glsl link Signed-off-by: Elie TOURNIER <tournier.elie@gmail.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> Tested-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:30 +01:00
Rhys Kidd	0e9fc1228a	doxygen: Remove deprecated settings in common.doxy These Doxygen features are deprecated, as reported by Doxygen 1.8.9.1 Warning: Tag `USE_WINDOWS_ENCODING' at line 66 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `DETAILS_AT_TOP' at line 157 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `HTML_ALIGN_MEMBERS' at line 616 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `XML_SCHEMA' at line 848 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `XML_DTD' at line 854 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `MAX_DOT_GRAPH_WIDTH' at line 1115 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Warning: Tag `MAX_DOT_GRAPH_HEIGHT' at line 1123 of file `common.doxy' has become obsolete. To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u" Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:26 +01:00
Rhys Kidd	3d18ab72bf	doxygen: Fix typo in doxygen/tnl.doxy TAGFILE relative folder should match .tag file Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:23 +01:00
Rhys Kidd	4ba409a364	doxygen: Correct TAGFILE linkage of main core.doxy was renamed to main.doxy, along with output folder in the below 2004 commit. Correct the other modules' TAGFILE linkage to find the main folder. commit `3ef972f538` Author: Brian Paul <brian.paul@tungstengraphics.com> Date: Sun May 16 22:07:02 2004 +0000 Replaced 'core' with 'main'. Other minor updates. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:19 +01:00
Rhys Kidd	7703a3e3d0	doxygen: Update .gitignore The last of these output directories was removed in 2007. commit `c2e0570831` Author: Jerome Glisse <glisse@freedesktop.org> Date: Fri Feb 16 23:18:56 2007 +0100 Update doxygen doc to reflet vbo changes. Update doxygen doc, array_cache no longuer exist, new shiny vbo modules is there. Tested on unix, but i think i didn't broke that bat :). commit `3ef972f538` Author: Brian Paul <brian.paul@tungstengraphics.com> Date: Sun May 16 22:07:02 2004 +0000 Replaced 'core' with 'main'. Other minor updates. commit `69db632a9d` Author: Jose Fonseca <j_r_fonseca@yahoo.co.uk> Date: Thu May 1 23:32:54 2003 +0000 Move the Doxygen configuration files into the usual places and integrate with the build system. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:15 +01:00
Rhys Kidd	ced18f4d60	doxygen: Remove references to miniglx miniglx was removed in February 2010. Clean up remaining unnecessary doxygen references. commit `a9e3669683` Author: Kristian Høgsberg <krh@bitplanet.net> Date: Thu Feb 25 16:17:04 2010 -0500 Remove remaining miniglx references Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:12 +01:00
Rhys Kidd	29b805b929	doxygen: Fix doxygen/gbm.doxy TAGFILES There has never been a doxygen/gbm_setup output folder. Appears to have been a copy-paste error from original commit in `245341f406`. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:08 +01:00
Rhys Kidd	684e7a4a14	doxygen: Correct TAGFILE relative paths Per Doxygen documentation, to combine external documentation (stored in a *.tag file) with a project the TAGFILES option should be set in the configuration file. A tag file typically only contains a relative location of the documentation from the point where doxygen was run. So when you include a tag file in another project you have to specify where the external documentation is located in relation to this project. You can do this in the configuration file by assigning the (relative) location to the tag files specified after the TAGFILES configuration option. If you use a relative path it should be relative with respect to the directory where the HTML output of your project is generated; so a relative path from the HTML output directory of a project to the HTML output of the other project that is linked to. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:44:04 +01:00
Rhys Kidd	f066fb529b	doxygen: Fix doxygen/glapi.doxy The src/mesa/glapi folder was relocated in the below commit. Amend the doxygen/glapi.doxy INPUT setting accordingly. Whilst here, in addition this change also avoids a bug in the consolidated Doxygen output caused by doxygen/glapi.doxy inadvertently overwriting doxygen/swrast.tag via its GENERATE_TAGFILE setting. This bug depended upon the specific order each *.tag was built. commit `296adbd545` Author: Chia-I Wu <olv@lunarg.com> Date: Mon Apr 26 12:56:44 2010 +0800 glapi: Move to src/mapi/. Move glapi to src/mapi/{glapi,es1api,es2api}. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:43:58 +01:00
Rhys Kidd	cf3bc91c06	doxygen: Remove src/mesa/shader/ references Mesa has not had a src/mesa/shader/ folder since Mesa 7.9 removed it in October 2010, as part of a revised GLSL compiler written by Intel. Remove doxygen/shader.doxy and consequential changes made throughout. In addition to removing an unnecessary Doxygen doxyfile, this change also avoids a bug in the consolidated Doxygen output caused by doxygen/shader.doxy inadvertently overwriting doxygen/swrast.tag via its GENERATE_TAGFILE setting. This bug depended upon the specific order each *.tag was built. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-13 13:43:54 +01:00
Marek Olšák	04f15e491f	gallium/radeon: add an env variable to force a level of aniso filtering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-13 12:42:28 +02:00
Jose Fonseca	cc5d8b678e	llvmpipe: Test rounding of x.5. Leverage nearbyintf function, which should be available on all C99 implementations. Trivial.	2016-04-13 11:13:05 +01:00
Roland Scheidegger	cb438d8b3e	gallivm: use llvm.nearbyint instead of llvm.round. We used to use sse roundps intrinsic directly, but switched to use the llvm intrinsics for rounding with `e4f01da15d`. However, llvm semantics follows standard math lib round function which is specced to do roundNearestAwayFromZero but we really want roundNearestEven (moreover, using round generates atrocious code since the cpu can't do it directly and it results in scalar calls to libm __roundf). So, use llvm.nearbyint instead, which does exactly the right thing, and even has the advantage of being available with llvm 3.3 too. (I've verified it actually generates a roundps instruction with llvm 3.3.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-13 11:13:03 +01:00
Pierre Moreau	f525db6358	nv50/ra: `isinf()` is in namespace `std` since C++11. This fixes a compile error while building Nouveau with C++11 enabled (and glibc >= 2.23). This happens if SWR is enabled, as it forces C++11. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com> https://bugs.freedesktop.org/show_bug.cgi?id=94907	2016-04-13 07:41:13 +01:00
Jose Fonseca	fa46848e51	scons: Allow building with Address Sanitizer. libasan is never linked to shared objects (which doesn't go well with -z,defs). It must either be linked to the main executable, or (more practically for OpenGL drivers) be pre-loaded via LD_PRELOAD. Otherwise works. I didn't find anything with llvmpipe. I suspect the fact that the JIT compiled code isn't instrumented means there are lots of errors it can't catch. But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster alternative to Valgrind. Usage (Ubuntu 15.10): scons asan=1 libgl-xlib export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib LD_PRELOAD=libasan.so.2 any-opengl-application Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 06:54:32 +01:00
Kenneth Graunke	d1c89f6005	mesa: Change an error code in glSamplerParameterI[iu]v(). This is supposed to be INVALID_OPERATION in ES. We already did this for the fv/iv variants, but not Iiv/Iuv, which are new in ES 3.2 (or extensions). Fixes: ES31-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-12 20:30:32 -07:00
Jose Fonseca	46bfcd61f5	softpipe: Free tgsi.image elements on context destruction. Courtesy of address sanitizer. [airlied: free buffers as well] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Edward O'Callaghan	5a3d928e2c	softpipe: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Eric Anholt	3b63301d9f	vc4: Work around hardware limits on the number of verts in a single draw. Fixes rendering failures in glmark2's refract and bump:render-mode=high-poly demos, and partially in its terrain demo.	2016-04-12 19:10:51 -07:00
Thomas Hindoe Paaboel Andersen	6d6525a377	softpipe: avoid buffer overflow Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:35 +10:00
Thomas Hindoe Paaboel Andersen	b89708f95f	tgsi: fix buffer overflow Increase r to four channels as rgba is written to it Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:34 +10:00
Tim Rowley	b9294bc345	swr: handle pci cap requests Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Tim Rowley	b19d214b23	swr: support samplers in vertex shaders Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Nicolai Hähnle	10cfd7a604	radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2 This is the last necessary bit for OpenGL 4.2 support. All driver-specific functionality has already been implemented as part of extensions. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 20:13:49 -05:00
Iurie Salomov	047e3264f6	va: check null context in vlVaDestroyContext Signed-off-by: Iurie Salomov <iurcic@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2016-04-13 00:52:53 +01:00
Jason Ekstrand	8f3b516f2e	nir/clone: Copy bit size when cloning registers Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-12 16:41:58 -07:00
Marek Olšák	8e70a58af3	radeonsi: fix a critical SI hang since PIPELINESTAT_START/STOP was added For some reason unknown to me, SI hangs if the event is written after CONTEXT_CONTROL.	2016-04-13 01:05:15 +02:00
Kenneth Graunke	95d622e16d	glsl: Don't copy propagate or tree graft precise values. This is kind of a hack. We currently track precise requirements by decorating ir_variables. Propagating or grafting the RHS of an assignment to a precise value into some other expression tree can lose those decorations. In the long run, it might be better to replace these ir_variable decorations with an "exact" decoration on ir_expression nodes, similar to what NIR does. In the short run, this is probably good enough. It preserves enough information for glsl_to_nir to generate "exact" decorations, and NIR will then handle optimizing these expressions reasonably. Fixes ES31-CTS.gpu_shader5.precise_qualifier. v2: Drop invariant handling, as it shouldn't be necessary (caught by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-12 15:57:48 -07:00
Mark Janes	9e351e077b	util: Fix race condition on libgcrypt initialization Fixes intermittent Vulkan CTS failures within the test groups: dEQP-VK.api.object_management.multithreaded_per_thread_device dEQP-VK.api.object_management.multithreaded_per_thread_resources dEQP-VK.api.object_management.multithreaded_shared_resources Signed-off-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-12 15:38:43 -07:00
Kristian Høgsberg Kristensen	8ec971a997	i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo Copy and paste error in commit `eafeb8db66`: i965/tiled_memcpy: Unroll bytes==64 case. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 15:32:43 -07:00
Kristian Høgsberg Kristensen	1af0f0151c	glsl/linker: Recurse on struct fields when adding shader variables ARB_program_interface_query requires that we add struct fields recursively down to basic types. Fixes 52 struct test cases in dEQP-GLES31.functional.program_interface_query.* Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	778fd46aa4	glsl/linker: Pass name and type through to create_shader_variable() No functional change here, but this now lets us recurse through structs in add_shader_variable(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	09f0121593	glsl/linker: Pass absolute location to add_shader_variable() This lets us pass in the absolute location of a variable instead of computing it in add_shader_variable() based on variable location and bias. This is in preparation for recursing into struct variables. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	8ab6aae4dc	glsl/linker: Add add_shader_variable() helper This consolidates the combination of create_shader_variable() and add_program_resource() into a new helper function. No functional difference, but we'll expand add_shader_variable() in the next few commits. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Matt Turner	eafeb8db66	i965/tiled_memcpy: Unroll bytes==64 case. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:37:05 -07:00
Roland Scheidegger	0e605d9b3a	i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle. The existing code uses SSSE3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments v4: [mattst88] Rebase Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 14:37:01 -07:00
Matt Turner	fc88b4babf	i965/tiled_memcpy: Move SSSE3 code back into inline functions. This will make adding SSE2 code a lot cleaner. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:59 -07:00
Matt Turner	0a5d8d9af4	i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle. Replaces four byte loads and four byte stores with a load, bswap, rotate, store; or a movbe, rotate, store. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:56 -07:00
Nicolai Hähnle	a191e6b719	radeonsi: fix bounds check in si_create_vertex_elements This was triggered by dEQP-GLES3.functional.vertex_array_objects.all_attributes Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 16:32:46 -05:00
Nicolai Hähnle	4285a97cea	docs: mark atomic counters and SSBOs as done for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:51 -05:00
Nicolai Hähnle	bfd11c5996	radeonsi: enable shader buffer pipe caps Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:48 -05:00
Nicolai Hähnle	4e81843b13	radeonsi: add shader buffer support to TGSI_OPCODE_RESQ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:45 -05:00
Nicolai Hähnle	01109282ce	radeonsi: add shader buffer support to TGSI_OPCODE_STORE Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:43 -05:00
Nicolai Hähnle	745014c502	radeonsi: add shader buffer support to TGSI_OPCODE_LOAD Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:41 -05:00
Nicolai Hähnle	68bc25c931	radeonsi: add shader buffer support to TGSI_OPCODE_ATOM* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:38 -05:00
Nicolai Hähnle	c6f5d000db	radeonsi: add offset parameter to buffer_append_args Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:35 -05:00
Nicolai Hähnle	c565466eea	radeonsi: adjust buffer_append_args to take a 128 bit resource Move the buffer resource extraction code out into its own function. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:32 -05:00
Nicolai Hähnle	e88018ffe5	radeonsi: preload shader buffers in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:29 -05:00
Nicolai Hähnle	c495c0ad37	radeonsi: implement set_shader_buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:26 -05:00
Nicolai Hähnle	73c8b85b64	radeonsi: move resetting of constant buffers into a separate function This will be re-used for shader buffers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 16:30:04 -05:00
Haixia Shi	35ade36c88	dri/i965: fix incorrect rgbFormat in intelCreateBuffer(). It is incorrect to assume that pixel format is always in BGR byte order. We need to check bitmask parameters (such as |redMask|) to determine whether the RGB or BGR byte order is requested. v2: reformat code to stay within 80 character per line limit. v3: just fix the byte order problem first and investigate SRGB later. v4: rebased on top of the GLES3 sRGB workaround fix. v5: rebased on top of the GLES3 sRGB workaround fix v2. Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:06:45 -07:00
Kenneth Graunke	e303e88a9c	glsl: Reject illegal qualifiers on atomic counter uniforms. This fixes dEQP-GLES31.functional.uniform_location.negative.atomic_fragment dEQP-GLES31.functional.uniform_location.negative.atomic_vertex Both of which have lines like layout(location = 3, binding = 0, offset = 0) uniform atomic_uint uni0; The ARB_explicit_uniform_location spec makes a very tangential mention regarding atomic counters, but location isn't something that makes sense with them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-12 14:06:42 -07:00
Kenneth Graunke	929e44099f	glsl: Add a method to print error messages for illegal qualifiers. Suggested by Timothy Arceri a while back on mesa-dev: https://lists.freedesktop.org/archives/mesa-dev/2016-February/107735.html Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com>	2016-04-12 14:06:42 -07:00
John Sheu	7f08547248	xlib: fix memory leak on Display close The XMesaVisual instances freed in the visuals table on display close are being freed with a free() call, instead of XMesaDestroyVisual(), causing a memory leak. Signed-off-by: John Sheu <sheu@google.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-12 13:56:41 -06:00
Jakob Sinclair	d04bb14d04	st/mesa: Replace GLvoid with void GLvoid was used before in OpenGL but it has changed to just using void. All GLvoids in mesa's state tracker have been changed to void in this patch. Tested this with piglit and no problems were found. No compiler warnings. Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-12 13:37:16 -06:00
Bas Nieuwenhuizen	126da23d70	radeonsi: Mark ARB_robust_buffer_access_behavior as supported. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 20:53:10 +02:00
Bas Nieuwenhuizen	70dcd841f7	gallium: Add capability for ARB_robust_buffer_access_behavior. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 20:53:06 +02:00
Bas Nieuwenhuizen	285dc05055	mesa: Expose the ARB_robust_buffer_access_behavior extension. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 20:40:26 +02:00
Miklós Máté	aad8707b28	main: rework the compatibility check of visuals in glXMakeCurrent Now it follows the compatibility criteria listed in section 2.1 of the GLX 1.4 specification. This is needed for post-process effects in SW:KotOR. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 19:48:01 +02:00
Tim Rowley	df37b06276	swr: [rasterizer core] warning cleanup Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	06c59dc417	swr: [rasterizer] Put in rudimentary garbage collection for the global arena allocator - Check for unused blocks every few frames or every 64K draws - Delete data unused since the last check if total unused data is > 20MB Doesn't seem to cause a perf degradation Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	b990483de2	swr: [rasterizer core] Put DRAW_CONTEXT on a diet No need for 256 pointers per DC. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	a939a58881	swr: [rasterizer core] Add experimental support for hyper-threaded front-end Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	9a8146d0ff	swr: [rasterizer] Avoid segv in thread creation on machines with non-consecutive NUMA topology. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	2c71fd4bf8	swr: [rasterizer core] Replace all naked OSALIGN macro uses with OSALIGNSIMD / OSALIGNLINE Future proofing Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	32a8653ad2	swr: [rasterizer] Ensure correct alignment of stack variables used as vectors Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	e1871c4459	swr: [rasterizer core] Quantize depth to depth buffer precision prior to depth test/write. Fixes z-fighting issues. Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	2a19aca05f	swr: [rasterizer common] win32 build fixups Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:05 -05:00
Tim Rowley	c25244f2f7	swr: [rasterizer core] Affinitize thread scratch space to numa node of worker Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:04 -05:00
Tim Rowley	f89f6d562a	swr: [rasterizer] Misc fixes identified by static code analysis No perf loss detected Acked-by: Brian Paul <brianp@vmware.com>	2016-04-12 11:52:04 -05:00
Brian Paul	6c01478213	st/mesa: fix memleak in glDrawPixels cache code If the glDrawPixels size changed, we leaked the previously cached texture, if there was one. This patch fixes the reference counting, adds a refcount assertion check, and better handles potential malloc() failures. Tested with a modified version of the drawpix Mesa demo which changed the image size for each glDrawPixels call. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-04-12 10:44:45 -06:00
Jose Fonseca	b5105e67a8	gallium: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	b025c23cfe	softpipe: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	2f13d7543f	svga: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Jose Fonseca	7279098dc5	mesa: Use STATIC_ASSERT whenever possible. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 16:56:15 +01:00
Marek Olšák	686b018ab3	r600g: use common scissor and viewport code It's the same as radeonsi. This adds guard band support to r600g. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:25 +02:00
Marek Olšák	87a5b07f90	gallium/radeon: add R600/Evergreen/Cayman support to common viewport code Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:25 +02:00
Marek Olšák	2ca5566ed7	radeonsi: move scissor and viewport states into gallium/radeon Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:13:24 +02:00
Marek Olšák	db00f6cc9c	radeonsi: use guard band clipping Guard band clipping speeds up rasterization for primitives that are partially off-screen. This change in particular results in small framerate improvements in a wide range of games. Started by Grigori Goronzy <greg@chown.ath.cx>. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 17:12:14 +02:00
Marek Olšák	cb21f8a97c	radeonsi: compute scissor from viewport in set_viewport_states and clamp it right before emitting. This is a prerequisite for computing the guard band. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:49 +02:00
Marek Olšák	5b6a0b7fc0	gallium/radeon: set GTT WC on tiled textures Just for consistency. This should have no effect, because OpenGL textures always go to VRAM. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	5a4b74d1ba	gallium/radeon: relax requirements on VRAM placements on APUs This makes Tonga with vramlimit=128 2x faster in Heaven. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	a57309f807	winsys/amdgpu: remove hack for low VRAM configuration A better solution will be used. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b36f19bf98	r600g: disable aniso filtering for non-mipmap textures on EG this is the default behavior of the closed driver when running on VI Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	3bc2d967c4	r600g: clean up aniso state translation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b0d4469519	radeonsi: disable aniso filtering for non-mipmap textures on SI-CI The closed driver does this, but it looks at base_level and last_level and uses a conditional assignment, which LLVM can't generate on SGPRs. That led me to invent this solution that abuses the image descriptor. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	ddd33431c5	radeonsi: clean up aniso state translation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	f7420ef5b4	radeonsi: enable some sampler fields to match the closed driver copied from the Vulkan driver Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	1a98be001f	gallium/radeon: fix maximum texture anisotropy setup We were overdoing it for non-power-of-two values. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	2d7be5d37e	gallium/radeon: never choose a linear tiling for DB surfaces Just for consistency. This is actually not a problem, because both addrlib and radeon check and fix this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	b7878146c4	gallium/radeon: removing dead code for sharing stencil buffers This is a remnant of the times when the DDX was allocating depth-stencil buffers for windows. Now, st/dri allocates them and doesn't share them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	73aeebd772	radeonsi: allow clearing buffers >= 4 GB Only CMASK and DCC clears can use this, because only textures can be so large. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	1dd8832e04	gallium/radeon: allow allocating textures >= 4 GB Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:48 +02:00
Marek Olšák	0689741e51	winsys/radeon: fix printing allocation failures print as unsigned instead of signed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	0ba0933f48	winsys/amdgpu: add support for 64-bit buffer sizes v2: fail in radeon_winsys_bo_create if size > 32 bits Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	7e78b5ed38	pb_buffer: switch pb_buffer::size to 64 bits being able to allocate more than 4 GB may be useful Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	e241a63512	gallium/radeon: remove R600_QUERY_HW_FLAG_TIMER not used anymore Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	0222351fc1	gallium/radeon: merge timer and non-timer query lists All of them are paused only between IBs. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	7347c068d8	r600g: don't manually stop queries for blitter r600_set_active_query_state does it better. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	12fee5b93e	r600g: add pausing pipeline & streamout queries into set_active_query_state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	e90fe60b72	r600g: implement set_active_query_state for pausing occlusion queries Use ZPASS_INCREMENT_DISABLE everywhere. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	5248676f87	r600g: simplify r600_set_occlusion_query_state The caller does the same checking. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	b82893f93a	gallium/radeon: move pipeline stat context flags to common code Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	aa79a3269f	r600g: fix typo in r600 register definitions Acked-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	a4c288d8e1	gallium/radeon: unify checking streamout enable state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:47 +02:00
Marek Olšák	466aa57185	radeonsi: fix mask checking when emitting scissors and viewports Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>	2016-04-12 14:29:46 +02:00
Marek Olšák	f3eebb84eb	radeonsi: implement and rely on set_active_query_state Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Marek Olšák	e599b8f384	gallium: pause queries for all meta ops Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Marek Olšák	26171bd67e	gallium: add pipe_context::set_active_query_state for pausing queries Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-12 14:29:46 +02:00
Bas Nieuwenhuizen	fc67375379	radeonsi: Synchronize a streamout write after read hazard. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-12 13:55:38 +02:00
Hans de Goede	dccdb655a1	nv30: Add missing PIPE_SHADER_CAP_INTEGERS to get_shader_param() Add missing PIPE_SHADER_CAP_INTEGERS for frag shaders to nv30_screen_get_shader_param(). Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-12 11:41:12 +02:00
Haixia Shi	b0e3ba61b5	dri/i965: extend GLES3 sRGB workaround to cover all formats It is incorrect to assume BGRA byte order for the GLES3 sRGB workaround. v2: use _mesa_get_srgb_format_linear to handle all formats Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 02:06:12 -07:00
Eduardo Lima Mitev	ea8a65f503	i965: Add autogenerated 'brw_nir_trig_workarounds.c' to gitignore Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 10:44:19 +02:00
Rhys Kidd	703c1e69d8	glsl: Update hash table comments in constant propagation Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 01:29:19 -07:00
Dave Airlie	afa8707ba9	softpipe: add SSBO/shader atomics support. This adds support for the features required for ARB_shader_storage_buffer_object and ARB_shader_atomic_counters, ARB_shader_atomic_counter_ops. [airlied: some cleanups applied] Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:16:13 +10:00
Dave Airlie	c2aeeca455	draw: add support for passing buffers to vs/gs shaders. Like the image code, but for shader buffers this time. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:36 +10:00
Dave Airlie	081a958bcd	tgsi: add support for buffer/atomic operations to tgsi_exec. This adds support for doing load/store/atomic operations on buffer objects. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:33 +10:00
Dave Airlie	9c7a0d188a	tgsi: set nonhelpermask for vertex shaders For atomic operations we really need to avoid executing unnecessary shaders, so for some tests that just draw a single point we only want one vertex to get processed not 4, this fixes a number of the atomic counters tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-12 14:15:16 +10:00
Ian Romanick	193a5cee6a	nir: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-11 19:24:19 -07:00
Markus Wick	18c8b927e2	nir: Merge redudant integer clamping. Dolphin uses them a lot. Range tracking would be better in the long term, but these two lines work fine for now. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 18:48:50 -07:00
Kenneth Graunke	bfd17c76c1	i965: Port INTEL_PRECISE_TRIG=1 to NIR. This makes the extra multiply visible to NIR's algebraic optimizations (for constant reassociation) as well as constant folding. This means that when the result of sin/cos is multiplied by a constant, we can eliminate the extra multiply altogether, reducing the cost of the workaround. It also means we only have to implement it in one place, rather than in both backends. This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion, which has a ton of sin() calls, but always multiplies them by an immediate constant. The extra multiply gets folded away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:17 -07:00
Kenneth Graunke	b0dffdc616	i965: Pass brw_compiler into brw_preprocess_nir() instead of is_scalar. I want to be able to read other fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:12 -07:00
Kenneth Graunke	808d26c771	nir: Silence unused "options" warning in algebraic passes. Some passes may not refer to options->..., at which point the compiler will warn about an unused variable. Just cast to void unconditionally to shut it up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:44:08 -07:00
Kenneth Graunke	5886cd79a0	nir: Do basic constant reassociation. Many shaders contain expression trees of the form: const_1 * (value * const_2) Reorganizing these to (const_1 * const_2) * value will allow constant folding to combine the constants. Sometimes, these constants are 2 and 0.5, so we can remove a multiply altogether. Other times, it can create more immediate constants, which can actually hurt. Finding a good balance here is tricky. While much more could be done, this simple patch seems to have a lot of positive benefit while having a low downside. shader-db results on Broadwell: total instructions in shared programs: 8963768 -> 8961369 (-0.03%) instructions in affected programs: 438318 -> 435919 (-0.55%) helped: 1502 HURT: 245 total cycles in shared programs: 71527354 -> 71421516 (-0.15%) cycles in affected programs: 11541788 -> 11435950 (-0.92%) helped: 3445 HURT: 1224 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 18:43:55 -07:00
Boyuan Zhang	1c7ba7f156	radeon/uvd: alignment fix for decode message buffer Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-04-11 19:30:47 -04:00
Brian Paul	704d203d5f	st/mesa: replace _mesa_sysval_to_semantic table with function Instead of using an array indexed by SYSTEM_VALUE_x, just use a switch statement. This fixes a regression caused by inserting new SYSTEM_VALUE_ enums but not updating the mapping to TGSI semantics. v2: fix a few switch statement mistakes for compute-related enums Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-11 17:04:13 -06:00
Jason Ekstrand	a9e6213edd	nir/lower_system_values: Add support for several computed values Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:53:03 -07:00
Jason Ekstrand	39103145ff	glsl/shader_enums: Add the other two compute builtins These weren't added before because they are actually calculated values that are computed from other inputs. However, in order to handle them in nir_lower_system_values, it's nice for them to have a canonical location. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:53:00 -07:00
Jason Ekstrand	22836dbefa	glsl/shader_enums: Add an enum for Vulkan InstanceIndex In Vulkan, you have InstanceIndex which begins at the base instance value rather than the zero-based InstanceID of GL. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 13:52:51 -07:00
Emil Velikov	581c8016f8	mesa: add missing header to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	5e010a72c9	drivers/softpipe: add missing header to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	c69ab885d7	mesa: automake: update and reuse X86_SSE41_FILES list Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	28da0d6922	compiler: android: flesh out nir into separate makefile Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	8d51500b2d	compiler: automake: flesh out NIR into separate makefile. Analogous to previous commit - improved readability at the expense of an extra file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	9324afc0e9	compiler: automake: split out glsl into separate makefile Preserve the functionality while keeping the files smaller and more readable. v2: Do not include Makefile.sources from the GLSL makefile (silences automake warnings) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Emil Velikov	3d67780b80	compiler: remove {glsl,nir}/Makefile.sources No longer used as of last commit. v2: Rebase. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Emil Velikov	c481c8f7f1	configure.ac: update the path of the generated files ... in order to determine if we need bison/flex. Failing to locate the files will lead to mandating bison/flex even when building from a release tarball. CC: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-11 19:08:23 +01:00
Emil Velikov	4db8f15a25	glsl: move the android build scripts a level up Analogous to previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	abf7088eb7	glsl: move the scons build script a level up It will allow us to remove the duplicate glsl/Makefile.sources. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 19:08:23 +01:00
Emil Velikov	594e868555	Part revert "gallium/auxiliary: don't build NIR sources with MSVC2008 flags" This reverts commit `41c7912d04` but leaves out the pragma [that inspired the original commit]. Building mesa requires MSVC2013 or later, thus we no longer need this. v2: Use correct include path (src/glsl/nir -> src/compiler/nir) Conflicts: src/gallium/auxiliary/Makefile.am Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2016-04-11 19:08:23 +01:00
Nicolai Hähnle	590a37dc05	GL3: ARB_shader_image_load_store/size is done for radeonsi also in GLES Trivial.	2016-04-11 12:48:10 -05:00
Brian Paul	05aec42d3d	docs: fix Coverity URL	2016-04-11 09:10:39 -06:00
Oded Gabbay	d97f5d60f5	tgsi/doc: fix spelling error Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-11 11:43:43 +03:00
Jason Ekstrand	3aa1a5ee88	nir/lower_system_values: Simplify the computation of LocalInvocationIndex	2016-04-10 23:43:38 -07:00
Connor Abbott	a89c474157	nir: add a pass for lowering (un)pack_double_2x32 v2: Undo unintended change to the signature of nir_normalize_cubemap_coords (Iago). v3: Move to compiler/nir (Iago) v4: Remove Authors from copyright header (Michael Schellenberger) v5 (Sam): - Use nir_channel() and nir_ssa_for_alu_src() helpers (Jason) - Inline lower_double_pack_instr() code into lower_double_pack_block() (Jason). - Initialize nir_builder at lower_double_pack_impl() (Jason). Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	663e6421df	nir: add split versions of (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	b093808d26	nir: don't try to scalarize unpack_double_2x32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	9e31e0a21b	nir: add support for (un)pack_double_2x32 v2 (Sam): - Use uint64 instead of float64 for sources and destinations. (Connor) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	d5d6260329	nir: add i2d and u2d opcodes v2: - Assert supports_int and don't fallback to nir_fmov (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	b16d06252e	nir: add d2i, d2u, d2b opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Connor Abbott	a4bce07dc6	nir: add support for d2f and f2d Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Iago Toral Quiroga	fab5d4cd95	nir/glsl_to_nir: set bit_size on ssbo_load result v2 (Sam): - Add missing bit_size assignment when ssbo_load destination is a boolean. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:29:27 +02:00
Samuel Iglesias Gonsálvez	a741378cb5	nir/glsl_to_nir: add bit-size info to add_instr() Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:28:01 +02:00
Connor Abbott	4b37c64f3b	nir/split_var_copies: handle doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	106a1b5501	nir/instr_set: handle 64-bit bit-sizes v2: Revert spurious change in nir_opt_cse.c (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f2ccb63be1	nir: handle doubles in nir_deref_get_const_initializer_load() v2 (Sam): - Use proper bitsize value when calling to nir_load_const_instr_create() (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	41c2541fc7	nir/print: add support for printing doubles and bitsize v2: - Squash the printing doubles related patches into one patch (Sam). v3: - Print using PRIx64 format: long is 32-bit on some 32-bit platforms but long long is basically always 64-bit (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Connor Abbott	f5551f8a8b	nir/glsl_to_nir: support doubles v2: - Don't set sized types to the destination of texture related opcodes. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	8e69782e3e	nir/lower_load_const_to_scalar: support doubles and multiple bit sizes v2 (Sam): - Add assert to detect bitsizes differents than 32 and 64 (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:05 +02:00
Iago Toral Quiroga	12f628adcb	nir/lower_to_source_mods: Handle different bit sizes v2 (Sam): - Use helper to get base type from nir_alu_type. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	3663a2397e	nir: add bit_size info to nir_load_const_instr_create() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	a5b17ae745	nir/lower_vec: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez	e3edaec739	nir: add bit_size info to nir_ssa_undef_instr_create() v2: - Make the users to give the right bit_sizes as arguments (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	41a39e3384	nir/locals_to_regs: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Connor Abbott	40d1b671a9	nir/from_ssa: adapt to different bit sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-11 08:27:04 +02:00
Timothy Arceri	4979cec820	i965: fix struct type in comment Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-11 14:03:09 +10:00
Jason Ekstrand	7d58cfa366	nir: Add a pass for gathering various bits of shader info Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-10 20:43:47 -07:00
Ilia Mirkin	875543e270	i965: enable OES_texture_buffer on gen7+ It will only end up getting exposed on gen8+ since it requires GL ES 3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is completed there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-10 20:24:26 -07:00
Dave Airlie	6f5f818b6d	docs: add some missing softpipe entries. I just forgot these when I added this stuff. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-11 13:14:48 +10:00
Kenneth Graunke	26c56e24e7	glsl: Don't remove XFB-only varyings. Consider the case of linking a program with both a vertex and fragment shader. The VS may compute output varyings that are intended for transform feedback, and not read by the fragment shader. In this case, var->data.is_unmatched_generic_inout will be true, but we still cannot eliminate the varyings. We need to also check !var->data.is_xfb_only. Fixes failures in ES31-CTS.gpu_shader5.fma_precision_*, which happen to use transform feedback in a way we apparently hadn't seen before. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-10 19:03:06 -07:00
Kenneth Graunke	ce84a92df5	i965/disasm: Decode per-slot offsets. We just never bothered to decode this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:32 -07:00
Kenneth Graunke	20c8f36508	i965/disasm: Decode "channel mask present" bit correctly. Bit 15 means "interleave" for most messages, but for SIMD8 messages it means "use channel masks". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:20 -07:00
Kenneth Graunke	b790232524	i965/disasm: Simplify the URB opcode printing with ?:. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-09 21:10:11 -07:00
Ilia Mirkin	9b5bd20eb2	glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310 The GLSL 4.20 and ESSL 3.00 specs don't list 'buffer' as a reserved keyword. Make the parser ignore it unless GLSL 4.30 / ESSL 3.10 are used, or ARB_shader_storage_buffer_objects is enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-04-09 20:41:54 -04:00
Jason Ekstrand	bff7a8c4f3	anv/pipeline: Set up flat enables correctly	2016-04-09 17:06:59 -07:00
Jason Ekstrand	1275c7c744	genxml: Fix the name of a 3DSTATE_SF/SBE field on gen6-7.5	2016-04-09 17:02:21 -07:00
Jason Ekstrand	aa6f9a4e1e	genxml: Break output detail of 3DSTATE_SF on gen7 into a struct This makes it work like 3DSTATE_SBE[_SWIZ] on gen7+	2016-04-09 17:00:22 -07:00
Jason Ekstrand	ddae342618	genxml: Fix up MOCS in RENDER_SURFACE_STATE on gen6 to match gen7	2016-04-09 16:59:04 -07:00
Ilia Mirkin	cdb6fa91fa	nvc0: handle the case where there are no framebuffer attachments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:44 -04:00
Ilia Mirkin	59ca92137b	nv50,nvc0: support sending string markers down into the command stream This should hopefully make it a little easier to debug with GL applications like glretrace and looking at command streams. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:43 -04:00
Ilia Mirkin	f9480d7918	nv50,nvc0: add invalidate_resource support for buffer resources Provide a callback to reallocate the underlying storage of a resource so that it is not bound to any existing fences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-09 14:55:43 -04:00
Eric Anholt	30b818d5eb	vc4: Move FRAG_X/Y/REV_FLAG to a QFILE like VPM or TLB color writes. This gives us one less set of special instruction generation cases, and instead just the case for returning the correct register to read.	2016-04-08 18:41:46 -07:00
Eric Anholt	f029932cac	vc4: Allow TLB Z/color/stencil writes from any ALU operation in QIR. This lets us write the Z directly from the FTOI for computed Z, and may let us coalesce color writes in the future. No change in my shader-db, but clearly drops an instruction in piglit's early-z test.	2016-04-08 18:41:46 -07:00
Eric Anholt	44d7b8ad12	vc4: Add a helper function for the construction of qregs. The separate declaration of the struct is not helping clarity, and I was going to be writing a whole lot more of these in the upcoming patches.	2016-04-08 18:41:45 -07:00
Eric Anholt	114c8b38d3	vc4: Add missing scheduling dependency for MS color writes.	2016-04-08 18:41:45 -07:00
Eric Anholt	483c172989	vc4: Drop the multi_instruction distinction for QIR instructions. It wasn't correctly flagged everywhere, and QPU generation now handles the only remaining case that was paying attention to it. No change on shader-db.	2016-04-08 18:41:45 -07:00
Eric Anholt	a8b525f8c4	vc4: Handle SF on instructions that write r4. Normal SFU writes couldn't have SF because they were marked as multi_instruction, but tex_result and tlb_color_read weren't. This ended up not being a problem according to anything in shader-db, but it seems possible.	2016-04-08 18:41:45 -07:00
Eric Anholt	e46b48963a	vc4: Allow multi-instruction QIR nodes to get VPM optimization. There used to be multi-instruction operations that would use src[] twice, which is why we couldn't do some optimizations on them. This is no longer the case. total instructions in shared programs: 77973 -> 77969 (-0.01%) instructions in affected programs: 84 -> 80 (-4.76%) total estimated cycles in shared programs: 234165 -> 234157 (-0.00%) estimated cycles in affected programs: 92 -> 84 (-8.70%)	2016-04-08 18:41:45 -07:00
Eric Anholt	99a759a4a3	vc4: Switch to using NIR_PASS macros. This gets us better validation of our NIR transformations.	2016-04-08 18:41:45 -07:00
Eric Anholt	7030eadbed	vc4: Handle nir_intrinsic_load_user_clip_plane as a vec4. I liked having all my NIR be scalar, but nir_validate() complains that the intrinsic writes 4 components but the destination we set up was only 1 component. I could generate a new scalar variant, but it's a lot easier to just leave it as a vec4. This doesn't hurt codegen since we GC unused uniforms, and UCP dot products use all the components anyway.	2016-04-08 18:40:55 -07:00
Rhys Kidd	40e77741cf	vc4: Emit a warning and proceed for handling loops in NIR. We don't really support control flow yet, but it's a lot nicer to render something and warn on stderr than to crash. Fixes the following piglit tests: - shaders/complex-loop-analysis-bug - shaders/glsl-fs-discard-04 Converts the following piglit tests from crash to fail: - shaders/glsl-fs-continue-inside-do-while - shaders/glsl-fs-loop - shaders/glsl-fs-loop-continue - shaders/glsl-fs-loop-nested - shaders/glsl-texcoord-array - shaders/glsl-vs-continue-inside-do-while - shaders/glsl-vs-loop - shaders/glsl-vs-loop-continue - shaders/glsl-vs-loop-nested No piglit regressions. v2 (Eric): Add stronger stderr warning. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	2450b219e5	vc4: Add a stub for NIR->QIR of control flow function nodes We shouldn't have any NIR functions present since all GLSL functions get inlined, but this would be a more informative error if it does happen. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	e5997778bc	vc4: Add better debug of NIR->QIR control flow graph failure Ensure NIR control flow graph nodes that are unhandled in QIR are reported with sufficient verbosity to aid debugging. This improves piglit outputs, amongst other tools. There are no other remaining uses of assert(0) as a blunt tool within vc4. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Rhys Kidd	e529dd179f	vc4: Remove unused include from vc4_program.c Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-04-08 18:28:43 -07:00
Lars Hamre	e25c24c638	glsl: handle unsigned int wraparound in link_shaders() v2: change check_explicit_uniform_locations() to return an unsigned 0 (Timothy Arceri) We were storing the int result of check_explicit_uniform_locations() in num_explicit_uniform_locs as an unsigned int which caused it to be 4294967295 when a -1 was returned. This in turn would cause the following error during linking: error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304) Results from running piglit tests/all with this patch and when ARB_explicit_uniform_location disabled: changes: 178 fixes: 176 regressions: 2 The two regressions are for the following tests: glean@glsl1-matrix column check (1) glean@glsl1-matrix column check (2) which regress from FAIL to CRASH. The regressions are acceptable because the tests are currently failing due to the aforementioned linker error. Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-09 11:06:04 +10:00
Jason Ekstrand	d4a28ae52a	anv/meta: Make clflushes conditional on !devinfo->has_llc	2016-04-08 17:07:49 -07:00
Jason Ekstrand	c226e72a39	anv/formats: Advertise blit support for stencil Thanks to advances in the blit code, we can do this now. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:59:29 -07:00
Jason Ekstrand	e3312644cb	anv/blit2d: Add support for W-tiled destinations Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 15:59:26 -07:00
Jason Ekstrand	0a6842c1bd	isl/surface_state: Set the correct pitch for W-tiled surfaces Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:52 -07:00
Jason Ekstrand	2e827816fa	anv/blit2d: Add another passthrough varying to the VS We need the VS to provide some setup data for other stages. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:49 -07:00
Jason Ekstrand	b377c1d08e	anv/image: Remove the offset parameter from image_view_init The only place we were using this was in meta_blit2d which always creates a new image anyway so we can just use the image offset. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:45 -07:00
Jason Ekstrand	f9a2570a06	anv/blit2d: Add a bind_dst helper function Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:42 -07:00
Jason Ekstrand	15a9468d85	anv/blit2d: Simplify create_iview Now it just creates the image and view. The caller is responsible for handling the offset calculations. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:40 -07:00
Jason Ekstrand	b8f3909b73	nir/gather_info: Handle discard_if Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:36 -07:00
Jason Ekstrand	819d0e1a7c	anv/meta2d: Add support for blitting from W-tiled sources on gen7 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 15:58:03 -07:00
Jason Ekstrand	b0a5ca5cfc	isl: Remove surf_get_intratile_offset_el The intratile offset may not be a multiple of the element size so this calculation is invalid. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:58:01 -07:00
Jason Ekstrand	b37502b983	isl: Rework the get_intratile_offset function The old function tried to work in elements which isn't, strictly speaking, a valid thing to do. In the case of a non-power-of-two format, there is no guarantee that the x offset into the tile is a multiple of the format block size. This commit refactors it to work entirely in terms of a tiling (not a surface) and bytes/rows. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:58 -07:00
Jason Ekstrand	4caba94086	anv/image: Expose the guts of CreateBufferView for meta Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:55 -07:00
Jason Ekstrand	4ee80e8816	anv/blit2d: Refactor in preparation for different src/dst types Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:52 -07:00
Jason Ekstrand	85b9a007ac	anv/blit2d: Add layouts for using a texel buffer source Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:49 -07:00
Jason Ekstrand	28eb02e345	anv/blit2d: Rename the descriptor set and pipeline layouts Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:47 -07:00
Jason Ekstrand	00e70868ee	anv/blit2d: Enhance teardown and clean up init error paths Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:45 -07:00
Jason Ekstrand	43fbdd7156	anv/blit2d: Factor binding the source image into a helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:43 -07:00
Jason Ekstrand	5187ab05b8	anv/blit2d: Inline meta_emit_blit2d Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:41 -07:00
Jason Ekstrand	b0a6cfb9b4	anv/blit2d: Pass the source pitch into the shader Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:39 -07:00
Jason Ekstrand	e466164c87	anv/blit2d: Break the texelfetch portion of shader building into a helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:37 -07:00
Jason Ekstrand	afada45590	anv/blit2d: Fix whitespace Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:35 -07:00
Jason Ekstrand	9553fd2c97	anv/blit2d: Fix a NIR writemask Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:32 -07:00
Jason Ekstrand	b38a0d64ba	anv/meta2d: Don't declare an array sampler in the fragment shader With the new blit framework we aren't using array textures and, from talking with Nanley, we don't think it's going to be useful in the future either. Just get rid of it for now. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:57:28 -07:00
Jason Ekstrand	dd6f720046	anv/blit2d: Remove the tex_dim parameter from copy_fragment_shader Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-04-08 15:56:52 -07:00
Jason Ekstrand	6cc7aec5b0	i965/tiled_memcopy: Get rid of the direction parameter to get_memcpy Now that we can use the much simpler rgba8_copy function, we don't need to hand different functions out based on direction. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:09:20 -07:00
Jason Ekstrand	d2b32656e1	i965/tiled_memcpy: Rework the RGBA -> BGRA mem_copy functions This splits the two copy functions into three: One for unaligned copies, one for aligned sources, and one for aligned destinations. Thanks to the previous commit, we are now guaranteed that the aligned ones will only operate on aligned memory so they should be safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:09:15 -07:00
Jason Ekstrand	f6f54a29ca	i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions Each of the [de]tiling functions has three mem_copy calls: 1) Left edge to tile boundary 2) Tile boundary to tile boundary in a loop 3) Tile boundary to right edge Copies 2 and 3 start at a tile edge so the pointer to tiled memory is guaranteed to be at least 16-byte aligned. Copy 1, on the other hand, starts at some arbitrary place in the tile so it doesn't have any such alignment guarantees. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-04-08 12:08:51 -07:00
Ben Widawsky	e5295b5fb4	i965: Check eu/subslices are > 0 Now that the check is restricted to gen8+, we should always get back a non-zero positive value for the EU and subslice counts. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:52:29 -07:00
Ben Widawsky	cc01b63d73	i965: Fix eu/subslice warning Older gen platforms do not actually return a value for subslice and eu total; (IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until we have the actual GPU generation to avoid useless warnings when running on older platforms with older kernels. Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:52:29 -07:00
Ben Widawsky	4213b00e30	i965: Extract SSEU configuration info Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-08 11:51:01 -07:00
Brian Paul	4420f189b6	st/mesa: fix glReadBuffer() assertion failure If the first call in a GL app is glReadPixels(GL_FRONT) we'd fail the assert(st->ctx->FragmentProgram._Current) at st_atom_shader.c:114 in update_fp(). This is because we were calling st_validate_state() without first updating Mesa state with _mesa_update_state(). The regression came from commit `83b589301f` "st/mesa: fix frontbuffer glReadPixels regressions". The new piglit gl-1.0-simple-readbuffer test exercises this. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-08 09:49:05 -06:00
Thomas Hindoe Paaboel Andersen	b9855dcdf7	st/va: avoid dereference after free in vlVaDestroyImage Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-04-08 06:57:17 +01:00
Jason Ekstrand	e26a978773	Merge remote-tracking branch 'public/master' into vulkan	2016-04-07 16:56:34 -07:00
Jason Ekstrand	15895bf777	i965/fs: Use the scale helper in surface_builder As requested by Curro	2016-04-07 16:49:09 -07:00
Marek Olšák	1cd19ebc4a	radeonsi: do per-pixel clipping based on viewport states In other words, vport scissors are derived from viewport states. If the scissor test is enabled, the intersection of both is used. The guard band will disable clipping, so we have to clip per-pixel. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-08 00:23:05 +02:00
Samuel Pitoiset	059308db84	nv50/ir: do not try to attach JOIN ops to ATOM This might result in an INVALID_OPCODE dmesg error in case a join is attached to an atomic operation. Spotted with arb_shader_image_load_store-host-mem-barrier on GK104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-04-07 23:10:26 +02:00
Nicolai Hähnle	2abe4f8d7d	radeonsi: raise number of samplers per shader to 32 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94835 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	9d2693f58a	radeonsi: expand the compressed color and depth texture masks to 64 bits This is in preparation of raising the number of exposed sampler views to 32 bits, which will raise the total number of sampler views to 33 for the polygon stipple texture. That texture should never be compressed (and it's certainly not a depth texture), but this approach seems cleaner to me than special-casing the last slot in all affected code paths. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	f270067ef9	radeonsi: replace magic 16 by SI_NUM_USER_SAMPLERS Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:06 -05:00
Nicolai Hähnle	f09036f6c0	gallium: raise PIPE_MAX_SAMPLERS to 32 The previous value of 18 was motivated by having drivers that want to expose 16 samplers but also use some additional samplers for internal use. Raising the value even higher isn't going to hurt that case. On the other hand, some drivers actually use PIPE_MAX_SAMPLERS as the number of samplers they expose externally, so raising this number above 32 is fragile (because several places in the code use bitfields, and tracking down and widening all of them is prone to miss some case). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	84c4d069ac	st/glsl_to_tgsi: make samplers_used an uint32_t (v2) It is used as a bitfield, so it seems cleaner to keep it unsigned. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	4bfcc86bf9	tgsi/scan: add an assert for the size of the samplers_declared bitfield The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	cc39879989	draw/aaline: stronger guard against no free samplers (v2) Line anti-aliasing will fail when there is no free sampler available. Make the corresponding guard more robust in preparation of raising PIPE_MAX_SAMPLERS to 32. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:05 -05:00
Nicolai Hähnle	040f5cb09e	util/pstipple: stronger guard against no free samplers (v2) When hasFixedUnit is false, polygon stippling will fail when there is no free sampler available. Make the corresponding guard more robust in preparation of raising PIPE_MAX_SAMPLERS to 32. The literal 1 is a (signed) int, and shifting into the sign bit is undefined in C, so change occurrences of 1 to 1u. v2: add an assert for bitfield size and use 1u << idx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2016-04-07 13:15:02 -05:00
Brian Paul	b7e67b2337	svga: new SVGA_MSAA env var to disable/enable MSAA pixel formats On by default. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-07 11:42:43 -06:00
Brian Paul	9f443af449	svga: add some trivial null pointer checks These small mallocs will probably never fail, but static analysis tools may complain about the missing checks. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-07 11:42:43 -06:00
Samuel Pitoiset	60cf2fa477	trace: add missing set_shader_images() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 18:52:27 +02:00
Marek Olšák	5fac4887d8	radeonsi: disable perfect ZPASS counts for PIPE_QUERY_OCCLUSION_PREDICATE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 13:58:01 +02:00
Marek Olšák	baa0b3f4cc	radeonsi: don't use the real barrier instruction in tess ctrl shaders Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-07 13:58:01 +02:00
Michel Dänzer	715e97e342	Revert "clover: Fix build against clang SVN >= r265359" This reverts commit `0daab9878d`. The corresponding clang change was reverted. Trivial.	2016-04-07 17:03:09 +09:00
Jason Ekstrand	05db680248	nir/types: Add a wrapper for count_attribute_slots Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-07 09:44:11 +02:00
Kristian Høgsberg Kristensen	068935844c	genxml: Add GEN6 genxml Not used yet, but let's put it here for now.	2016-04-06 21:08:34 -07:00
Dave Airlie	828d84c8e2	r600: use radeon_emit in a few more places in evergreen_compute This is just a cleanup of the code. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:26 +01:00
Dave Airlie	0c40b6f96c	r600: make compute global buffer functions static. This moves things around so that the global buffer handling functions in evergreen_compute.c are static. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:22 +01:00
Dave Airlie	a5d247dda0	r600: make two compute functions static. These aren't used outside evergreen_compute.c Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:17 +01:00
Dave Airlie	41558efa87	r600: using pipe_grid_info more in evergreen_compute. No reason to pull the pieces apart here, also make one of the functions static as it's unused outside this. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:13 +01:00
Dave Airlie	a6e17d7d69	r600: in evergreen_compute use ctx consistently instead of ctx_ Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:09 +01:00
Dave Airlie	aeb2be3a2f	r600: use rctx consistently in evergreen_compute.c Another step towards cleaning this up. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:39:05 +01:00
Dave Airlie	0560c82ff6	r600: cleanup whitespace in evergreen_compute.c This aligns the code with the style of the rest of the driver. Makes editing it a lot less painful. Acked-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 04:38:51 +01:00
Edward O'Callaghan	6fc3e7c988	GL3.txt: Mark ARB_framebuffer_no_attachments as done Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	ea310f2b38	r600g: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	483a686f80	radeonsi: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:59 +10:00
Edward O'Callaghan	1156cad405	radeonsi: Improve assert info out of si_set_framebuffer_state() Let's give the developer a little hand if we are going to assert on a zero literal at the end of a branch. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	bb1bd0ddd7	radeonsi: Allow 16 samples MSAA mode for PIPE_FORMAT_NONE For ARB_framebuffer_no_attachment; A is_format_supported() query with 'PIPE_FORMAT_NONE' passed implies a query of the number of samples supported from the framebuffer with no attachment. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	63f2b2f2c0	softpipe: Set samples and layers in set_framebuffer_state() cb Carries across the number of samples and layers state in the 'softpipe_set_framebuffer_state()' callback. This state is part of 'ARB_framebuffer_no_attachments' support. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	c6a514d7df	mesa/st: Update framebuffer state with no.of samples,layers Handle the case of ARB_framebuffer_no_attachment. Also, kill off a dead debug printf() call while we are here. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	7ff28d2af0	gallium/trace: Dump no.of samples and layers in fb state Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	0b7075fed7	gallium: Put no.of {samples,layers} into pipe_framebuffer_state Here we store the number of samples and layers directly in the pipe_framebuffer_state so that in the case of ARB_framebuffer_no_attachment we may make use of them directly. Further, we adjust various gallium/auxiliary helper functions accordingly. V2: Convert branches in util_framebuffer_get_num_layers() and util_framebuffer_get_num_samples() to their canonical form. V3: 'git stash pop' the typo fix of 'cbufs' which should be 'nr_cbufs' that was missing in V2, woops! Thanks Marek for pointing this out yet again. V4: Squash in the following patch: 'gallium/util: Ensure util_framebuffer_get_num_samples() is valid' Upon context creation, internal driver structures are malloc()'ed and memset() to zero them. This results in an invalid number of samples 'by default'. Handle this in the simplest way to avoid elaborate and probably equally sub-optimal solutions. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-07 12:03:58 +10:00
Edward O'Callaghan	b512b5fd36	mesa/st: Set _NumSamples in update_framebuffer_state() Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported with a framebuffer using no attachment. V.2: Rewrite MSAA mode loop to be more general. V.3: Move comment to right place after loop was rewritten. V.4: [airlied] remove unneeded variable, and assert, and unneeded pipe assignment Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-07 12:02:06 +10:00
Edward O'Callaghan	2016e9ffda	gallium: Obtain ARB_framebuffer_no_attachment constants Set default values for the constants required in ARB_framebuffer_no_attachments and obtained the number of layers from ``PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS``. We also obtain the MaxFramebufferSamples value using a query back to the driver for PIPE_FORMAT_NONE. V.1: Merge if branch predicates into one branch. Move const init into st_init_limits() [airlied: whitespace fixup] Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:44 +10:00
Edward O'Callaghan	4bc9130fba	gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:44 +10:00
Edward O'Callaghan	85f79f0c75	mesa/st: Use _mesa_geometric_ functions appropriately Change references to gl_framebuffer::Width, Height, MaxNumLayers and Visual::samples to use the _mesa_geometric_ convenience functions for those places where the geometry of the gl_framebuffer is needed. This is in contrast to the geometry of the intersection of the attachments of the gl_framebuffer. This patch paves the way to enable GL_ARB_framebuffer_no_attachments for all gallium drivers. V.2: Remove intermediate variable state. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:56:35 +10:00
Edward O'Callaghan	b40375a21c	mesa: Add comment to framebuffer_parameteri() V.2: Change 'N.B.,' to 'NOTE:'. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-07 11:55:33 +10:00
Jason Ekstrand	c62db279b6	i965/sf_state: Pull flat_enables out of prog_data Previously, we were walking over the shader source to figure out which inputs should be marked flat. Now, we can just pull it out of prog_data. This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it also means that it will get properly cached. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	e61cc87c75	i965/fs: Add a flat_inputs field to prog_data Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	5c5a9b7bf6	brw/device_info: Add a helper for getting a device name This is needed by the Vulkan driver Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	a241ab43b5	i965/fs_surface_builder: Mask signed integers after conversion Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-06 18:08:56 -07:00
Jason Ekstrand	3921b64e63	i965/fs: Make the repclear shader support either a uniform or a flat input In the Vulkan driver we use a single flat input instead of a uniform because setting up push constants is more disruptive to the pipeline than setting up another vertex input. This uses the number of uniforms as a key to keep it working for the GL driver. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:50 -07:00
Jason Ekstrand	061969f9dd	i965: Move get_hw_prim_for_gl_prim to brw_util.c It's used by brw_compile_gs in brw_vec4_gs_visitor.cpp so it needs to be in a file that's linked into libi965_compiler.la. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-06 18:08:47 -07:00
Bas Nieuwenhuizen	3393358115	radeonsi: set shader calling conventions Note that old mesa + new LLVM or new mesa + old LLVM breaks with this change and the corresponding LLVM change (D18559). For LLVM version <= 3.8 we use the old method, but we can't detect people using a post 3.8 svn version that is still too old. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-06 21:54:35 +02:00
Marek Olšák	0293d72fa5	drirc: add a workaround for blackness in Warsow Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>	2016-04-06 12:53:40 +02:00
Ilia Mirkin	2e123e1a25	glsl: use has_shader_storage_buffer_objects helper Replaces open-coded logic with existing helper. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-05 20:27:32 -04:00
Timothy Arceri	5d39f03806	glsl: remove remaining tabs in link_uniform_blocks.cpp Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:33 +10:00
Timothy Arceri	7ef57aa685	mesa: remove unused IsShaderStorage field Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:28 +10:00
Timothy Arceri	f1293b2f9b	glsl: fully split apart buffer block arrays With this change we create the UBO and SSBO arrays separately from the beginning rather than putting them into a combined array and splitting it apart later. A bug with UBO and SSBO stage reference querying is also fixed, as we now use the block index to look up the references in the separate arrays, not the combined buffer block array. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-04-06 09:56:24 +10:00
Rob Clark	506b561ba7	freedreno/ir3: insert extra move into phi We had an implicit assumption that the phi src was assigned in its source (pred) block leading into the phi. But this is not true with NIR, so we can't just ignore the source block specified in the nir_phi_src. Insert an extra mov in the source block. If it is not required the CP pass will take it back out again. Fixes: ./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test ./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test and probably others. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-05 15:04:43 -04:00
Rob Clark	f9cdbf4405	freedreno/ir3: eliminate unnecessary absneg's The frontend inserts (abs) and (neg)'s to convert between NIR boolean (~0/0) and native boolean (1/0). So we'd end up with things like: cmps.s.ge r1.x, ... absneg.s r1.x, (neg)r1.x absneg.s r1.x, (abs)r1.x sel.b32 r2.x, r0.x, r1.x, r0.y The (neg) already gets collapsed due to the following (abs). Now by realizing that r1.x comes from a cmps.s instruction, we can drop the (abs) as well. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-05 15:04:25 -04:00
Michel Dänzer	0daab9878d	clover: Fix build against clang SVN >= r265359 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2016-04-05 17:00:58 +00:00
Bas Nieuwenhuizen	799789ba99	radeonsi: use bounded indexing for samplers Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-05 19:19:18 +02:00
Bas Nieuwenhuizen	713353db18	radeonsi: use bounded indexing for constant buffers Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-05 19:19:07 +02:00
Marek Olšák	a64dbdf612	gallium/radeon: allow multiple exports of the same texture with different usage Instead of failing an assertion, disable DCC and CMASK on the first export that needs it, and merge the external usage flags. v2: clear the EXPLICIT_FLUSH flag if it's not set; whitespace fixes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-04-05 15:32:40 +02:00
Marek Olšák	25f96d2b97	docs/relnotes: document EGL_KHR_reusable_sync	2016-04-05 15:32:40 +02:00
Dongwon Kim	70299474f5	egl: add EGL_KHR_reusable_sync to egl_dri This patch enables an EGL extension, EGL_KHR_reusable_sync. This new extension basically provides a way for multiple APIs or threads to be executed synchronously via a "reusable sync" primitive shared by those threads/API calls. This was implemented based on the specification at https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt v2 - use thread functions defined in C11/threads.h instead of using direct pthread calls - make the timeout set with reference to CLOCK_MONOTONIC - cleaned up the way expiration time is calculated - (bug fix) in dri2_client_wait_sync, case EGL_SYNC_CL_EVENT_KHR has been added. - (bug fix) in dri2_destroy_sync, return from cond_broadcast call is now stored in 'err' instead of 'ret' to prevent 'ret' from being reset to 'EGL_FALSE' even in successful case - corrected minor syntax problems v3 - dri2_egl_unref_sync now became 'void' type. No more error check is needed for this function call as a result. - (bug fix) resolved issue with duplicated unlocking of display in eglClientWaitSync when type of sync is "EGL_KHR_REUSABLE_SYNC" Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-04-05 15:24:57 +02:00
Rob Clark	3e13572826	freedreno/ir3: deal with duplicate phi sources Otherwise we end up with funny things like: mov.f32f32 r0.x, r1.y mov.f32f32 r0.x, r1.y (It doesn't happen as much after fixing the problem w/ CP into phi src, but it can still happen since we aren't too clever about generating phi sources in the first place.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	f8feb97ba5	freedreno/ir3: fix silly brain-fart in RA We want to consider all the vars, not 1/32nd of them, when extending live-ranges. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	8e451c2d06	freedreno/ir3: don't cp into phi's The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	383b6e87f9	freedreno/ir3: we can't store immediate values Fixes some transform-feedback piglits, like: bin/ext_transform_feedback-nonflat-integral Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	d47fb856af	freedreno/ir3: add dumping for use/def/live-in/live-out Turned out to be useful to debug an issue in RA. Let's keep it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	38ae05a340	freedreno/ir3: drop unused instr category arg No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	19739e4fb9	freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	70735643f4	freedreno/ir3: encode instruction category in opc_t Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Jason Ekstrand	5ea3647f89	i965/fs: Move the code for load/store_shared to emit_cs_intrinsic They are compute-shader only and that's where the code for doing atomics on shared variables lives so it seems to make sense. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-04 15:56:50 -07:00
Jason Ekstrand	80c72a8ea7	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-04-04 15:56:39 -07:00
Jason Ekstrand	e5c833db5a	i965/compiler: Remove a redundant declaration of brw_compiler_create	2016-04-04 14:51:35 -07:00
Kenneth Graunke	3babb7b0a4	nir: Use PRIi64 and PRIu64 instead of %ld and %lu. %ld and %lu aren't the right format specifiers for int64_t and uint64_t on 32-bit (x86) systems. They're %zu on Linux and %Iu on Windows. Use the standard C99 macros in hopes that they work everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-04 14:38:48 -07:00
Kenneth Graunke	da5d08707b	i965: Fix invalid pointer read in dead_control_flow_eliminate(). There may not be a previous block. In this case, there's no real work to do, so just continue on to the next one. v2: Update for bblock->prev() API change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-04 14:34:40 -07:00
Kenneth Graunke	9486614938	i965: Make bblock_t::next and friends return NULL at sentinels. The bblock_t::prev/prev_const/next/next_const API returns bblock_t pointers, rather than exec_nodes. So it's a bit surprising. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-04 14:34:16 -07:00
Kenneth Graunke	5509d43a11	glsl: Lower variable indexing of system value arrays unconditionally. lower_variable_index_to_cond_assign() did not handle system values. gl_SampleMaskIn[] is a system value, and also an array. Accessing it with a variable index would trigger an unreachable() assert. Rather than adding a new EmitNoIndirectSystemValues flag, we simply lower unconditionally. There is exactly one case where this occurs, and for all current drivers, lowering produces optimal code. Even for future drivers with 32x MSAA, it produces reasonable code. Fixes Piglit's new samplemaskin-indirect test. Also fixes many ES31-CTS tests when OES_sample_variables is enabled. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 14:29:21 -07:00
Jason Ekstrand	db35a851ad	i965/defines: Unconditionally define primitives	2016-04-04 14:25:36 -07:00
Jason Ekstrand	6a04968784	Merge remote-tracking branch 'public/master' into vulkan	2016-04-04 13:58:05 -07:00
Jason Ekstrand	88ef2476dc	i965/peephole_ffma: Only match a mul+add if none of the ops are exact Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 13:48:10 -07:00
Jason Ekstrand	eb93d6dec8	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-04-04 13:48:10 -07:00
Jason Ekstrand	fe247bbe92	nir: Stop double-printing function arguments	2016-04-04 12:10:20 -07:00
Jason Ekstrand	cb317b8d07	glsl: Stop force-enabling compute shaders This isn't needed since we no longer use the GLSL compiler in Vulkan.	2016-04-04 12:09:12 -07:00
Jason Ekstrand	4d040a4ad3	glsl/standalone: Get rid of the unneeded _mesa_error_no_memory stub This hasn't been needed since we stopped using the GLSL compiler in the Vulkan driver and it was tripping up scons. Removing it fixes the scons build.	2016-04-04 12:07:51 -07:00
Kenneth Graunke	65fbc43d54	i965: Add an INTEL_PRECISE_TRIG=1 option to fix SIN/COS output range. The SIN and COS instructions on Intel hardware can produce values slightly outside of the [-1.0, 1.0] range for a small set of values. Obviously, this can break everyone's expectations about trig functions. According to an internal presentation, the COS instruction can produce a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One suggested workaround is to multiply by 0.99997, scaling down the amplitude slightly. Apparently this also minimizes the error function, reducing the maximum error from 0.00006 to about 0.00003. When enabled, fixes 16 dEQP precision tests dEQP-GLES31.functional.shaders.builtin_functions.precision. {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec4,vec4}. at the cost of making every sin and cos call more expensive (about twice the number of cycles on recent hardware). Enabling this option has been shown to reduce GPUTest Volplosion performance by about 10%. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-04 11:35:16 -07:00
Jason Ekstrand	8c8157bf6f	Remove more spirv2nir remnants	2016-04-04 11:24:48 -07:00
Kenneth Graunke	3aa51e02d6	i965: Allow 8x MSAA on >= 64bpp formats on Gen8+. See commit `3b0279a69` - this restriction is documented in the "Surface Format" field of RENDER_SURFACE_STATE. Looking at newer documentation, this restriction appears to exist on Haswell, but no longer applies on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-04-04 10:41:29 -07:00
Brian Paul	1eeec7ec41	docs: remove stray 'TBD' in 11.2.0 relnotes file	2016-04-04 10:33:11 -06:00
Emil Velikov	35132c413c	docs: add news item and link release notes for 11.2.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-04 12:57:56 +01:00
Emil Velikov	dc4923d41f	docs: add sha256 checksums for 11.2.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e7fb889dcc`)	2016-04-04 12:55:55 +01:00
Emil Velikov	7dc11ed0b2	docs: Update 11.2.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ff9ddb9eb1`)	2016-04-04 12:55:54 +01:00
Dave Airlie	f9b8b48bed	mesa/get: fix MAX_GEOMETRY_SHADER_STORAGE_BLOCKS this was returning the fragment shader value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-04 10:52:25 +01:00
Ilia Mirkin	4bc3b1ca48	nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-04 00:32:48 -04:00
Ilia Mirkin	dab40d8083	docs: add note about GL_EXT_base_instance, sort entries Trivial. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-03 21:18:17 -04:00
Ilia Mirkin	d76e1cd2dd	mesa: expose EXT_base_instance in ES3 contexts This extension is identical to ARB_base_instance. Reuse the same entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 20:40:55 -04:00
Ilia Mirkin	807e2c27ac	mesa: expose EXT_polygon_offset_clamp in ES contexts The extension spec was extended to also support ES. This functionality is provided all the way back to ES 1.0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 20:40:55 -04:00
Kenneth Graunke	40628886ca	glsl: Print "precise" on ir_variable nodes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-03 17:33:38 -07:00
Jose Fonseca	7ad49daca6	gallivm: Introduce lp_format_intrinsic. For adding .v4f32 like suffixes to intrinsics, taking special care for scalar case, which was being often neglected. This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars.) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-04 00:06:09 +01:00
Ilia Mirkin	7af12a8dc6	glsl: make sampler2DMSArray available in ESSL 3.20 Also avoid double-adding the sampler2DMS types when the array ext is enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:52 -04:00
Ilia Mirkin	aebb0e0186	glsl: make ssbo predicate return true when in a GLSL 430 or ESSL 310 shader I can't tell whether this actually matters, but we're creating function signatures with this predicate, so it should probably match when SSBO's are available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:49 -04:00
Ilia Mirkin	87906cbc37	glsl: allow conservative depth qualifiers in GLSL 420 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:06:35 -04:00
Ilia Mirkin	d50ffb5e46	mesa: add always-false-for-now enables for GL 4.3, 4.4, 4.5. As the relevant extensions get implemented, the lines should be uncommented. I believe this is (almost) everything needed for those GL versions though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Ilia Mirkin	9abbc49712	glsl: add ARB_ES3_1_compatibility support Oddly a bunch of the features it adds are actually from ESSL 3.20. But the spec is quite clear, oh well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Ilia Mirkin	1708e24f65	mesa: add ES3_1_compatibility extension enable Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-03 18:01:15 -04:00
Jose Fonseca	a293f57e13	gallivm: Use llvm.fabs. Exactly the same code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:09 +01:00
Jose Fonseca	e4f01da15d	gallivm: Prefer backend agnostic intrinsic for rounding. We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm. lp_test_arit. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:07 +01:00
Jose Fonseca	324451e73f	gallivm: Add debug option to force SSE2. For simulating less capable machines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:08:57 +01:00
Jose Fonseca	5fa31a4aba	llvmpipe: Test abs. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	522ebe701d	llvmpipe: Build lp_test_arit on MSVC too. It builds fine now. Probably due to C99 support. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	b284f1f7f9	gallivm: Fix performance regressions due to vector selects. LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that. Thanks to Roland Scheidegger for spotting and analyzing the issue. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	11c4e5b45c	gallivm: Remove lp_build_load_volatile. No longer needed. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	bcfb86b09d	gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Timothy Arceri	6d54096fa6	mesa: remove unrequired else The if always returns so no need for an else. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-04-03 09:55:19 +10:00
Ilia Mirkin	d64134ecae	gm107/ir: add OP_SELP emission, used in DSQRT lowering The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 19:27:51 -04:00
Ilia Mirkin	3610b1466d	nv50/ir: we can't load local memory directly into an output This fixes piglit tests like tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test and related ones. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-02 18:10:20 -04:00
Christian Schmidbauer	2a529a8ac8	st/nine: specify WINAPI only for i386 and amd64 Currently mesa fails building with the x32 abi as ms_abi is not defined in such a case. The patch uses ms_abi only for amd64 targets and stdcall only for i386 targets to be sure that those are defined. This patch additionally checks for __GNUC__ to guarantee that __attribute__ is available. CC: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Schmidbauer <ch.schmidbauer@gmail.com> Acked-by: Axel Davy <axel.davy@ens.fr>	2016-04-02 23:30:40 +02:00
Samuel Pitoiset	0852c5703b	nv50/ir: fix envyas variants when building the code lib nvc0 and nve4 have been respectively replaced by gf100 and gk104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 20:00:57 +02:00
Brian Paul	36d8fed798	svga: remove unused svga_compile_key::texture_msaa field Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	b283c76342	svga: check TXF instruction's target to determine MSAA Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation. Fixes VMware bug 1632739. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	ef10b5427a	tgsi: add simple tgsi_is_msaa_target() helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Timothy Arceri	070e5a7405	glsl: rename var and simplify if is_ubo_var is true for both UBOs and SSBOs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0fbd073dc2	glsl: store ubo or ssbo index in block index Previously we stored the buffer block index, i.e. the index of a combined ubo/ssbo list. Fixes several dEQP-GLES31.functional tests: - program_interface_query.uniform.block_index.block_array - program_interface_query.uniform.block_index.named_block - program_interface_query.uniform.block_index.unnamed_block - program_interface_query.uniform.random.10 - program_interface_query.uniform.random.15 - program_interface_query.uniform.random.22 - program_interface_query.uniform.random.24 - program_interface_query.uniform.random.26 - program_interface_query.uniform.random.28 - program_interface_query.uniform.random.3 - program_interface_query.uniform.random.31 - program_interface_query.uniform.random.38 - program_interface_query.uniform.random.5 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94116 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	1265e1c4e1	glsl: store stage reference in gl_uniform_block This allows us to simplify the code and drop InterfaceBlockStageIndex which is a per stage array of integers the size of all blocks in the program combined including duplicates across stages. Adding a stage ref per block will use less memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	d8855d66f4	glsl: simplify buffer block resource limit checking This changes the code to use the buffer counts stored for each stage rather than counting from scratch. It also moves the checks outside of the for loop which means we now just get a single link error message if we go over the max rather than X error messages where X is the number we have exceeded the max by. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0082b33a78	glsl: simplify SSBO resources check We already have a count of active SSBOs per stage so use it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	3e74bf5b9d	glsl: split buffer block arrays earlier This will allow us to use them when checking resources in a following patch and clean up a bunch of code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Timothy Arceri	0163881528	glsl: only set buffer block binding once during initialisation Since `8683d54d2b` there is now a single instance of the buffer block information that needs to be updated rather than one instance for each stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-02 17:10:56 +11:00
Kenneth Graunke	94ed482c19	glsl: Fix program interface query locations biasing for SSO. With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to the first and last shader stage linked into a program. This may not be the vertex and fragment shader stages. So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus. We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs, FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases. Note that built-in variables get a location of -1. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var_explicit_location - program_input.location.separable_fragment.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location - program_output.location.separable_vertex.var_array_explicit_location Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	c123294dfe	glsl: Return -1 for program interface query locations in many cases. We were recording locations for all variables, even ones without an explicit location set. Implement the rules from the spec, and record -1 in the resource list accordingly. Make program_resource_location stop doing math on negative values. Remove hacks that are no longer necessary now that we've stopped doing that. Fixes 4 dEQP-GLES31.functional.program_interface_query tests: - program_input.location.separable_fragment.var - program_input.location.separable_fragment.var_array - program_output.location.separable_vertex.var_array - program_output.location.separable_vertex.var_array v2: Delete more code Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	9fe211bec4	glsl: Consolidate gl_VertexIDMESA -> gl_VertexID query hacks. A program will either have gl_VertexID or gl_VertexIDMESA (the lowered zero-based version), not both. Just spoof it in the resource list so the hacks are done in a single place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	013f25c3b3	glsl: Clean up some leftover cruft. stages is always 1 << stage now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:20 -07:00
Kenneth Graunke	98c22c0403	glsl: Add all system variables to the input resource list. System values are just built-in input variables that we've opted to special-case out of convenience. We need to consider all inputs, regardless of how we've classified them. Unfortunately, there's one exception: we shouldn't add gl_BaseVertex unless ARB_shader_draw_parameters is enabled, because it doesn't actually exist in the language, and shouldn't be counted in the GL_ACTIVE_RESOURCES query. Fixes dEQP-GLES31.functional.program_interface_query.program_input. resource_list.compute.empty, which expects gl_NumWorkGroups to appear in the resource list. v2: Delete more code Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 22:05:18 -07:00
Kenneth Graunke	6e8b9d5bdd	glsl: Delete hack for VS system values. This makes no sense. If the stage being considered is the vertex shader, then we'll add inputs and system values appropriately. If we're not considering the vertex shader, then we absolutely should not do anything with it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	47daf17da0	glsl: Make add_interface_variables only consider the appropriate stage. add_interface_variables() is supposed to add variables for the inputs of the first shader stage linked into a program, and the outputs of the last shader stage linked into a program. From the ARB_program_interface_query specification: "* PROGRAM_INPUT corresponds to the set of active input variables used by the first shader stage of <program>. If <program> includes multiple shader stages, input variables from any shader stage other than the first will not be enumerated. * PROGRAM_OUTPUT corresponds to the set of active output variables (section 2.14.11) used by the last shader stage of <program>. If <program> includes multiple shader stages, output variables from any shader stage other than the last will not be enumerated." Previously, we used build_stageref here, which walks over all linked shaders in the program. This meant that internal varyings would be visible. We don't actually need any of build_stageref's code: we already explicitly skip packed varyings, handle modes, and the name comparisons just do a fuzzy string comparison of name with itself. Fixes two tests: dEQP-GLES31.functional.program_interface_query. program_{input,output}.referenced_by.referenced_by_vertex_fragment. These tests have a VS and FS linked together into a single program. Both stages have an input called "shaderInput". But the FS input should not be visible because it isn't the first stage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	998ef1ad71	glsl: Clarify "mask" variable in add_interface_variables(). This is a bitfield of which stages refer to a variable. It is not used to mask off bits. In fact, it's used to contribute additional bits. Rename it and tidy a bit of the logic. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	356c99b4e7	glsl: Pass stage to add_interface_variables(). add_interface_variables is supposed to add variables from either the first or last stage of a linked shader. But it has no way of knowing the stage it's being asked to process, which makes it impossible to produce correct stagerefs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-04-01 21:58:25 -07:00
Kenneth Graunke	2c5afe1fa9	glsl: Make vertex ID lowering declare gl_BaseVertex as hidden. If the GL_ARB_shader_draw_parameters extension is enabled, we'll already have a gl_BaseVertex variable. It will have var->how_declared set to ir_var_declared_implicitly, and will appear in the program resource list. If not, we make one for internal use. We don't want it to be listed in the program resource list, as the application won't be expecting it. Marking it hidden will properly exclude it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 21:58:22 -07:00
Kenneth Graunke	33df1c2935	glsl: Exclude ir_var_hidden variables from the program resource list. We occasionally generate variables internally that we want to exclude from the program resource list, as applications won't be expecting them to be present. The next patch will make use of this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 21:56:43 -07:00
Kenneth Graunke	15cd3ebede	mesa: Make _mesa_choose_tex_format() handle stencil textures. This is necessary for ARB_texture_stencil8 support on classic drivers. Presumably Gallium works because it implements its own ChooseTexFormat. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-01 19:04:28 -07:00
Jordan Justen	ef1b397b07	glsl: Don't require matching centroid qualifiers Note: This patch appears to violate older OpenGL and OpenGLES specs. The OpenGLES GLSL 3.1 and OpenGL GLSL 4.3 specifications both remove the requirement for the output and input centroid qualifiers to match. The deqp dEQP-GLES3.functional.shaders.linkage.varying.rules.differing_interpolation_2 test wants the newer OpenGLES 3.1 specification behavior, even for OpenGLES 3.0. This patch simply removes the checking in all cases. The OpenGLES 3.0 conformance test suite doesn't appear to require the older ("must match") spec behavior. For reference, here are the relevant spec citations: The OpenGL 4.2 spec says: "the last active shader stage output variables and fragment shader input variables of the same name must match in type and qualification (other than out matching to in)" The OpenGL 4.3 spec says: "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ." The OpenGLES GLSL 3.00.4 specification says: "The output of the vertex shader and the input of the fragment shader form an interface. For this interface, vertex shader output variables and fragment shader input variables of the same name must match in type and qualification (other than precision and out matching to in)." The OpenGLES GLSL 3.10 Specification says: "interpolation qualification (e.g., flat) and auxiliary qualification (e.g. centroid) may differ" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743 Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7819 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-01 18:06:19 -07:00
Bas Nieuwenhuizen	1a5c8c24b5	gallium: distinguish between shader IR in get_compute_param For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:13 +02:00
Bas Nieuwenhuizen	be5899dcf9	gallium: add global buffer memory barrier bit Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:06 +02:00
Bas Nieuwenhuizen	01f993a21f	gallium: add threads per block TGSI property The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:50:59 +02:00
Bas Nieuwenhuizen	ea8f4a6b13	gallium: add compute shader IR type Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:49:57 +02:00
Timothy Arceri	5ea825f556	glsl: remove tabs and fix some other style issues in glcpp-parse.y Note there are still tabs left in the parser rules. Acked-by: Dave Airlie <airlied@redhat.com>	2016-04-02 10:32:01 +11:00
Jason Ekstrand	cc1320220f	nir/gather_info: Add an assert for supported stages	2016-04-01 15:44:43 -07:00
Jason Ekstrand	ebb0bcc11d	nir: Move variable_get_io_mask back into gather_info It used to be in nir_gather_info.c until I moved it out to nir.h so it could be re-used with some linking code that never got merged. We'll move it back out if and when we have real code to share it with.	2016-04-01 15:39:48 -07:00
Jason Ekstrand	95106f6bfb	Merge remote-tracking branch 'public/master' into vulkan	2016-04-01 15:16:21 -07:00
Jason Ekstrand	14c46954c9	i965: Add an implementation of nir_op_fquantize2f16 Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-01 13:52:56 -07:00
Jason Ekstrand	de60e250f5	nir: Add an opcode for stomping a 32-bit value to 16-bit precision This correlates directly to the SPIR-V opcode OpQuantizeToF16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-04-01 13:52:28 -07:00
Samuel Pitoiset	60e1c6a7fc	nvc0: enable compute shaders on GK104 and GM107+ Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevents using compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	71f327aa21	nvc0: bump the maximum number of UBOs for compute on Kepler The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	839a469166	nvc0/ir: do not lower shared+atomics on GM107+ For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	543fb95473	nvc0/ir: add atomics support on shared memory for Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	275019d7db	nvc0/ir: fix wrong pred emission for ld lock on GK104 This fixes `84b9b8f` (nvc0/ir: add missing emission of locked load predicate). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	4f58b78c30	nvc0/ir: add support for compute UBOs on Kepler Make sure to avoid out of bounds access in presence of indirect array indexing by loading the size from the driver constant buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	3b246a71d7	nvc0: add indirect compute support on Kepler The grid size is stored as three 32-bit integers in the indirect buffer but the launch descriptor uses a single 32-bit integer for both griddim_y and griddim_z like this: (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
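A minimal sketch of the (z << 16) | y layout described in the entry above. The helper names are hypothetical and this is not the nvc0 driver code; it only assumes that each dimension fits in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the nvc0 driver): pack grid dimensions
 * y and z into one 32-bit word laid out as (z << 16) | y. */
static uint32_t pack_griddim_yz(uint32_t y, uint32_t z)
{
    /* Both values must fit in 16 bits for the packing to be lossless. */
    assert(y <= 0xffffu && z <= 0xffffu);
    return (z << 16) | y;
}

/* Recover the low half (y) and the high half (z) of the packed word. */
static uint32_t unpack_griddim_y(uint32_t packed) { return packed & 0xffffu; }
static uint32_t unpack_griddim_z(uint32_t packed) { return packed >> 16; }

/* Usage example: y = 3, z = 2 packs to 0x00020003. */
static void example_griddim(void)
{
    uint32_t packed = pack_griddim_yz(3, 2);
    assert(packed == 0x00020003u);
    assert(unpack_griddim_y(packed) == 3);
    assert(unpack_griddim_z(packed) == 2);
}
```

With this layout, writing z into the upper 16 bits leaves the lower 16 bits of y untouched, which matches the overwrite described in the commit message.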
Samuel Pitoiset	7797d5f7d9	nvc0: reduce likelihood of collision for real buffers on Kepler Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	e2e8085fac	nvc0: store ubo info to the driver constbuf on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	12aa047c98	nvc0: bind user uniforms for compute on Kepler Uniform buffer objects will be stuck to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	1828d90a00	nvc0: bind shader buffers for compute on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	debd910512	nvc0: bind driver cb for compute on c7[] for Kepler Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Jose Fonseca	f72de6f386	gallivm: Prevent disassembly debug output from being truncated. By using os_log_message directly, as _debug_vprintf truncates messages to 4K. Also cleanup the disassemble interface. Spotted by Roland. Trivial.	2016-04-01 21:22:42 +01:00
Rob Clark	972054f5bf	compiler: random comment fixup Just noticed this in passing.. gl_shader_stage already has tess so this comment no longer applies. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-01 12:34:40 -04:00
Brian Paul	58557b345c	docs: minor updates to license.html file Mesa demos are no longer part of the main Mesa tree/tarball. Add Gallium and GLX code to list of major components. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-01 09:50:08 -06:00
Mauro Rossi	e09d04cd56	radeonsi: use util_strchrnul() to fix android build error Android Bionic does not support strchrnul() string function, gallium auxiliary util/u_string.h provides util_strchrnul() This change avoids the following building error: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error: undefined reference to 'strchrnul' collect2: error: ld returned 1 exit status Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:56:57 +01:00
Rob Herring	952720ccee	egl: android: enable EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID Set EGL_FRAMEBUFFER_TARGET_ANDROID and EGL_RECORDABLE_ANDROID config attributes to true for Android. These are required in Marshmallow. The implementation of EGL_RECORDABLE_ANDROID support has 2 options in the definition of the extension. Android implements the 2nd option which is the encoder must support RGB input. The requested input format is RGB888, so setting the attribute on all the native Android visual formats should be sufficient. Similarly, setting EGL_FRAMEBUFFER_TARGET_ANDROID for all configs with a EGL_NATIVE_VISUAL_ID should be sufficient. Most likely, the HWC should support the same set of formats the underlying DRM driver supports. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:45:13 +01:00
Rob Herring	e21e81aa18	egl: Add EGL_RECORDABLE_ANDROID attribute This is used by Android to select an eglconfig compatible with screen recording. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: add the _eglIsConfigAttribValid check] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:45:08 +01:00
Rob Herring	8975527f58	egl: Add EGL_FRAMEBUFFER_TARGET_ANDROID attribute This is used by Android to select an eglconfig compatible with HWComposer. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Herring <robh@kernel.org> [Emil Velikov: add the _eglIsConfigAttribValid check] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:44:25 +01:00
Rob Herring	2d9e0f24e1	Android: fix x86 gallium builds Builds with gallium enabled fail on x86 with linker error: external/mesa3d/src/mesa/vbo/vbo_exec_array.c:127: error: undefined reference to '_mesa_uint_array_min_max' The problem is sse_minmax.c is not included in the libmesa_st_mesa library. Since the SSE4.1 files are needed for both libmesa_st_mesa and libmesa_dricore, move SSE4.1 files into a separate static library that can be used by both. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:44:22 +01:00
Jose Fonseca	cdf7c6b83d	gallivm: Use vector selects on LLVM 3.3+. This is an old patch I had around. Vector selects seem to work well from LLVM 3.3. Using them should improve code quality, as it might make constant propagation pass more effective. Tested lp_test_* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-01 09:05:19 +01:00
Alejandro Piñeiro	cd7d631c71	glsl: do not raise uninitialized variable warnings on builtins/reserved GL variables Needed because not all the built-in variables are marked as system values, so they still have the mode ir_var_auto. Right now it fixes raising the warning when gl_GlobalInvocationID and gl_LocalInvocationIndex are used. v2: use is_gl_identifier instead of filtering for some names (Ilia Mirkin) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-01 09:54:09 +02:00
Ilia Mirkin	df03be196a	nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supported vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-31 21:53:11 -04:00
Ilia Mirkin	e0e1683087	mesa: add GL_OES/EXT_draw_buffers_indexed support This is the same ext as ARB_draw_buffers_blend (plus some core functionality that already exists). Add the alias entrypoints. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 21:12:49 -04:00
Kenneth Graunke	a57320a9ba	i965: Use brw->urb.min_vs_urb_entries instead of 32 for BLORP. Haswell GT2 and GT3 have a minimum of 64 entries. Hardcoding 32 is not legal. v2: Delete stale comment (caught by Alejandro). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-31 16:45:07 -07:00
Kenneth Graunke	58d4751fa0	i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6. According to the Sandybridge PRM's description of the resinfo message, the .z value returned will be Depth == 0 ? 0 : Depth + 1. The earlier PRMs have the same table. This means we return 0 for array textures with a single slice, when we ought to return 1. Just override it to max(depth, 1). Fixes 10 dEQP-GLES3.functional tests on Sandybridge: shaders.texture_functions.texturesize.sampler2darray_fixed_vertex shaders.texture_functions.texturesize.sampler2darray_fixed_fragment shaders.texture_functions.texturesize.sampler2darray_float_vertex shaders.texture_functions.texturesize.sampler2darray_float_fragment shaders.texture_functions.texturesize.isampler2darray_vertex shaders.texture_functions.texturesize.isampler2darray_fragment shaders.texture_functions.texturesize.usampler2darray_vertex shaders.texture_functions.texturesize.usampler2darray_fragment shaders.texture_functions.texturesize.sampler2darrayshadow_vertex shaders.texture_functions.texturesize.sampler2darrayshadow_fragment Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-31 15:23:49 -07:00
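The fix in the entry above can be modelled in a few lines. This is a sketch only: the function names are hypothetical, and resinfo_z simply encodes the formula quoted from the PRM rather than any real hardware interface.

```c
#include <assert.h>
#include <stdint.h>

/* Models the .z value of the resinfo message as quoted in the commit:
 * Depth == 0 ? 0 : Depth + 1. */
static uint32_t resinfo_z(uint32_t depth)
{
    return depth == 0 ? 0 : depth + 1;
}

/* The override described in the commit: clamp the result to at least 1. */
static uint32_t texture_size_depth(uint32_t z)
{
    return z > 1 ? z : 1;
}

/* Usage example: a Depth of 0 would report 0 without the clamp. */
static void example_texture_size(void)
{
    assert(resinfo_z(0) == 0);
    assert(texture_size_depth(resinfo_z(0)) == 1);
    assert(texture_size_depth(resinfo_z(1)) == 2);
}
```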
Ian Romanick	08ff5f4d1f	nir: Simplify a bcsel to logical-or Oddly, this did not affect the shader where I first noticed the pattern. That particular shader doesn't get its if-statement converted to a bcsel because there are two assignments in the else-statement. This led to me submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747. shader-db results: Sandy Bridge total instructions in shared programs: 8467384 -> 8467069 (-0.00%) instructions in affected programs: 36594 -> 36279 (-0.86%) helped: 46 HURT: 0 total cycles in shared programs: 117573448 -> 117568518 (-0.00%) cycles in affected programs: 339114 -> 334184 (-1.45%) helped: 46 HURT: 0 Ivy Bridge / Haswell / Broadwell / Skylake: total instructions in shared programs: 7774258 -> 7773999 (-0.00%) instructions in affected programs: 30874 -> 30615 (-0.84%) helped: 46 HURT: 0 total cycles in shared programs: 65739190 -> 65734530 (-0.01%) cycles in affected programs: 180380 -> 175720 (-2.58%) helped: 45 HURT: 1 No change on G45 or Ironlake. I also tried these expressions, but none of them affected any shaders in shader-db: (('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)), (('bcsel', a, 'b@bool', False), ('iand', a, b)), (('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)), Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
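The three additional rewrites listed at the end of the entry above are plain boolean identities. The sketch below checks them exhaustively; bcsel here is a hypothetical stand-in for the NIR opcode acting on C bools, not NIR code.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for NIR's bcsel on booleans: pick if_true when cond is set. */
static bool bcsel(bool cond, bool if_true, bool if_false)
{
    return cond ? if_true : if_false;
}

/* Check the three rewrites from the commit message for every input:
 *   bcsel(a, a, b)     == a || b   (ior)
 *   bcsel(a, b, false) == a && b   (iand)
 *   bcsel(a, b, a)     == a && b   (iand)
 */
static void check_bcsel_rewrites(void)
{
    for (int i = 0; i < 4; i++) {
        bool a = (i & 1) != 0;
        bool b = (i & 2) != 0;
        assert(bcsel(a, a, b) == (a || b));
        assert(bcsel(a, b, false) == (a && b));
        assert(bcsel(a, b, a) == (a && b));
    }
}
```

Because both operands are booleans, four input combinations cover every case, so the loop is a complete proof of the identities rather than a spot check.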
Ian Romanick	cdea12bf03	ptn: Fix all users of ptn_swizzle None of the callers actually wanted what it did. In ptn_xpd, you only ever want a vec3 swizzle. In ptn_tex, you want a swizzle that matches the number of required texture coordinates. shader-db results: G45: total instructions in shared programs: 4011240 -> 4010911 (-0.01%) instructions in affected programs: 59232 -> 58903 (-0.56%) helped: 114 HURT: 0 total cycles in shared programs: 84314194 -> 84313220 (-0.00%) cycles in affected programs: 779150 -> 778176 (-0.13%) helped: 110 HURT: 13 Ironlake: total instructions in shared programs: 6397262 -> 6396605 (-0.01%) instructions in affected programs: 117402 -> 116745 (-0.56%) helped: 227 HURT: 0 total cycles in shared programs: 128889798 -> 128888524 (-0.00%) cycles in affected programs: 1214644 -> 1213370 (-0.10%) helped: 179 HURT: 44 Sandy Bridge: total instructions in shared programs: 8467391 -> 8467384 (-0.00%) instructions in affected programs: 3107 -> 3100 (-0.23%) helped: 10 HURT: 6 total cycles in shared programs: 117580120 -> 117573448 (-0.01%) cycles in affected programs: 103158 -> 96486 (-6.47%) helped: 84 HURT: 11 Ivy Bridge: total instructions in shared programs: 7774255 -> 7774258 (0.00%) instructions in affected programs: 1677 -> 1680 (0.18%) helped: 8 HURT: 6 total cycles in shared programs: 65743828 -> 65739190 (-0.01%) cycles in affected programs: 89312 -> 84674 (-5.19%) helped: 78 HURT: 23 Haswell: total instructions in shared programs: 7107172 -> 7107150 (-0.00%) instructions in affected programs: 2048 -> 2026 (-1.07%) helped: 16 HURT: 0 total cycles in shared programs: 64653636 -> 64647486 (-0.01%) cycles in affected programs: 86836 -> 80686 (-7.08%) helped: 85 HURT: 17 Broadwell and Skylake: total instructions in shared programs: 8447529 -> 8447507 (-0.00%) instructions in affected programs: 2038 -> 2016 (-1.08%) helped: 16 HURT: 0 total cycles in shared programs: 66418670 -> 66413416 (-0.01%) cycles in affected programs: 90110 -> 84856 (-5.83%) helped: 83 HURT: 20 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
Ian Romanick	8bb9c6ff7f	ptn: Silence unused parameter warning The KIL instruction doesn't have a destination, so ptn_kil never uses dest. program/prog_to_nir.c: In function ‘ptn_kil’: program/prog_to_nir.c:547:38: warning: unused parameter ‘dest’ [-Wunused-parameter] ptn_kil(nir_builder b, nir_alu_dest dest, nir_ssa_def *src) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-31 14:59:36 -07:00
Samuel Pitoiset	d22eca5f90	tgsi: silence compiler warning in fetch_sampler_unit() The unit variable can be used uninitialized. Fixes: `24e77cb09` ("tgsi: handle indirect sampler arrays. (v2)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:16:24 +10:00
Samuel Pitoiset	05902a6686	tgsi: fix out of bounds access in exec_atomop() The number of channels must be 4 for all RGBA components. Fixes: `22d129601` ("tgsi: add support for image operations to tgsi_exec. (v2.1)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:15:16 +10:00
Brian Paul	9076e04934	tgsi: split tgsi_util_get_texture_coord_dim() function into two It was kind of overloaded, returning two different things. Now get the index of the shadow reference src register with a new tgsi_util_get_shadow_ref_src_index() function. To verify the new code, I added some temp/debug code which looped over all TGSI_TEXTURE_x values, calling the old function and new and checking that the returned indexes matched. Also tested piglit "shadow" tests with softpipe/llvmpipe. No testing of ilo and radeonsi changes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:48:00 -06:00
Brian Paul	9d7cd43988	tgsi: skip texture query opcodes when examining texture targets Should fix the assertion in piglit spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the TXQ instruction specifies a 2D target but the sampler view was declared as SHADOW2D. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-31 09:47:40 -06:00
Pierre Moreau	f96a403bc3	nv50/ir: Check for valid insn instead of def size This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Functions arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-31 10:30:29 -04:00
Ilia Mirkin	a94d8d51d7	mesa: add GL_EXT_copy_image support The extension is identical to GL_OES_copy_image. But dEQP has tests that want the EXT variant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	ebdb534548	mesa: add GL_OES_copy_image support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	571f538a62	mesa: remove duplicate MAX_GEOMETRY_SHADER_INVOCATIONS entry Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	2c7f5fe296	st/mesa: add ES sample-shading support We require the full ARB_gpu_shader5 for now, but in the future some other CAP could get exposed to indicate that only the multisample-related behavior of ARB_gpu_shader5 is available. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	3002296cb6	mesa: add GL_OES_shader_multisample_interpolation support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	411a88accc	mesa: add GL_OES_sample_shading support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	5283e81015	glsl: add GL_OES_sample_variables support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	6a8ca859f9	mesa: add OES_sample_variables to extension table, add enable bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Ilia Mirkin	903640c2ac	glsl: add gl_MaxSamples, new in GL 4.5 / GL ES 3.2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-30 22:57:17 -04:00
Matt Turner	4fea98991c	i965: Don't add barrier deps for FB write messages. Ken did this earlier, and this is just me reimplementing his patch a little differently. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	3495265158	i965: Add and use is_scheduling_barrier() function.	2016-03-30 19:54:30 -07:00
Matt Turner	b4e223cfbf	i965: Remove NOP insertion kludge in scheduler. Instead of removing every instruction in add_insts_from_block(), just move the instruction to its scheduled location. This is a step towards doing both bottom-up and top-down scheduling without conflicts. Note that this patch changes cycle counts for programs because it begins including control flow instructions in the estimates. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	a607f4aa57	i965: Assert that an instruction is not inserted around itself. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	7b208a7312	i965: Relax restriction on scheduling last instruction. I think when this code was written, basic blocks were always ended by a control flow instruction or an end-of-thread message. That's no longer the case, and removing this restriction actually helps things: instructions in affected programs: 7267 -> 7244 (-0.32%) helped: 4 total cycles in shared programs: 66559580 -> 66431900 (-0.19%) cycles in affected programs: 28310152 -> 28182472 (-0.45%) helped: 9577 HURT: 879 GAINED: 2 The addition of the is_control_flow() checks is not a functional change, since the add_insts_from_block() does not put them in the list of instructions to schedule. I plan to change this in a later patch. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	f60750968c	i965/vec4/tcs: Set conditional mod on TCS_OPCODE_SRC0_010_IS_ZERO. Missing this causes an assertion failure in the scheduler with the next patch. Additionally, this gives cmod propagation enough information to optimize code better. total instructions in shared programs: 7112991 -> 7112852 (-0.00%) instructions in affected programs: 25704 -> 25565 (-0.54%) helped: 139 total cycles in shared programs: 64812898 -> 64810674 (-0.00%) cycles in affected programs: 127224 -> 125000 (-1.75%) helped: 139 Acked-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	436bdd7403	Revert "i965: Don't add barrier deps for FB write messages." This reverts commit `d0e1d6b7e2`. The change in the vec4 code is a mistake -- there's never an FS_OPCODE_FB_WRITE in vec4 code. The change in the fs code had the (harmless) effect of not recognizing an FB_WRITE as a scheduling barrier even if it was marked EOT -- harmless because the scheduler marked the last instruction of a block as a barrier, something I'm changing in the following patches. This will be reimplemented later in the series.	2016-03-30 19:54:30 -07:00
Matt Turner	0d253ce34a	i965: Simplify full scheduling-barrier conditions. All of these were simply code for "architecture register file" (and in the case of destinations, "not the null register"). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:30 -07:00
Matt Turner	65bc94022b	i965: Remove incorrect cycle estimates. These printed the cycle count of the last basic block (sched.time is set per basic block!). We have accurate, full program, data printed elsewhere. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-30 19:54:29 -07:00
Dave Airlie	10b189f985	st/mesa: fix fallout from xfb changes. Failed to update state tracker with new buffer interface. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:36:55 +10:00
Matt Turner	05ee6627d6	nir: Fix typo from commit `6702f1acde`.	2016-03-30 19:18:35 -07:00
Timothy Arceri	b273958c74	docs: mark xfb_* qualifiers as DONE Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:53:08 +11:00
Timothy Arceri	c5704bb350	mesa: add query support for GL_TRANSFORM_FEEDBACK_BUFFER interface Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:53:02 +11:00
Timothy Arceri	7234be0338	glsl: add transform feedback buffers to resource list Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:57 +11:00
Timothy Arceri	9e317271d7	mesa: add support to query GL_TRANSFORM_FEEDBACK_BUFFER_INDEX Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:47 +11:00
Timothy Arceri	51142e7705	mesa: add support to query GL_OFFSET for GL_TRANSFORM_FEEDBACK_VARYING Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:43 +11:00
Timothy Arceri	047139e8a0	mesa: rename transform feedback varying macro XFB to XFV A later patch will use XFB for buffers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:39 +11:00
Timothy Arceri	b77c909878	glsl: always enable transform feedback mode when xfb_stride defined This enables in shader defined transform feedback mode even if the only place xfb_stride is defined is on the global out. We don't worry about xfb_buffer since Issue 22 c) in the spec says: "If the shader has an "xfb_buffer" qualifier identifying a buffer, but doesn't declare "xfb_offset" on anything associated with it, what happens? ... variables not qualified with "xfb_offset" are not captured, which makes the associated "xfb_buffer" qualifier irrelevant." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:34 +11:00
Timothy Arceri	c95e92b14d	glsl: handle varyings that are not written to but have an xfb_offset Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:29 +11:00
Timothy Arceri	d5c09d40b9	glsl: when lowering named interface set assigned flag This will be used when checking if xfb should attempt to capture a varying. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:22 +11:00
Timothy Arceri	a2fbc5ed44	glsl: reset current stream tracker When we move to the next buffer we need to reset the stream so that we don't generate an error message about streams not matching. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:17 +11:00
Timothy Arceri	f2a3c87a00	glsl: generate link error when implicit stride is too large This moves the check until after we have done the stride calculation and applies it to the xfb_* qualifiers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:11 +11:00
Timothy Arceri	2fab85aaea	glsl: add xfb_stride link time validation From the ARB_enhanced_layouts spec: "It is a compile-time or link-time error to have any xfb_offset that overflows xfb_stride, whether stated on declarations before or after the xfb_stride, or in different compilation units. ... When no xfb_stride is specified for a buffer, the stride of a buffer will be the smallest needed to hold the variable placed at the highest offset, including any required padding." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:05 +11:00
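The two rules quoted from the spec in the entry above can be sketched as follows. The struct and function names are hypothetical and are not Mesa types; as a simplifying assumption, the implicit stride here ignores the "required padding" the spec mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one captured varying; not a Mesa type. */
struct xfb_var {
    unsigned offset; /* xfb_offset in bytes */
    unsigned size;   /* size of the captured variable in bytes */
};

/* True when the variable extends past an explicit xfb_stride. */
static bool xfb_offset_overflows_stride(struct xfb_var v, unsigned stride)
{
    return v.offset + v.size > stride;
}

/* Smallest stride that holds every variable. Padding is ignored here,
 * which is an assumption of this sketch. */
static unsigned xfb_implicit_stride(const struct xfb_var *vars, size_t count)
{
    unsigned stride = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned end = vars[i].offset + vars[i].size;
        if (end > stride)
            stride = end;
    }
    return stride;
}

/* Usage example: two vec4-sized outputs at offsets 0 and 32. */
static void example_xfb_stride(void)
{
    const struct xfb_var vars[] = { { 0, 16 }, { 32, 16 } };
    assert(xfb_implicit_stride(vars, 2) == 48);
    assert(!xfb_offset_overflows_stride(vars[1], 64));
    assert(xfb_offset_overflows_stride(vars[1], 40));
}
```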
Timothy Arceri	8120e869b1	glsl: validate global out xfb_stride qualifiers and set stride on empty buffers Here we use the built-in validation in ast_layout_expression::process_qualifier_constant() to check for mismatching global out strides on buffers in a single shader. From the ARB_enhanced_layouts spec: "While xfb_stride can be declared multiple times for the same buffer, it is a compile-time or link-time error to have different values specified for the stride for the same buffer." For intrastage validation a new helper link_xfb_stride_layout_qualifiers() is created. We also take this opportunity to make sure stride is at least a multiple of 4, we will validate doubles at a later stage. From the ARB_enhanced_layouts spec: "If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results." Finally we update store_tfeedback_info() to apply the strides to LinkedTransformFeedback and update the buffers bitmask to mark any global buffers with a stride as active. For example a shader with: layout (xfb_buffer = 0, xfb_offset = 0) out vec4 gs_fs; layout (xfb_buffer = 1, xfb_stride = 64) out; Is expected to have a buffer bound to both 0 and 1. From the ARB_enhanced_layouts spec: "A binding point requires a bound buffer object if and only if its associated stride in the program object used for transform feedback primitive capture is non-zero." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:52:00 +11:00
Timothy Arceri	cf039a309a	mesa: split transform feedback buffer into its own struct This will be used in a following patch to implement interface query support for TRANSFORM_FEEDBACK_BUFFER. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:52 +11:00
Timothy Arceri	258299d87a	glsl: use bitmask of active xfb buffer indices This allows us to print the correct binding point when not all buffers declared in the shader are bound. For example if we use a single buffer: layout(xfb_buffer=2, offset=0) out vec4 v; We now print '2' when the buffer is not bound rather than '0'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:47 +11:00
Timothy Arceri	99cb5151ed	glsl: sort xfb varyings in offset/buffer order The existing transform feedback code expects to receive the list of varyings in increasing buffer order. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:38 +11:00
Timothy Arceri	0c66460fc6	glsl: basic linking support for xfb qualifiers This adds the initial infrastructure for enabling transform feedback mode via in shader qualifiers and adds initial buffer support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:33 +11:00
Timothy Arceri	4305a60173	glsl: add xfb helpers and fields to the tfeedback_decl class We also apply any array/struct offsets. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:27 +11:00
Timothy Arceri	0822517936	glsl: add helper to process xfb qualifiers during linking This function checks for any xfb_* qualifiers which will enable transform feedback mode and cause any API defined xfb varyings to be ignored. It also counts the number of varyings that have a xfb_offset qualifier and finally it calls the create_xfb_varying_names() helper to generate the names of varyings to be captured. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:21 +11:00
Timothy Arceri	707fd3972f	glsl: add helper to generate xfb varying names Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:17 +11:00
Timothy Arceri	8b6f8fe503	glsl: add helper for counting varyings This will be used to get a count of the number of varying name strings we are required to generate for use with the query api. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:06 +11:00
Timothy Arceri	ba7a7d4c39	glsl: add xfb qualifier lowering support for named blocks Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:51:01 +11:00
Timothy Arceri	4a873ef049	glsl: add xfb qualifiers to has_layout helper Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:54 +11:00
Timothy Arceri	598790e856	glsl: apply xfb_stride to implicit offsets for ifc block members When we have an interface block like: layout (xfb_buffer = 0, xfb_offset = 0) out Block { vec4 var1; layout (xfb_stride = 32) vec4 var2; vec4 var3; }; We take into account the stride of var2 when calculating the offset for var3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:49 +11:00
Timothy Arceri	04a72e6e57	glsl: add xfb_stride compile time rules From the ARB_enhanced_layouts spec: "The xfb_stride qualifier specifies how many bytes are consumed by each captured vertex. It applies to the transform feedback buffer for that declaration, whether it is inherited or explicitly declared. It can be applied to variables, blocks, block members, or just the qualifier out. If the buffer is capturing any double-typed outputs, the stride must be a multiple of 8, otherwise it must be a multiple of 4, or a compile-time or link-time error results. ... The resulting stride (implicit or explicit) must be less than or equal to the implementation-dependent constant gl_MaxTransformFeedbackInterleavedComponents." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:44 +11:00
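The alignment rule quoted in the entry above (a multiple of 8 when doubles are captured, otherwise a multiple of 4) reduces to one modulo check. The function name is hypothetical and this is not the GLSL compiler's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when stride_bytes satisfies the quoted alignment rule:
 * a multiple of 8 if any double-typed output is captured, else of 4. */
static bool xfb_stride_alignment_ok(unsigned stride_bytes, bool captures_doubles)
{
    unsigned required = captures_doubles ? 8u : 4u;
    return stride_bytes % required == 0;
}

/* Usage example: 12 bytes is fine for floats but not for doubles. */
static void example_stride_alignment(void)
{
    assert(xfb_stride_alignment_ok(12, false));
    assert(!xfb_stride_alignment_ok(12, true));
    assert(xfb_stride_alignment_ok(64, true));
}
```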
Timothy Arceri	edddad0eee	glsl: add xfb_offset compile time rules We also copy the qualifier values to the IR in this step. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:39 +11:00
Timothy Arceri	f6a8c7ef21	glsl: add xfb_buffer compile time rules Also copies the qualifier values to GLSL IR. From the ARB_enhanced_layouts spec: "The xfb_buffer qualifier can be applied to the qualifier out, to output variables, to output blocks, and to output block members. Shaders in the transform feedback capturing mode have an initial global default of layout(xfb_buffer = 0) out; This default can be changed by declaring a different buffer with xfb_buffer on the interface qualifier out. This is the only way the global default can be changed. When a variable or output block is declared without an xfb_buffer qualifier, it inherits the global default buffer. When a variable or output block is declared with an xfb_buffer qualifier, it has that declared buffer. All members of a block inherit the block's buffer. A member is allowed to declare an xfb_buffer, but it must match the buffer inherited from its block, or a compile-time error results. The xfb_buffer qualifier follows the same conventions, behavior, defaults, and inheritance rules as the qualifier stream, and the examples for stream apply here as well. This includes a block's inheritance of the current global default buffer, a block member's inheritance of the block's buffer, and the requirement that any xfb_buffer declared on a block member must match the buffer inherited from the block. ... It is a compile-time error to specify an xfb_buffer that is greater than the implementation-dependent constant gl_MaxTransformFeedbackBuffers." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:34 +11:00
Timothy Arceri	04d2f770c8	glsl: add field to track if xfb_buffer is an explicit or implicit value Since any of the xfb_* qualifiers trigger the shader to be in transform feedback mode we need an extra field to track if the xfb_buffer on interface members was set explicitly since xfb_buffer will always have a default value. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:29 +11:00
Timothy Arceri	733f1b2a55	glsl: add xfb_* qualifiers to glsl_struct_field These will be used to hold qualifier values for interface and struct members. Support is added to the struct/interface constructors to copy these fields upon creation. We also update record_compare() to ensure we don't reuse a glsl_type with the wrong xfb_* qualifier values. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:19 +11:00
Timothy Arceri	2dbcecb7a9	glsl: add IR fields for transform feedback layout qualifiers Adds xfb_buffer/stride fields and adds comment to offset field which is reused for xfb_offset. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:13 +11:00
Timothy Arceri	5c2516fc33	glsl: add validation for out layout qualifiers This adds validation for all qualifiers as allowed by the table in Section 4.4 (Layout Qualifiers) of the GLSL 4.5 spec. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:08 +11:00
Timothy Arceri	7b407fecec	glsl: relax stage restrictions on layout defaults for outputs The new xfb_buffer and xfb_stride global qualifiers are allowed in geom, tess and vertex stages. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:04 +11:00
Timothy Arceri	c9afd94af6	glsl: parse new transform feedback layout qualifiers We reuse the existing offset field for holding the xfb_offset expression but create a new flag so as to avoid hitting the rules for the offset qualifier for UBOs. xfb_buffer qualifiers require extra processing when merging as they can be applied to global out defaults. We just apply the same rules as we do for the stream qualifier as the spec says: "The xfb_buffer qualifier follows the same conventions, behavior, defaults, and inheritance rules as the qualifier stream, and the examples for stream apply here as well." For xfb_stride we push everything into a global out field for later processing as xfb_stride applies to the entire buffer. We still need to have a separate field to store per variable strides because they can still affect implicit offsets e.g. when applied to block members with implicit offsets. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:50:00 +11:00
Timothy Arceri	13f6c788eb	glsl: move process_qualifier_constant() to ast_type.cpp We will make use of this function being here in the following patch. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:55 +11:00
Timothy Arceri	52caeee7e7	glsl: add transform feedback built-in constants These are new built-ins added by ARB_enhanced_layouts. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:51 +11:00
Timothy Arceri	8765a9e0fe	glsl: generate named interface block names correctly Firstly this updates the named interface lowering pass to store the interface without the arrays removed. Note we need to remove the arrays in the interface/varying matching code to not regress things but in future this should be fixed further as it would seem we currently successfully match interface blocks with different array sizes. Since we now know if the interface was an array we can reduce the IR flags from_named_ifc_block_array and from_named_ifc_block_nonarray to just from_named_ifc_block. Next rather than having a different code path for named interface blocks in program_resource_visitor we just make use of the one used by UBOs this allows us to now handle arrays of arrays correctly. Finally we add a new param to the recursion function named_ifc_member this is because we only want to process a single member at a time. Note that this is also the glsl_struct_field from the original ifc type before lowering rather than the type from the lowered variable. This fixes a bug in Mesa where we would generate the names like WithInstArray[0].g[0][0] when it should be WithInstArray[0].g[0] for the following interface. out WithInstArray { float g[3]; } instArray[2]; Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:47 +11:00
Timothy Arceri	7ebc3deaad	glsl: Fix segfault when lhs is error_type in TCS It seems expected that both lhs and rhs could be of type error_type in this code however the TCS case wasn't expecting it. Fixes segfault in an enhanced layouts GL CTS test. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 12:49:42 +11:00
Dave Airlie	c9367c13ca	docs: update softpipe status for shader_image_load_store. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:30 +10:00
Dave Airlie	eb9ad9faa3	softpipe: add image support to softpipe (v3) This adds support for ARB_shader_image_load_store to softpipe. v2: add RESQ support (Ilia) v3: constify, cleanup internals, add some comments (Brian). Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:16 +10:00
Dave Airlie	0d1f679ded	draw: add support for passing images to vs/gs shaders. This just adds support for passing through images to the tgsi execution stage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:11 +10:00
Dave Airlie	22d1296013	tgsi: add support for image operations to tgsi_exec. (v2.1) This adds support for load/store/atomic operations on images along with image tracking support. v2: add RESQ support. (Ilia) v2.1: constify interface (Brian) split get_image_coord_dim (Brian) Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:05 +10:00
Dave Airlie	493eab7679	softpipe: add support for explicit early depth testing ARB_shader_image_load_store adds support for explicit early depth testing. However we need to make sure we don't overwrite values using the shader written values in this case. This fixes early depth testing in softpipe to conform with those requirements. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:54 +10:00
Dave Airlie	827393b76f	tgsi: introduce NonHelperMask This is a mask of which of the current 2x2 grid are non-helper invocations. This allows us to mask off the helper invocations later for the image operations. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:50 +10:00
Dave Airlie	ca180c09bb	tgsi_exec: handle execmask when doing indirect lookups Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:46 +10:00
Dave Airlie	1ff4cc0535	tgsi_exec: add support for up to 3 address registers (v2) v2: be consistent with other definitions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:13:08 +10:00
Matt Turner	6702f1acde	nir: Propagate negates up multiplication chains. total instructions in shared programs: 7112159 -> 7088092 (-0.34%) instructions in affected programs: 1374915 -> 1350848 (-1.75%) helped: 7392 HURT: 621 GAINED: 2 LOST: 2	2016-03-30 13:12:34 -07:00
Matt Turner	a74fc3fe8a	i965: Don't inline intel_batchbuffer_require_space(). It's called by the inline intel_batchbuffer_begin() function which itself is used in BEGIN_BATCH. So in sequence of code emitting multiple packets, we have inlined this ~200 byte function multiple times. Making it an out-of-line function presumably improved icache usage. Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on Ivybridge. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2016-03-30 13:12:34 -07:00
Christian König	1faca438bd	r600: ignore PIPE_BIND_LINEAR in *_is_format_supported Similar to radeonsi linear layout should work for all not compressed or depth/stencil formats. Fixes issues with VDPAU on r600. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2016-03-30 20:00:27 +02:00
Thomas Hindoe Paaboel Andersen	9a73f5728e	st/vdpau: correct null check The null check of result was the wrong way around. Also, move memset and dereference of result after the null check. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-30 20:00:27 +02:00
Brian Paul	4541a78502	docs: remove docs/COPYING which contains GPL license There hasn't been GPL code in Mesa for a long time now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-30 11:38:51 -06:00
Samuel Pitoiset	bb37886f75	glsl: add missing types for buffer images Type of GLSL_SAMPLER_DIM_BUF can be sampler or image. Spotted while trying to run dEQP tests related to ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-30 19:01:33 +02:00
Lars Hamre	6773128bbf	glsl: invalidate float suffixes for GLSL 1.10 and GLSL ES 1.00 Float suffixes are not allowed in GLSL 1.10 nor GLSL ES 1.00. Fixes the following piglit tests: tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-capital-f.vert tests/spec/glsl-1.10/compiler/literals/invalid-float-suffix-f.vert v2: modify error message v3: parse the float instead of returning an ERROR_TOK v4: (by Ken) Change to is_version(120, 300) to avoid breaking ES3 shaders; update commit message accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81585 Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-29 21:26:34 -07:00
Jason Ekstrand	cf2257069c	nir/spirv: Set a default number of invocations for geometry shaders The SPIR-V spec says geometry shaders are supposed to have one invocation by default. The execution mode is only required if there are multiple invocations.	2016-03-29 20:30:27 -07:00
Roland Scheidegger	2d3b8aefda	tgsi: (trivial) only verify target for is_tex instructions d3d10 state tracker does not encode (valid) target (only offsets are really used from the texture bits), since that information always comes from the sview dcl, and not the instruction (note the meaning of target is actually slightly different between gl and d3d10 in any case, because d3d10 target does never include shadow bit). Also move the msaa sampler identification as well - would need to set that on the sview not sampler, so while this does not fix it make it at least obvious it won't work with sample instructions.	2016-03-30 04:26:54 +02:00
Ilia Mirkin	553e37aa33	mesa: allow mutable buffer textures to back GL ES images Since there is no way to create immutable texture buffers in GL ES, mutable buffer textures are allowed to back images. See issue 7 of the GL_OES_texture_buffer specification. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-29 21:41:03 -04:00
Brian Paul	513384d7e8	mesa: make _mesa_prepare_mipmap_level() static No longer called from any other file. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-29 18:13:46 -06:00
Brian Paul	ed39de90f1	meta: use _mesa_prepare_mipmap_levels() The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is not needed. It only served to undo the GL_TEXTURE_1D_ARRAY height/depth change that was made before the call to prepare_mipmap_level(). Said another way, regardless of how the meta code manipulates the height/depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are correctly set up by _mesa_prepare_mipmap_levels(). Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver and testing with piglit. v2 (idr): Early out of the mipmap generation loop when dstImage is NULL. This can occur for immutable textures that have a limited range of levels or in the presence of memory allocation failures. Fixes arb_texture_view-mipgen on Intel platforms. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	bab0752a80	docs: add HTTP link for Mesa downloads Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92628 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-29 18:13:46 -06:00
Brian Paul	5c85c3be26	tgsi: simplify tgsi_shader_info::is_msaa_sampler checking We assert that fullinst->Instruction.Texture != 0 above so no need to check it in the conditional. We also have the fullinst->Texture.Texture value in a local variable, so use it. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	86e1768c13	tgsi: collect texture sampler target info in tgsi_scan_shader() Texture sample instructions specify a sampler unit and texture target such as "1D", "2D", "CUBE", etc. Sampler view declarations also specify the sampler unit and texture target. This patch checks that the texture instructions agree with the declarations and collects the texture target type for each sampler unit. v2: only compare instruction's texture target to the sampler view declaration target if the instruction is a TEX instruction, not a SAMPLE instruction. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:46 -06:00
Brian Paul	6775268b61	gallium/docs: s/gven/given/	2016-03-29 18:13:46 -06:00
Brian Paul	75b713455c	xlib: add support for GLX_ARB_create_context This adds the glXCreateContextAttribsARB() function for the xlib/swrast driver. This allows more piglit tests to run with this driver. For example, without this patch we get: $ bin/fbo-generatemipmap-1d -auto piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL version not equal to the default value 1.0 piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context piglit: info: Failed to create any GL context PIGLIT: {"result": "skip" } Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:45 -06:00
Brian Paul	d8d029f22b	st/mesa: simplify st_generate_mipmap() The whole st_generate_mipmap() function was overly complicated. Now we just call the new _mesa_prepare_mipmap_levels() function to prepare the texture mipmap memory, then call the generate function which fills in the texture images. This fixes a failed assertion in llvmpipe/softpipe which is hit with the new piglit generatemipmap-base-change test. Also fixes some device errors (format mismatches) with the VMware svga driver. v2: fix a comment typo, per Sinclair Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-29 18:13:45 -06:00
Brian Paul	105fe52784	mesa: new _mesa_prepare_mipmap_levels() function for mipmap generation Simplifies the loops in generate_mipmap_uncompressed() and generate_mipmap_compressed(). Will be used in the state tracker too. Could probably be used in the meta code. If so, some additional clean-ups can be done after that. v2: use unsigned types instead of GLuint, per Ian Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-29 18:13:45 -06:00
Kenneth Graunke	d4a5a61d44	i965: Don't use CUBE wrap modes for integer formats on IVB/BYT. There is no linear filtering for integer formats, so we should always be using CLAMP_TO_EDGE mode. Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit `0faf26e6a0`). This workaround doesn't appear to be necessary on any other hardware; I haven't found any documentation mentioning errata in this area. v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]	2016-03-29 15:43:18 -07:00
Kenneth Graunke	f8c69fbb54	Revert "i965: Set address rounding bits for GL_NEAREST filtering as well." This reverts commit `60d6a8989a`. It's pretty sketchy, and apparently regressed a bunch of dEQP tests on Sandybridge.	2016-03-29 15:35:07 -07:00
Rovanion Luckey	7087e0ab27	gallium: Format code in pb_buffer_fenced.c according to style guide. This is a tiny housekeeping patch which does the following: * Replaced tabs with three spaces. * Formatted oneline and multiline code comments. Some doxygen comments weren't marked as such and some code comments were marked as doxygen comments. * Spaces between if- and while-statements and their parenthesis. According to the mesa coding style guidelines. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:44:11 -06:00
Charmaine Lee	2d8df0306b	svga: emit sampler declarations in the helper function for non vgpu10 With commit `dc9ecf58c0`, we are now getting the sampler target from the sampler view declaration. But since a sampler view declaration can be defined after a sampler declaration, we need to emit the sampler declarations in the pre-helpers function, otherwise, the sampler target might not have been defined yet for the sampler declaration. Fixes viewperf maya-03 and various gl trace regressions in hwv11. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:35:09 -06:00
Brian Paul	96e0894106	svga: avoid freeing non-malloced memory svga_shader_expand() will fall back to using non-malloced memory for emit.buf if malloc fails. We should check if the memory is malloced before freeing it in the error path of svga_tgsi_vgpu9_translate. Original patch by Thomas Hindoe Paaboel Andersen <phomes@gmail.com>. Remove trivial svga_destroy_shader_emitter() function, by BrianP. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-29 13:35:08 -06:00
Samuel Pitoiset	9d57c84994	nvc0/ir: move load/store lowering pass to handleLDST() Having all this code in a big switch is not really a good practice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-29 19:55:51 +02:00
Christian König	cc68dc2b5e	st/mesa: implement new DMA-buf based VDPAU interop v2 Avoid using internal structures from another API. v2: rebase and moved includes so they don't cause problem when VDPAU isn't installed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:22 +02:00
Christian König	bdeb22b7b6	st/vdpau: implement the new DMA-buf based interop v2 That should allow us to get away from passing internal structures around. v2: rebased Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:18 +02:00
Christian König	0042aa508e	st/vdpau: move FormatRGBAToPipe into the interop We are going to need that in the Mesa state tracker as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:14 +02:00
Christian König	faba96bc60	st/vdpau: add new interop interface Use DMA-buf for the VDPAU interop interface instead of using internal structures. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:29:10 +02:00
Christian König	d180de3532	st/vdpau: use linear layout for output surfaces Works around a bug in radeonsi and tiling is actually not very beneficial in this use case. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-29 17:28:43 +02:00
Christian König	7eb5e5b8b4	radeonsi: ignore PIPE_BIND_LINEAR in si_is_format_supported v2 Linear layout should work for all not compressed or depth/stencil formats. v2: restrict it a bit more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-29 17:28:35 +02:00
Ilia Mirkin	9286cbdd1e	st/mesa: enable OES_texture_buffer when all components available OES_texture_buffer combines bits from a number of desktop extensions. When they're all available, turn it on. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-29 10:15:21 -04:00
Adam Jackson	5e1aec6db0	glapi/glx: Mark the indirect swapped dispatch functions _X_COLD A modest size savings: text data bss dec hex filename 264143 15608 232 279983 445af libglx.so.before 254303 15608 232 270143 41f3f libglx.so.after Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-29 10:10:57 -04:00
Adam Jackson	ea0f62e45e	glapi/glx: Sync some additional error checking from xserver Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-29 10:10:57 -04:00
Jordan Justen	f56f538ce4	anv/gen7: Fix command parser version test with indirect dispatch Caught-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 22:30:33 -07:00
Alejandro Piñeiro	dcd41ca87a	glsl: raise warning when using uninitialized variables v2: * Take into account out varyings too (Timothy Arceri) * Fix style (Timothy Arceri) * Use a new ast_expression variable, instead of an ast_expression::hir new parameter (Timothy Arceri) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-29 07:28:57 +02:00
Alejandro Piñeiro	8568d02498	glsl: add is_lhs bool on ast_expression Useful to know if an expression is the recipient of an assignment or not, that would be used to (for example) raise warnings of "use of uninitialized variable" without getting a false positive when assigning first a variable. By default the value is false, and it is assigned to true on the following cases: * The lhs assignments subexpression * At ast_array_index, on the array itself. * While handling the method on an array, to avoid the warning calling array.length * When computing the cached test expression at test_to_hir, to avoid a duplicate warning on the test expression of a switch. set_is_lhs setter is added, because in some cases (like ast_field_selection) the value needs to be propagated on the expression tree. To avoid doing the propagation if not needed, it skips if no primary_expression.identifier is available. v2: use a new bool on ast_expression, instead of a new parameter on ast_expression::hir (Timothy Arceri) v3: fix style and some typos on comments, initialize is_lhs default value on constructor, to avoid a c++11 feature (Ian Romanick) v4: some tweaks on comments (Timothy Arceri) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-29 07:28:57 +02:00
Jason Ekstrand	35e2e96b30	nir: Add a helper for getting the current block from a cursor Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	be98c47528	nir/lower_out_to_temp: Add an "entrypoint" parameter Previously, the pass assumed that the entrypoint would be whatever function happened to have the name "main". We really shouldn't trust in the function names. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	31a5bec93f	nir/lower_out_to_temp: Steal the output's constant initializer Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	38de85f9a5	nir: Add a helper for getting the unique function in a shader Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	49be812be6	nir/sweep: Sweep function parameters They are no longer in the list of local variables so we need to explicitly sweep them. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	1be4c61c95	nir/builder: Add a helper for creating undefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	6a2479d618	nir/builder: Add a helper for storing to variable derefs Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	77e2ac1da7	nir/builder: Add a helper for building fdot instructions Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	da422663a6	nir: Add a variable_foreach_safe helper Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Jason Ekstrand	731870fbe3	nir/Makefile: Fix alphabetization Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-03-28 18:32:48 -07:00
Ilia Mirkin	b4c0c514b1	mesa: add OES_texture_buffer and EXT_texture_buffer support Allow ES 3.1 contexts to access the texture buffer functionality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:29:29 -04:00
Ilia Mirkin	720670a615	glsl: add OES_texture_buffer and EXT_texture_buffer support Expose the samplerBuffer/imageBuffer types, and allow the various functions to operate on them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:20:49 -04:00
Ilia Mirkin	74b76c08a3	mesa: add OES_texture_buffer and EXT_texture_buffer extension to table We need to add a new bit since the GL ES exts require functionality from a combination of texture buffer extensions as well as images (for imageBuffer) support. Additionally, not all GPUs support all the texture buffer functionality (e.g. rgb32 isn't supported by nv50). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-28 20:19:14 -04:00
Ilia Mirkin	659beca666	mesa: properly return GetTexLevelParameter queries for buffer textures This fixes all failures with dEQP tests in this area. While ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co should not be supported, GL 3.1 reverses this decision and allows all of these queries there. Conversely, there is no text that forbids the buffer-specific queries from being used with non-buffer images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-28 20:18:46 -04:00
Kenneth Graunke	4ed4a2af86	glsl: Delete initialized field from uniform storage test. Timothy deleted this field. Fixes "make check". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-28 17:02:00 -07:00
Jordan Justen	8dbfa265a4	anv/gen7: DispatchIndirect requires cmd parser 5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	1a3adae84a	anv/gen7: Save kernel command parser version Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	f60683b32a	anv: Invalidate state cache before L3 partitioning set-up. Port `10d84ba9f0` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Jordan Justen	5879cb0251	anv: Fix cache pollution race during L3 partitioning set-up. Port `0aa4f99f56` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 17:01:35 -07:00
Timothy Arceri	86d87d1047	mesa: remove initialized field from uniform storage The only place this was used was in a gallium debug function that had to be manually enabled. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-29 09:59:03 +11:00
Samuel Pitoiset	b8b3af2932	nvc0: use a different offset for buffers and surfaces To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-29 00:47:28 +02:00
Kenneth Graunke	60d6a8989a	i965: Set address rounding bits for GL_NEAREST filtering as well. Yuanhan Liu decided these were useful for linear filtering in commit `76669381` (circa 2011). Prior to that, we never set them; it seems he tried to preserve that behavior for nearest filtering. It turns out they're useful for nearest filtering, too: setting these fixes the following dEQP-GLES3 tests: functional.fbo.blit.rect.nearest_consistency_mag functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_min functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x 
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y Apparently, BLORP has always set these bits unconditionally. However, setting them unconditionally appears to regress tests using texture projection, 3D samplers, integer formats, and vertex shaders, all in combination, such as: functional.shaders.texture_functions.textureprojlod.isampler3d_vertex Setting them on Gen4-5 appears to regress Piglit's tests/spec/arb_sampler_objects/framebufferblit. Honestly, it looks like the real problem here is a lack of precision. I'm just hacking around problems here (as embarrassing as it is). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-28 15:28:58 -07:00
Kenneth Graunke	0faf26e6a0	i965: Always use BRW_TEXCOORDMODE_CUBE when seamless filtering. When using seamless cube map mode and NEAREST filtering, we explicitly overrode the wrap modes to CLAMP_TO_EDGE. This was to implement the following spec text: "If NEAREST filtering is done within a miplevel, always apply wrap mode CLAMP_TO_EDGE." However, textureGather() ignores the sampler's filtering mode, and instead returns the four pixels that would be blended by LINEAR filtering. This implies that we should do proper seamless filtering, and include pixels from adjacent cube faces. It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE overrides. Normal cube map sampling works by first selecting the face, and then nearest filtering fetches the closest texel. If the nearest texel was on a different face, then that face would have been chosen. So it should always be within the face anyway, which effectively performs CLAMP_TO_EDGE. Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-28 15:25:04 -07:00
Kenneth Graunke	72473658c5	i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs. Our driver uses the brw_render_cache mechanism to track buffers we've rendered to and are about to sample from. Previously, we did a single PIPE_CONTROL with the following bits set: - Render Target Flush - Depth Cache Flush - Texture Cache Invalidate - VF Cache Invalidate - Instruction Cache Invalidate - CS Stall This combined both "top of pipe" invalidations and "bottom of pipe" flushes, which isn't how the hardware is intended to be programmed. The "top of pipe" invalidations may happen right away, without any guarantees that rendering using those caches has completed. That rendering may continue altering the caches. The "bottom of pipe" flushes do wait for the rendering to complete. The CS stall also prevents further work from happening until data is flushed out. What we wanted to do was wait for rendering to complete, flush the new data out of the render and depth caches, wait, then invalidate any stale data in read-only caches. We can accomplish this by doing the "bottom of pipe" flushes with a CS stall, then the "top of pipe" invalidations as a second PIPE_CONTROL. The flushes will wait until the rendering is complete, and the CS stall will prevent the second PIPE_CONTROL with the invalidations from executing until the first is done. Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo subtests on Braswell and Skylake. These tests hit the meta PBO texture upload path, which binds the PBO as a texture and samples from it, while rendering to the destination texture. The tests then sample from the texture. For now, we leave Gen4-5 alone. It probably needs work too, but apparently it hasn't even been setting the (G45+) TC invalidation bit at all... v2: Add Sandybridge post-sync non-zero workaround, for safety. 
Cc: mesa-stable@lists.freedesktop.org Suggested-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-28 15:23:56 -07:00
Kenneth Graunke	de505f7d7b	i965: Whack UAV bit when FS discards and there are no color writes. dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no framebuffer attachments, using a shader that discards based on gl_FragCoord. It uses occlusion queries to inspect whether pixels are rendered or not. Unfortunately, the hardware is not dispatching any pixel shaders, so discards never happen, and the full quad of pixels increments PS_DEPTH_COUNT, making the occlusion query results bogus. To understand why, we have to delve into the WM_INT internal signalling mechanism's formulas. The "WM_INT::Pixel Shader Kill Pixel" signal is defined as: 3DSTATE_WM::ForceKillPixel == ON \|\| (3DSTATE_WM::ForceKillPixel != Off && !WM_INT::WM_HZ_OP && 3DSTATE_WM::EDSC_Mode != PREPS && (WM_INT::Depth Write Enable \|\| WM_INT::Stencil Write Enable) && ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (3DSTATE_PS_EXTRA::PixelShaderKillsPixels \|\| 3DSTATE_PS_EXTRA:: oMask Present to RenderTarget \|\| 3DSTATE_PS_BLEND::AlphaToCoverageEnable \|\| 3DSTATE_PS_BLEND::AlphaTestEnable \|\| 3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable)) Because there is no depth or stencil buffer, writes to those buffers are disabled. So the highlighted condition is false, making the whole "Kill Pixel" condition false. 
This then feeds into the following "WM_INT::ThreadDispatchEnable" condition: 3DSTATE_WM::ForceThreadDispatch != OFF && !WM_INT::WM_HZ_OP && 3DSTATE_PS_EXTRA::PixelShaderValid && (3DSTATE_PS_EXTRA::PixelShaderHasUAV \|\| WM_INT::Pixel Shader Kill Pixel \|\| WM_INT::RTIndependentRasterizationEnable \|\| (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT && 3DSTATE_PS_BLEND::HasWriteableRT) \|\| (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF && (WM_INT::Depth Test Enable \|\| WM_INT::Depth Write Enable)) \|\| (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) \|\| (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable \|\| WM_INT::Depth Write Enable \|\| WM_INT::Stencil Test Enable))) Given that there's no depth/stencil testing, no writeable render target, and the hardware thinks kill pixel doesn't happen, all of these conditions are false. We have to whack some bit to make PS invocations happen. There are many options. Curro suggested using the UAV bit. There's some precedent for doing that - we set it for fragment shaders that do SSBO/image/atomic writes when no color buffer writes are enabled. We can simply include discard here too. Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests. v2: Add a comment suggested and written by Jason Ekstrand. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-28 14:36:47 -07:00
Jason Ekstrand	433cf90650	nir/spirv: Remove the NoContraction hack NIR now just handles this for us by not fusing if the multiply is marked as exact.	2016-03-28 13:07:39 -07:00
Jason Ekstrand	5d9afb65a6	i965/peephole_ffma: Only match a mul+add if none of the ops are exact	2016-03-28 13:07:39 -07:00
Jason Ekstrand	035f66025b	nir/search: Don't match inexact expressions with exact subexpressions In the first pass of implementing exact handling, I made a mistake with search-and-replace. In particular, we only really handled exact/inexact on the root of the tree. Instead, we need to check every node in the tree for an exact/inexact match. As an example of this, consider the following GLSL code: precise float a = b + c; if (a < 0) { do_stuff(); } In that case, only the add will be declared "exact" and an expression that looks for "b + c < 0" will still match and replace it with "b < -c" which may yield different results. The solution is to simply bail if any of the values are exact when matching an inexact expression.	2016-03-28 13:07:39 -07:00
Rhys Kidd	668b6ddfc5	vc4: Remove unused include from vc4_nir_lower_txf_ms.c Found with grep and inspection. Test compiled on RPi hw. Assists any future effort to remove TGSI as an intermediate stage. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-03-28 11:51:11 -07:00
Adam Jackson	2b8492d63e	glapi/glx: Treat xserver generated targets as .PHONY Meaning, always rebuild them when asked instead of bothering to look at timestamps (and then wondering why nothing happened when you said make). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:37:12 -04:00
Adam Jackson	c2f0bc2537	glapi/glx: Thunk non-ABI calls through GetProcAddress Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:37:12 -04:00
Adam Jackson	ce3f0b23d1	glapi/glx: Emit direct GL calls instead of dispatch lookup Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:28:51 -04:00
Adam Jackson	c0a9cbea4d	glx: Unbreak generating some of the xorg glx headers Broken by: commit `9ace0b5422` Author: Dylan Baker <baker.dylan.c@gmail.com> Date: Wed May 20 15:49:11 2015 -0700 glapi: glX_proto_size.py: use argparse instead of getopt Which changed most, but not all, callers to use --header-tag instead of -h. Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-03-28 14:28:36 -04:00
Bas Nieuwenhuizen	dd5f0950e4	mesa/st: Fix NULL access if no fragment shader is bound Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-28 18:02:07 +02:00
Rob Clark	b4c72b792c	freedreno/ir3: fix for load_front_face intrinsic Seems like trying to widen in the same instruction as the add.s does a non-sign-extending widen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-28 10:19:53 -04:00
Rob Clark	3ca034cada	freedreno/ir3: fix compiler warn Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-28 10:19:09 -04:00
Ilia Mirkin	b9f1affb2e	nvc0: make sure to disable fetches from previously-set VBOs when blitting We disable the vertex attributes, but also disable the VBO fetch details as well, just in case. Not known to fix anything. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 08:36:34 -04:00
Ilia Mirkin	41100b6b44	nvc0: disable primitive restart and index bias during blits Back in the dawn of time, we used to do immediate uploads for the vertex data, and all was well. However Maxwell dropped support for immediate vertex data, so we started feeding in a VBO (in all cases). But we forgot to disable some things that apply in such cases, specifically primitive restart and index bias. The latter was causing WoW and other Blizzard games trouble as they use a pattern where they draw with a base vertex (aka index bias), followed by texture uploads (aka blits, internally). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <nouveau@karolherbst.de>	2016-03-28 08:35:38 -04:00
Ilia Mirkin	f667d15561	nvc0/ir: fix picking of coordinates from tex instruction for textureGrad On Fermi, there's an argument in front of the coords that combines array and indirect handle, while on Kepler the array and the indirect handle are separate (and in front of the coords). We were previously only accounting for the array bit of it, if there were an indirect access it wouldn't be counted in the formula. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-28 08:35:38 -04:00
Ilia Mirkin	6711f159d9	nv50/ir: saturate depth writes Apparently there's no post-FS clamping logic, so we have to do this by hand. The depth will never be outside of the 0..1 range, even on floating point zeta buffers, so this should be safe. Fixes dEQP-GLES3.functional.fbo.depth.clamp. which tests writing invalid values on various zeta buffer formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 08:35:38 -04:00
Marek Olšák	6262d6125a	gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2) v2: move the nr_cbufs check above the loop Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)	2016-03-28 00:46:23 +02:00
Marek Olšák	21c479256a	st/mesa: only minify height if target != 1D array in st_finalize_texture The st_texture_object documentation says: "the number of 1D array layers will be in height0" We can't minify that. Spotted by luck. No app is known to hit this issue. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-28 00:44:45 +02:00
Miklós Máté	50d653c2bb	mesa: optimize out the realloc from glCopyTexImagexD() v2: comment about the purpose of the code v3: also compare texFormat, add a perf debug message, formatting fixes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	baab345b19	st/mesa: fix handling the fallback texture This fixes crash when post-processing is enabled in SW:KotOR. v2: fix const-ness v3: move assignment into the if() block Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	920fbecf57	st/mesa: enable GL_ATI_fragment_shader Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	dee274477f	st/mesa: implement GL_ATI_fragment_shader v2: fix arithmetic for special opcodes, fix fog state, cleanup v3: simplify handling of special opcodes, fix rebinding with different textargets or fog equation, lots of formatting fixes v4: adapt to the compile early, fix later architecture, formatting fixes Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	d71c1e9e54	program: add ATI_fragment_shader to shader stages list Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Miklós Máté	e2d5a6fac5	mesa: optionally associate a gl_program to ATI_fragment_shader the state tracker will use it Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 19:58:33 +02:00
Edward O'Callaghan	11bd53933e	gallium/p_context.h: Make comment more readable Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:03:04 +02:00
Edward O'Callaghan	2df141087a	mesa/st: Remove GLSLVersion clamping While here, remove intermediate glsl_feature_level variable. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:36 +02:00
Edward O'Callaghan	ca22d2f1fd	radeon/r600: Fix return type in failure branch Commit `d4e847ea` introduced a warning about making an integer from a pointer without a cast, fix it here. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:35 +02:00
Edward O'Callaghan	1fb05a9a0c	radeon/r600_query.c: Minor style fix Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-27 18:00:35 +02:00
Dave Airlie	fc3b000fef	virgl: drop next shader property for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-26 17:50:32 +10:00
Jason Ekstrand	6d658e9bd5	i965: Allow mul+add fusing again	2016-03-25 21:35:41 -07:00
Jason Ekstrand	fbb9e1f008	spirv/alu: Add support for the NoContraction decoration	2016-03-25 21:35:41 -07:00
Jason Ekstrand	00fa795cd3	spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes This is similar to the way that regular ALU operations are handled.	2016-03-25 21:35:41 -07:00
Jason Ekstrand	98522c1853	nir/spirv: Get rid of the spirv2nir helper binary This was useful once upon a time but now that we have a real Vulkan driver to run our SPIR-V binaries through, there's really no point.	2016-03-25 21:35:41 -07:00
Nanley Chery	0e82896a11	anv/blit2d: Add a function to create an ImageView This function differs from the open-coded implementation in that the ImageView's width is determined by the caller and is not unconditionally set to match the number of texels within the surface's pitch. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-25 17:33:50 -07:00
Nanley Chery	4eab37d6cd	anv/image: Enable specifying a surface's minimum pitch This is required to create multiple, horizontally adjacent, max-width images from one blit2d surface. This is also required for more accurate width specification of surfaces within a larger surface (which is seen as the smaller surface's enclosing region). Note that anv_image_create_info::stride has been unused since commit `b369389640`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-25 17:33:40 -07:00
Timothy Arceri	8683d54d2b	glsl: reduce buffer block duplication This reduces some of the craziness required for handling buffer blocks. The problem is each shader stage holds its own information about a block in memory, we were copying that information to a program wide list but the per stage information remained meaning when a binding was updated we needed to update all versions of it. This changes the per stage blocks to instead point to a single version of the block information in the program list. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-26 09:26:30 +11:00
Jason Ekstrand	38250a9ca3	i965/vec4: Get rid of a stray predicate inverse in opquantizef16 This fixes 30 opquantize CTS tests on HSW	2016-03-25 14:37:37 -07:00
Jason Ekstrand	13bad493b4	nir/algebraic: Get rid of a redundant copy of fdiv lowering	2016-03-25 14:04:05 -07:00
Jason Ekstrand	08fe89864b	nir/algebraic: Add better lowering of ldexp	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b75d770963	nir/builder: Simplify nir_ssa_undef a bit	2016-03-25 14:04:05 -07:00
Jason Ekstrand	ab31951bef	nir/spirv: Use the nir_ssa_undef helper from nir_builder	2016-03-25 14:04:05 -07:00
Jason Ekstrand	d2eee52a65	nir/builder: Add a bit size field to nir_ssa_undef	2016-03-25 14:04:05 -07:00
Jason Ekstrand	b50f7f0011	nir: Add a better comment for INTRINSIC_RANGE	2016-03-25 14:04:05 -07:00
Jason Ekstrand	add8c837b5	nir/glsl: Stop carrying a pointer to the nir_shader in the visitor	2016-03-25 14:04:05 -07:00
Brian Paul	a8e5edaadf	st/xa: emit sampler view declarations in shaders Fixes recent regressions with the VMware gallium driver. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2016-03-25 14:53:59 -06:00
Tim Rowley	74a04840e5	swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32)	2016-03-25 14:45:40 -05:00
Tim Rowley	93c1a2dedf	swr: [rasterizer core] NUMA optimizations... - Affinitize hot-tile memory to specific NUMA nodes. - Only do BE work for macrotiles associated with the NUMA node	2016-03-25 14:45:40 -05:00
Tim Rowley	090be2e434	swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage.	2016-03-25 14:45:40 -05:00
Tim Rowley	0767e820fd	swr: [rasterizer core] Fix Compute workitem retirement	2016-03-25 14:45:40 -05:00
Tim Rowley	813e89c0cc	swr: [rasterizer core] Cleanup state ring arena after last draw that references it completes Rather than waiting for the API thread to re-use it.	2016-03-25 14:45:40 -05:00
Tim Rowley	83822d7ed5	swr: [rasterizer jitter] add missing include for llvm jitevents	2016-03-25 14:45:40 -05:00
Tim Rowley	51549912d1	swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB). With global allocator this doesn't seem to affect performance at all. Overall memory consumption drops by up to 85%.	2016-03-25 14:45:40 -05:00
Tim Rowley	ed5b953919	swr: [rasterizer core] One last pass at Arena optimizations	2016-03-25 14:45:40 -05:00
Tim Rowley	ee6be9e92d	swr: [rasterizer core] CachedArena optimizations Reduce list traversal during Alloc and Free. Add ability to have multiple lists based on alloc size (not used for now)	2016-03-25 14:45:39 -05:00
Tim Rowley	68314b6769	swr: [rasterizer jitter] support llvm-svn	2016-03-25 14:45:39 -05:00
Tim Rowley	ec9d4c4b37	swr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation.	2016-03-25 14:45:39 -05:00
Tim Rowley	12ce9d9aa1	swr: [rasterizer] more arena work	2016-03-25 14:45:39 -05:00
Tim Rowley	4893224e28	swr: [rasterizer core] Add clipping against user clip distances in the NullPS backend.	2016-03-25 14:45:39 -05:00
Tim Rowley	700a5b06e0	swr: [rasterizer core] Arena optimizations - preparing for global allocator.	2016-03-25 14:45:39 -05:00
Tim Rowley	5899076b6b	swr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclaim of DC Keeps overall memory consumption lower. Also, remove unused knobs.	2016-03-25 14:45:39 -05:00
Tim Rowley	7390418441	swr: [rasterizer core] Add clipping of user clip planes in clipper.	2016-03-25 14:45:39 -05:00
Tim Rowley	4b4547a721	swr: [rasterizer] Reduce max in-flight draws to 96 (by default)	2016-03-25 14:45:39 -05:00
Tim Rowley	9111d63228	swr: [rasterizer] Fix run-time check asserts One innocuous (uninitialized variable), and one not so innocuous (stack corruption).	2016-03-25 14:45:39 -05:00
Tim Rowley	257db3610a	swr: [rasterizer jitter] signed immediate builder	2016-03-25 14:45:39 -05:00
Tim Rowley	b958aea78a	swr: [rasterizer common] changes for cygwin	2016-03-25 14:45:39 -05:00
Tim Rowley	e1222ade00	swr: [rasterizer] code styling and update copyrights	2016-03-25 14:45:14 -05:00
Tim Rowley	c75314ec67	swr: [rasterizer core] Guard against enqueuing work to invalid hot tiles	2016-03-25 14:43:15 -05:00
Tim Rowley	fee56fda6f	swr: [rasterizer] Stop setting viewport size to larger than hottile array Guard against enqueuing work to invalid tiles	2016-03-25 14:43:14 -05:00
Tim Rowley	e374d2d24b	swr: [rasterizer] Discard work + misc fixes	2016-03-25 14:43:14 -05:00
Tim Rowley	542d7dec7b	swr: [rasterizer] remove use of BYTE type	2016-03-25 14:43:14 -05:00
Tim Rowley	be4c558d01	swr: [rasterizer core] Fix crash that can occur when switching contexts	2016-03-25 14:43:14 -05:00
Tim Rowley	51a11658d9	swr: [rasterizer] remove unused knob	2016-03-25 14:43:14 -05:00
Tim Rowley	61beaa2279	swr: [rasterizer core] subcontext rework	2016-03-25 14:43:14 -05:00
Tim Rowley	0c18900cfb	swr: [rasterizer common] add _simd_s[rl]lv_epi32	2016-03-25 14:43:14 -05:00
Tim Rowley	bef222db22	swr: [rasterizer core] Alleviate potential stack overflow for 32bit builds Move large stack allocations in the GS and clipper into thread local storage.	2016-03-25 14:43:14 -05:00
Tim Rowley	3132f731f8	swr: [rasterizer] remove use of UCHAR and UINT64 types	2016-03-25 14:43:14 -05:00
Tim Rowley	643857f596	swr: [rasterizer] remove use of FLOAT type	2016-03-25 14:43:14 -05:00
Tim Rowley	3252fe3705	swr: [rasterizer] Fix Coverity issues reported by Mesa developers.	2016-03-25 14:43:14 -05:00
Tim Rowley	45d52673c2	swr: [rasterizer] add debug/perf category to knobs	2016-03-25 14:43:13 -05:00
Tim Rowley	1da9c8a970	swr: [rasterizer core] don't assume linux is 64-bit	2016-03-25 14:43:13 -05:00
Tim Rowley	49678803f7	swr: [rasterizer common] remove old unused win32 types	2016-03-25 14:43:13 -05:00
Tim Rowley	aca5513184	swr: [rasterizer jitter] vpermps support	2016-03-25 14:43:13 -05:00
Tim Rowley	bfb954189e	swr: [rasterizer] Add rdtsc buckets support for shaders Pass pointer to core buckets mgr back to sim layer. Add support for RDTSC_START/RDTSC_STOP macros in the builder. Each unique shader now has a unique bucket associated with it, enabling more detailed reporting at the shader level. Currently due to some llvm issue with thread local storage, 64bit runs require single threaded mode.	2016-03-25 14:43:13 -05:00
Tim Rowley	abd4aa68cc	swr: [rasterizer core] backend reorganization	2016-03-25 14:43:13 -05:00
Tim Rowley	13303f3320	swr: [rasterizer core] store blend output in temporary instead of PS output. Fixes additive blend problem with MSAA	2016-03-25 14:26:17 -05:00
Tim Rowley	3f4fba3772	swr: [rasterizer core] Move InitializeHotTiles and corresponding clear code out of threads.cpp.	2016-03-25 14:26:17 -05:00
Tim Rowley	bdd690dc36	swr: [rasterizer jitter] Cleanup use of types inside of Builder. Also, cached the simd width since we don't have to keep querying the JitManager for it.	2016-03-25 14:26:17 -05:00
Tim Rowley	7ead4959a5	swr: [rasterizer jitter] Fix type mismatch on select args for SCATTERPS	2016-03-25 14:26:17 -05:00
Tim Rowley	136988b42b	swr: [rasterizer core] fix rasterizing multisampling with scissor enabled We were not evaluating the scissor edge equations at sample positions.	2016-03-25 14:26:17 -05:00
Tim Rowley	45f0ce168c	swr: [rasterizer core] RingBuffer class for DC/DS Use head/tail ring buffer indices for thread synchronization. 1. SwrWaitForIdle loops until ring is empty. (head == tail) 2. GetDrawContext waits until ring is not full. (head - tail) == Ring Size 3. Draw enqueues by incrementing head. 4. Last worker thread to move past a DC dequeues by incrementing tail. Todo: To reduce contention we can cache the tail in the API thread. For example, if you know you have 64 free entries in the ring then you don't need to keep checking the tail until you used those 64 entries.	2016-03-25 14:26:17 -05:00
Tim Rowley	dd0f9eed8c	swr: [rasterizer] switch assert uses to SWR_ASSERT	2016-03-25 14:26:16 -05:00
Tim Rowley	45a4afa634	swr: [rasterizer core] Split all RECT_LIST draws into 1 RECT per draw Needed until proper RECT_LIST PrimAssembly code is written.	2016-03-25 14:26:16 -05:00
Tim Rowley	3a25185990	swr: [rasterizer] Add string knob type	2016-03-25 14:26:16 -05:00
Jordan Justen	8f3c236674	anv: Use genxml register support for L3 Cache config The programming of the L3 Cache registers should match the previous manually packed LRI values. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-25 00:19:18 -07:00
Jordan Justen	7a03fb9ccb	genxml: Add L3 Cache Control register definitions Based on intel_reg.h (`5912da45a6`) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:49:53 -07:00
Jordan Justen	d353ba8f5f	anv: Add genxml register support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:49:53 -07:00
Jordan Justen	b332013a56	genxml: Add register support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 23:46:59 -07:00
Sonny Jiang	f00c840578	radeonsi: add Polaris PCI IDs Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (Polaris10) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (Polaris11)	2016-03-24 23:08:12 -04:00
Sonny Jiang	f87ed903fb	radeon/vce: disable two pipe mode for Polaris11 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2016-03-24 23:08:04 -04:00
Sonny Jiang	0c5477465f	radeon/vce: add Polaris11 VCE firmware support Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>	2016-03-24 23:07:53 -04:00
Sonny Jiang	42e442d888	radeonsi: add support for Polaris (v2) v2: Polaris chips should be defined after Stoney Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> (v1) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Signed-off-by: Leo Liu <leo.liu@amd.com> (v2 diff) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2 diff)	2016-03-24 23:07:32 -04:00
Sonny Jiang	f5e24b19e8	winsys/amdgpu: addrlib - add Polaris support (v2) v2: fix indentation as noted by Michel Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-24 23:06:39 -04:00
Jason Ekstrand	2c3f95d6aa	Merge remote-tracking branch 'public/master' into vulkan	2016-03-24 17:30:14 -07:00
Kenneth Graunke	511ce2925b	mesa: Check glReadBuffer enums against the ES3 table. From the ES 3.2 spec, section 16.1.1 (Selecting Buffers for Reading): "An INVALID_ENUM error is generated if src is not BACK or one of the values from table 15.5." Table 15.5 contains NONE and COLOR_ATTACHMENTi. Mesa properly returned INVALID_ENUM for unknown enums, but it decided what was known by using read_buffer_enum_to_index, which handles all enums in every API. So enums that were valid in GL were making it past the "valid enum" check. Such targets would then be classified as unsupported, and we'd raise INVALID_OPERATION, but that's technically the wrong error code. Fixes dEQP-GLES31's functional.debug.negative_coverage.get_error.buffer.read_buffer v2: Only call read_buffer_enum_to_index when required (Eduardo). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 16:52:08 -07:00
Nanley Chery	a5dc3c0f02	anv: Sanitize Image extents and offsets Prepare Image extents and offsets for internal consumption by assigning the default values implicitly defined by the spec. Fixes textures on several Vulkan demos in which the VkImageCopy depth is set to zero when copying a 2D image. v2 (Jason Ekstrand): Replace "prep" with "sanitize" Make function static inline Pass structs instead of pointers Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2016-03-24 16:15:00 -07:00
Jason Ekstrand	22b343a8ec	nir: Add a pass to inline functions This commit adds a new NIR pass that lowers all function calls away by inlining the functions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	debf23ec68	nir/builder: Add helpers for easily inserting copy_var intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	79dec93ead	nir: Add return lowering pass This commit adds a NIR pass for lowering away returns in functions. If the return is in a loop, it is lowered to a break. If it is not in a loop, it's lowered away by moving/deleting code as needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	8d61d72524	nir: Add a cursor helper for getting a cursor after any phi nodes Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	18b0166749	nir/builder: Add a helper for inserting jump instructions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	97b663481c	nir/cf: Make extracting or re-inserting nothing a no-op Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	7022a673cd	nir: Add a function for comparing cursors Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	124f229ece	nir/cf: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	364212f1ed	nir: Add a pass to repair SSA form Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	ea98d415e4	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes. As a side benefit, the phi builder actually handles unreachable blocks correctly. The original vars_to_ssa code, because of the way it iterated the blocks and added phi sources, didn't add sources corresponding to predecessors of unreachable blocks. The new strategy employed by the phi builder creates a phi source for each predecessor and should correctly handle unreachable blocks by setting those sources to SSA undefs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	42ddfc611f	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	e4dc82cfcf	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value. v2: Add better documentation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Jason Ekstrand	9a41d94731	util/bitset: Allow iterating over const bitsets Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-24 15:20:44 -07:00
Rob Clark	61c7d20e4f	ttn: remove stray global from header Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-24 16:04:54 -04:00
Samuel Pitoiset	b9c70fcdad	nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER info radeonsi uses this property to make the best decision about which shader to compile, but this is not currently used by our codegen. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-24 18:53:24 +01:00
Kenneth Graunke	d1bb1df87e	mesa: Handle negative length in glPushDebugGroup(). The KHR_debug spec doesn't actually say we should handle this, but that is most likely an oversight - it says to check against strlen and generate errors if length is negative. It appears they just forgot to explicitly spell out that we should then proceed to actually handle it. Fixes crashes from uncaught std::string exceptions in many dEQP-GLES31.functional.debug.error_filters.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:50 -07:00
Kenneth Graunke	028459a00d	mesa: Make glDebugMessageInsert deal with negative length for all types. From the KHR_debug spec, section 5.5.5 (Externally Generated Messages): "If <length> is negative, it is implied that <buf> contains a null terminated string. The error INVALID_VALUE will be generated if the number of characters in <buf>, excluding the null terminator when <length> is negative, is not less than the value of MAX_DEBUG_MESSAGE_LENGTH." This indicates that length should be set to strlen for all types, not just GL_DEBUG_TYPE_MARKER. We want it to be after validate_length() so we still generate appropriate errors. Fixes crashes from uncaught std::string exceptions in many dEQP-GLES31.functional.debug.error_filters.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:45 -07:00
Kenneth Graunke	412e686da9	mesa: Include null terminator in GL_DEBUG_NEXT_LOGGED_MESSAGE_LENGTH. From the KHR_debug spec: "Applications can query the number of messages currently in the log by obtaining the value of DEBUG_LOGGED_MESSAGES, and the string length (including its null terminator) of the oldest message in the log through the value of DEBUG_NEXT_LOGGED_MESSAGE_LENGTH." Because we weren't including the null terminator, many dEQP tests called glGetDebugMessageLog with a bufSize parameter that was 1 too small, and unable to contain the message, so we skipped returning it, failing many cases. Fixes 298 dEQP-GLES31.functional.debug.* tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-24 10:47:29 -07:00
Nicolai Hähnle	6b763c026d	st/mesa: use RGBA instead of BGRA for SRGB_ALPHA This fixes a regression introduced by commit `a8eea696` "st/mesa: honour sized internal formats in st_choose_format (v2)". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94657 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94671 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-24 12:23:31 -05:00
Nicolai Hähnle	7880b81d39	radeonsi: silence a coverity warning The following Coverity warning 5378 tmpl.fetch_args = atomic_fetch_args; 5379 tmpl.emit = atomic_emit; >>> CID 1357115: Uninitialized variables (UNINIT) >>> Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized. 5380 bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl; 5381 bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add"; ... is a false positive, but what the hell. This change should "fix" it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-24 12:23:14 -05:00
Bas Nieuwenhuizen	f96309753b	mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled. This removes any dependency on driver validation of the number of framebuffer samples. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Brian Paul <brianp@vmware.com>	2016-03-24 08:36:43 -06:00
Rob Clark	0bea0e7141	nir: fix dangling ssadef->name ptrs In many places, the convention is to pass an existing ssadef name ptr when constructing/initializing a new nir_ssa_def. But that goes badly (as noticed by garbage in nir_print output) when the original string gets freed. Just use ralloc_strdup() instead, and add ralloc_free() in the two places that would care (not that the strings wouldn't eventually get freed anyways). Also fixup the nir_search code which was directly setting ssadef->name to use the parent instruction as memctx. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-24 08:30:04 -04:00
Jason Ekstrand	4e060d80ff	glsl: Add propagate_invariance to the other makefile This fixes the scons build	2016-03-23 21:12:44 -07:00
Jason Ekstrand	a984e44abd	nir/glsl: Propagate invariant into NIR alu ops Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	028d6ecfe0	glsl/rebalance_tree: Don't handle invariant or precise trees Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	b2209b2333	glsl/opt_algebraic: Don't handle invariant or precise trees Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:07 -07:00
Jason Ekstrand	89b604922d	glsl: Add a pass to propagate the "invariant" and "precise" qualifiers Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	91d6272c2b	nir/alu_to_scalar: Propagate the "exact" bit Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	865e83b9ec	i965/peephole_ffma: Don't fuse exact adds Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	5f39e3e165	nir/cse: Properly handle nir_ssa_def.exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:06 -07:00
Jason Ekstrand	0dbda153aa	nir/algebraic: Flag inexact optimizations Many of our optimizations, while great for cutting shaders down to size, aren't really precision-safe. This commit tries to flag all of the inexact floating-point optimizations so they don't get run on values that are flagged "exact". It's a bit conservative and maybe flags some safe optimizations as unsafe but that's better than missing one. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:02 -07:00
Jason Ekstrand	ed3a029e80	nir/algebraic: Fix fmin detection to match the spec The previous transformation got the arguments to fmin backwards. When NaNs are involved, the GLSL min/max aren't commutative so it matters. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:28:00 -07:00
Jason Ekstrand	89545b1314	nir/algebraic: Get rid of an invalid fxor optimization The fxor opcode is required to return 1.0f or 0.0f but the input variable may not be 1.0f or 0.0f. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:58 -07:00
Jason Ekstrand	3a7cb6534c	nir/algebraic: Allow for flagging operations as being inexact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:55 -07:00
Jason Ekstrand	a6f25fa7d7	nir/search: Propagate exactness into newly created expressions Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:27:52 -07:00
Jason Ekstrand	ded3133d47	nir/builder: Add a flag for setting exact Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	4ff89377d9	nir: Add an "exact" bit to nir_alu_instr Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-23 16:26:34 -07:00
Jason Ekstrand	f849f53990	nir/clone: Export nir_variable_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:11 -07:00
Jason Ekstrand	5fe8959912	nir/clone: Expose nir_constant_clone Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:26:08 -07:00
Jason Ekstrand	c4c373f156	nir: Fix whitespace Reviewed-by: Rob Clark <robclark@gmail.com>	2016-03-23 15:25:53 -07:00
Brian Paul	9a6da49371	docs: use latest libDRM version Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-23 12:56:32 -06:00
Lars Hamre	43c6f3f82f	compiler/glsl: allow sequence op as a const expr in gles 1.0 Allow the sequence operator to be a constant expression in GLSL ES versions prior to GLSL ES 3.0 Fixes the following piglit test: /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert This is similar to the logic from process_initializer() which performs the same check for constant variable initialization with sequence operators. v2: Fixed regression pointed out by Eduardo Lima Mitev Signed-off-by: Lars Hamre <chemecse@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-23 18:13:26 +01:00
Nicolai Hähnle	c4931ae174	radeonsi: fix out-of-bounds indexing of shader images Results are undefined but may not crash. Without this change, out-of-bounds indexing can lead to VM faults and GPU hangs. Constant buffers, samplers, and possibly others will eventually need similar treatment to support GL_ARB_robust_buffer_access_behavior. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-23 11:49:53 -05:00
Nicolai Hähnle	a8f5d11426	radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2) This fixes arb_shader_image_load_store-host-mem-barrier. v2: flush TC L2 for index buffers on <= CIK (Marek) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:19 -05:00
Nicolai Hähnle	fc94bc2986	st/mesa: add missing MemoryBarrier bits and some explanations Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:15 -05:00
Nicolai Hähnle	b15b1faefd	gallium: add PIPE_BARRIER_STREAMOUT_BUFFER Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-23 11:48:02 -05:00
Marek Olšák	b8ec205515	radeonsi: fix 2D array MSAA failures since image support landed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-23 12:14:15 +01:00
Jason Ekstrand	9881eab197	i965/fs: Don't constant-fold RCP No shader-db changes on Broadwell Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 16:46:15 -07:00
Jason Ekstrand	01425c45b3	i965: Remove the RCP+RSQ algebraic optimizations NIR already has this optimization and it can do much better than the little peephole in the backend. No shader-db change on Haswell or Broadwell. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 16:46:15 -07:00
Jason Ekstrand	20417b2cb0	anv/device: Advertise version 1.0.5 Nothing substantial has changed since 1.0.2	2016-03-22 16:21:23 -07:00
Jason Ekstrand	204d937ac2	anv/device: Ignore the patch portion of the requested API version Fixes dEQP-VK.api.device_init.create_instance_name_version Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94661	2016-03-22 16:20:45 -07:00
Jason Ekstrand	4844723405	anv: Don't assert-fail if someone asks for a non-existent entrypoint	2016-03-22 16:11:53 -07:00
Jason Ekstrand	8dd86e8aa7	Update to the latest Vulkan header from Khronos	2016-03-22 16:06:53 -07:00
Ian Romanick	d7a25a9def	nir: Don't abs slt and friends No shader-db changes, but this is symmetric with the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	2bb006af68	nir: Don't abs the result of b2f or b2i In the results below, 2 SIMD16 shaders in Trine are lost. G4X total instructions in shared programs: 4012279 -> 4011108 (-0.03%) instructions in affected programs: 116776 -> 115605 (-1.00%) helped: 339 HURT: 0 total cycles in shared programs: 84315862 -> 84313584 (-0.00%) cycles in affected programs: 1767232 -> 1764954 (-0.13%) helped: 274 HURT: 81 Ironlake total instructions in shared programs: 6399073 -> 6396998 (-0.03%) instructions in affected programs: 218050 -> 215975 (-0.95%) helped: 600 HURT: 0 total cycles in shared programs: 128892088 -> 128888810 (-0.00%) cycles in affected programs: 2867452 -> 2864174 (-0.11%) helped: 422 HURT: 137 Sandy Bridge total instructions in shared programs: 8462174 -> 8460759 (-0.02%) instructions in affected programs: 178529 -> 177114 (-0.79%) helped: 596 HURT: 0 total cycles in shared programs: 117542276 -> 117534098 (-0.01%) cycles in affected programs: 1239166 -> 1230988 (-0.66%) helped: 369 HURT: 150 Ivy Bridge total instructions in shared programs: 7775131 -> 7773410 (-0.02%) instructions in affected programs: 162903 -> 161182 (-1.06%) helped: 590 HURT: 0 total cycles in shared programs: 65759882 -> 65747268 (-0.02%) cycles in affected programs: 1004354 -> 991740 (-1.26%) helped: 467 HURT: 141 Haswell total instructions in shared programs: 7107786 -> 7106327 (-0.02%) instructions in affected programs: 140954 -> 139495 (-1.04%) helped: 590 HURT: 0 total cycles in shared programs: 64668028 -> 64655322 (-0.02%) cycles in affected programs: 967080 -> 954374 (-1.31%) helped: 452 HURT: 149 LOST: 2 GAINED: 0 Broadwell total instructions in shared programs: 8980029 -> 8978287 (-0.02%) instructions in affected programs: 197232 -> 195490 (-0.88%) helped: 715 HURT: 0 total cycles in shared programs: 70070448 -> 70055970 (-0.02%) cycles in affected programs: 975724 -> 961246 (-1.48%) helped: 471 HURT: 111 LOST: 2 GAINED: 0 Skylake total instructions in shared programs: 9115178 -> 9113436 (-0.02%) instructions in affected programs: 203012 -> 201270 (-0.86%) helped: 715 HURT: 0 total cycles in shared programs: 68848660 -> 68834004 (-0.02%) cycles in affected programs: 993888 -> 979232 (-1.47%) helped: 473 HURT: 116 LOST: 2 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:48:02 -07:00
Ian Romanick	348e5a71d8	nir: Simplify 0 < fabs(a) Sandy Bridge / Ivy Bridge / Haswell total instructions in shared programs: 8462180 -> 8462174 (-0.00%) instructions in affected programs: 564 -> 558 (-1.06%) helped: 6 HURT: 0 total cycles in shared programs: 117542462 -> 117542276 (-0.00%) cycles in affected programs: 9768 -> 9582 (-1.90%) helped: 12 HURT: 0 Broadwell / Skylake total instructions in shared programs: 8980833 -> 8980826 (-0.00%) instructions in affected programs: 626 -> 619 (-1.12%) helped: 7 HURT: 0 total cycles in shared programs: 70077900 -> 70077714 (-0.00%) cycles in affected programs: 9378 -> 9192 (-1.98%) helped: 12 HURT: 0 G45 and Ironlake showed no change. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:47:56 -07:00
Ian Romanick	564a8b8a26	nir: Simplify 0 >= b2f(a) This also prevented some regressions with other patches in my local tree. Broadwell / Skylake total instructions in shared programs: 8980835 -> 8980833 (-0.00%) instructions in affected programs: 45 -> 43 (-4.44%) helped: 1 HURT: 0 total cycles in shared programs: 70077904 -> 70077900 (-0.00%) cycles in affected programs: 122 -> 118 (-3.28%) helped: 1 HURT: 0 No changes on earlier platforms. v2: Modify the comments to look more like a proof. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:44:57 -07:00
Ian Romanick	bf0d60aa11	nir: Simplify i2b with negated or abs operand This enables removing ssa_201 and ssa_202 in sequences like: vec1 ssa_200 = flt ssa_199, ssa_194 vec1 ssa_201 = b2i ssa_200 vec1 ssa_202 = i2b -ssa_201 shader-db results: Sandy Bridge total instructions in shared programs: 8462257 -> 8462180 (-0.00%) instructions in affected programs: 3846 -> 3769 (-2.00%) helped: 35 HURT: 0 total cycles in shared programs: 117542934 -> 117542462 (-0.00%) cycles in affected programs: 20072 -> 19600 (-2.35%) helped: 20 HURT: 1 Ivy Bridge total instructions in shared programs: 7775252 -> 7775137 (-0.00%) instructions in affected programs: 3645 -> 3530 (-3.16%) helped: 35 HURT: 0 total cycles in shared programs: 65760522 -> 65760068 (-0.00%) cycles in affected programs: 21082 -> 20628 (-2.15%) helped: 25 HURT: 2 Haswell total instructions in shared programs: 7108666 -> 7108589 (-0.00%) instructions in affected programs: 3253 -> 3176 (-2.37%) helped: 35 HURT: 0 total cycles in shared programs: 64675726 -> 64675272 (-0.00%) cycles in affected programs: 21034 -> 20580 (-2.16%) helped: 26 HURT: 1 Broadwell / Skylake total instructions in shared programs: 8980912 -> 8980835 (-0.00%) instructions in affected programs: 3223 -> 3146 (-2.39%) helped: 35 HURT: 0 total cycles in shared programs: 70077926 -> 70077904 (-0.00%) cycles in affected programs: 21886 -> 21864 (-0.10%) helped: 21 HURT: 6 G45 and Ironlake showed no change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:43:28 -07:00
Ian Romanick	a4079f1cb2	nir: Lower flrp with Boolean interpolator to bcsel On Intel platforms that don't set lower_flrp, using bcsel instead of flrp seems to be a small amount worse. On those platforms, the use of flrp, bcsel, and multiply of b2f is still an active area of research. In review, Matt suggested this is because bcsel turns into CMP+SEL, and because of the flag register we can't schedule instructions well. shader-db results: G4X / Ironlake total instructions in shared programs: 4016538 -> 4012279 (-0.11%) instructions in affected programs: 161556 -> 157297 (-2.64%) helped: 1077 HURT: 1 total cycles in shared programs: 84328296 -> 84315862 (-0.01%) cycles in affected programs: 4174570 -> 4162136 (-0.30%) helped: 926 HURT: 53 Unsurprisingly, no changes on later platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Ian Romanick	9442db4f89	i965: Have NIR lower flrp on pre-GEN6 vec4 backend Previously we were doing the lowering by hand in vec4_visitor::emit_lrp. By doing it in NIR, we have the opportunity for NIR to do additional optimization of the expanded code. This also enables optimizations added by the next commit. shader-db results: G4X / Ironlake total instructions in shared programs: 4024401 -> 4016538 (-0.20%) instructions in affected programs: 447686 -> 439823 (-1.76%) helped: 2623 HURT: 0 total cycles in shared programs: 84375846 -> 84328296 (-0.06%) cycles in affected programs: 16964960 -> 16917410 (-0.28%) helped: 2556 HURT: 41 Unsurprisingly, no changes on later platforms. v2: Formatting and comment changes suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-22 14:42:42 -07:00
Brian Paul	18c5fa1122	swrast: fix discarded const warning in s_texture.c Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-22 08:35:27 -06:00
Marc-André Lureau	530593da65	i965: fix invalid memory write I noticed some heap corruption running virgl tests, and valgrind helped me to track it down to the following error: ==29272== Invalid write of size 4 ==29272== at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307) ==29272== by 0x9029A7D: brw_DO (brw_eu_emit.c:1750) ==29272== by 0x90554B0: fs_generator::generate_code(cfg_t const, int) (brw_fs_generator.cpp:1999) ==29272== by 0x904491F: brw_compile_fs (brw_fs.cpp:5685) ==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137) ==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638) ==29272== by 0x8FA4040: brw_shader_precompile(gl_context, gl_shader_program) (brw_link.cpp:51) ==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260) ==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006) ==29272== by 0x8C84325: _mesa_link_program (shaderapi.c:1042) ==29272== by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515) ==29272== by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880) ==29272== Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd ==29272== at 0x4C2AA98: calloc (vg_replace_malloc.c:711) ==29272== by 0x8ED11F7: ralloc_size (ralloc.c:113) ==29272== by 0x8ED1282: rzalloc_size (ralloc.c:134) ==29272== by 0x8ED14C0: rzalloc_array_size (ralloc.c:196) ==29272== by 0x9019C7B: brw_init_codegen (brw_eu.c:291) ==29272== by 0x904F565: fs_generator::fs_generator(brw_compiler const, void, void, void const, brw_stage_prog_data, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124) ==29272== by 0x9044883: brw_compile_fs (brw_fs.cpp:5675) ==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137) ==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638) ==29272== by 0x8FA4040: brw_shader_precompile(gl_context, gl_shader_program) (brw_link.cpp:51) ==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260) ==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006) if_depth_in_loop is an array of size p->loop_stack_array_size, and push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1], thus the condition to grow the array should be p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently off by 2...) This can be reproduced by running the following test with virgl test server: LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner ./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-21 20:50:07 -07:00
Dave Airlie	53afbc980a	tgsi: drop unused set_exec/kill_mask interfaces. These don't get used and haven't been in git history from what I can see, so drop them. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 13:07:05 +10:00
Dave Airlie	1e8435ce0c	docs/relnotes: update ARB_internalformat_query2 status. Signed-off-by: Dave Airlie <Airlied@redhat.com>	2016-03-22 09:54:08 +10:00
Dave Airlie	ee7c8b9804	st/mesa: add support for internalformat query2. Add code to handle GL_INTERNALFORMAT_PREFERRED. Add code to deal with GL_RENDERBUFFER being passed into ChooseTextureFormat. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 09:49:08 +10:00
Jason Ekstrand	869e393eb3	anv/batch_chain: Fall back to growing batches when chaining isn't available	2016-03-21 15:29:30 -07:00
Anuj Phogat	4ba47f7b2a	i965: Fix assert conditions for src/dst x/y offsets Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-21 14:55:18 -07:00
Anuj Phogat	65cd2f8443	swrast: Move assert for 'slice' in to check_map_teximage Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-03-21 14:55:18 -07:00
xavier	fce0b55ccb	r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications. Previously it was doing this transformation for a Trine 3 shader: MUL R6.x.12, R13.x.23, 0.5|3f000000 - MULADD R4.x.12, -R6.x.12, 2|40000000, 1|3f800000 + MULADD R4.x.12, -R13.x.23, -1|bf800000, 1|3f800000 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94412 Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-22 07:43:13 +10:00
Samuel Pitoiset	9efd8b590f	nvc0: make sure to delete samplers used by compute shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-21 22:04:18 +01:00
Kenneth Graunke	4b0a5b21ae	i965/blorp: Make BlitFramebuffer() do sRGB encoding in ES 3.x. According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer is supposed to perform sRGB decoding and encoding whenever sRGB formats are in use. The ES 3.0 specification is completely clear, and has always stated this. However, the GL specification has changed behavior in 4.1, 4.2, and 4.4. The original behavior stated that no sRGB encoding should occur. The 4.4 behavior matches ES 3.0's wording. However, implementing the new behavior appears to break applications such as Left 4 Dead 2. This patch changes Meta to apply the ES 3.x rules in ES 3.x, but leaves OpenGL alone for now, to avoid breaking applications. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-21 13:55:32 -07:00
Kenneth Graunke	8679bb7c9e	i965/blorp: Refactor sRGB encoding/decoding. Because the rules for sRGB are so insane, we change brw_blorp_miptrees to take decode_srgb and encode_srgb flags, which control linearization of the source and destination separately. This should make it easy to implement whatever crazy combination of rules people throw at us. For now, it should be equivalent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-21 13:54:29 -07:00
Kenneth Graunke	eee8a53906	meta: Make BlitFramebuffer() do sRGB encoding in ES 3.x. According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer is supposed to perform sRGB decoding and encoding whenever sRGB formats are in use. The ES 3.0 specification is completely clear, and has always stated this. However, the GL specification has changed behavior in 4.1, 4.2, and 4.4. The original behavior stated that no sRGB encoding should occur. The 4.4 behavior matches ES 3.0's wording. However, implementing the new behavior appears to break applications such as Left 4 Dead 2. This patch changes Meta to apply the ES 3.x rules in ES 3.x, but leaves OpenGL alone for now, to avoid breaking applications. Meta implements several other functions in terms of BlitFramebuffer, and many of those explicitly do not perform sRGB encoding. So, this patch explicitly disables sRGB encoding in those other functions, preserving the existing (correct) behavior. If you're from the future and are reading this, hi! Welcome to the "fun" of debugging sRGB problems! Best of luck! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-21 13:53:44 -07:00
Nicolai Hähnle	b74784638d	docs: mark GL_ARB_shader_image_load_store/_size as done for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:26 -05:00
Edward O'Callaghan	5219eb15e1	radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES This enables ARB_shader_image_load_store and ARB_shader_image_size. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> [allow the same number of images for all shader stages and require LLVM 3.9] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:26 -05:00
Nicolai Hähnle	6f942ac5ee	radeonsi: disable early Z if the fragment shader writes to memory Empirically, both the EXEC_ON_* flags and LATE_Z are necessary. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	79762e877c	tgsi/scan: add writes_memory to flag presence of stores or atomics Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	e9d935ed0e	radeonsi: force the DCC enable bit off in image descriptors for writing (v2) This avoids a lockup at least on Tonga. v2: only force DCC off on VI+ (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	43f5ce1d20	radeonsi: implement MemoryBarrier (v2) v2: invalidate both constant and VMEM/TC L1 for constant buffers (Marek) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	97352aa50a	radeonsi: implement volatile memory access Prevent loads from being re-ordered or coalesced. Atomics don't need special handling by definition, and stores don't need special handling because LLVM is unable to detect dead image or buffer stores. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	5a61b428f4	radeonsi: implement coherent memory access (v2) v2: set glc=1 for volatile also on buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	d6fa650454	radeonsi: Lower TGSI_OPCODE_MEMBAR down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:25 -05:00
Nicolai Hähnle	f7a85a8a0a	radeonsi: Lower TGSI_OPCODE_ATOM* down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	bfcefcb3c7	radeonsi: Lower TGSI_OPCODE_STORE down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	1e82dedeca	radeonsi: Lower TGSI_OPCODE_LOAD down to LLVM op (v3) v2: new signature style for buffer intrinsics (offsets) v3: new signature style for llvm.amdgcn.buffer.load.format (overloaded return) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2016-03-21 15:34:24 -05:00
Nicolai Hähnle	136686a51d	radeonsi: extract the LLVM type name construction into its own function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	02bd0cd7b1	radeonsi: Lower TGSI_OPCODE_RESQ down to LLVM op Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	75539197c7	radeonsi: extract TXQ buffer size computation into its own function This will allow it to be reused for RESQ. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	515fb2c09c	radeonsi: decompress shader images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	f61566b77a	radeonsi: update shader image descriptor for invalidated buffer Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	e85cf35a65	radeonsi: implement set_shader_images (v2) Whether DCC is disabled depends on the access flags with which the image is bound: image_load supports DCC, but store and atomic don't. v2: remove an unnecessary masking of images->desc.enabled_mask Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:23 -05:00
Nicolai Hähnle	b1b7268f01	gallium/radeon: make r600_texture_disable_dcc externally accessible We will need it in radeonsi for shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	457f9c6b25	tgsi/scan: track which shader images are really buffers Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	fa096a14af	tgsi/scan: add images_writemask Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	1379544081	st/mesa: translate additional flags in MemoryBarrier Re-order flags in the order in which they appear in the OpenGL spec in the description of MemoryBarrier(). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Nicolai Hähnle	96cd908fd3	gallium: add additional PIPE_BARRIER_* bits Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 15:34:22 -05:00
Brian Paul	86caa67aef	svga: add svga_winsys_context::pipe_debug_callback pointer The svga winsys modules can use this to send debug messages to the state tracker and Mesa. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	f8aaf0094d	svga: Fix the index buffer rebind regression The index buffer handle saved in the hw_state structure could be invalid after the index buffer is destroyed. Instead of rebinding the index buffer using the saved index buffer handle, we will reset the index buffer handle in the hw_state structure to force resending of the index buffer. Fixes bug 1593320 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	47856e5945	svga: rebind stream output targets To ensure stream output target surfaces are available for the draw commands, we need to rebind the current stream output targets at the first draw in the command buffer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Charmaine Lee	47cfc83440	svga: rebind index buffer Similar to other resources, current index buffer needs to be rebound at the first draw of the current command buffer to make sure the buffer is available for the draw command. Fixes bug 1587263. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-21 13:37:40 -06:00
Brian Paul	299f8ca0a7	svga: minor formatting fix, comment addition To sync with our internal tree. Signed-off-by: Brian Paul <brianp@vmware.com>	2016-03-21 13:37:25 -06:00
Charmaine Lee	b45b47c5c9	svga: optimize constant buffer uploads When a constant buffer slot is allocated in the upload buffer, the allocated slot size is always a multiple of 256. But the actual buffer size might not be a multiple of 256. This causes a gap between the ending offset of a slot and the starting offset of the next slot. The gap will prevent the two slots from being updated in a single update command. In order to maximize the chance of merging the contiguous dirty ranges, when a slot is to be allocated in the constant upload buffer, specify a buffer size that is a multiple of 256. There is about 10% performance improvement with Lightsmark2008 and 30% with Cinebench R11. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Charmaine Lee	0a1d91ef97	svga: add a few more resource updates HUD query This patch adds the following HUD queries: .num-resource-updates -- number of resource update. Commands include UPDATE_SUBRESOURCE, UPDATE_GB_IMAGE. .num-buffer-uploads -- number of buffer uploads. .num-const-buf-updates -- number of set constant buffer. .num-const-updates -- number of set shader constant. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Charmaine Lee	79e343b36a	svga: add new num-readbacks HUD query To find out how many image readback commands are issued. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2016-03-21 12:58:25 -06:00
Brian Paul	dc9ecf58c0	svga: use shader sampler view declarations Previously, we looked at the bound textures (via the pipe_sampler_views) to determine texture dimensions (1D/2D/3D/etc) and datatype (float vs. int). But this could fail in out of memory conditions. If we failed to allocate a texture and didn't create a pipe_sampler_view, we'd default to using 0 (PIPE_BUFFER) as the texture type. This led to device errors because of inconsistent shader code. This change relies on all TGSI shaders having an SVIEW declaration for each SAMP declaration. The previous patch series does that. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	b56b853ab3	gallium/tests: declare sampler views in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	38e831ca3d	gallium/util: declare sampler view in util_make_fs_blit_msaa_depthstencil() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	e7b5a844e3	postprocess: declare sampler views in shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	5a9f2a2d89	hud: add sampler view declaration in text fragment shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	b3daaefadb	st/mesa: emit sampler view decls in drawpixels code v2: support both TGSI_TEXTURE_2D and _RECT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	0f0a23d4d8	st/mesa: emit sampler view declaration in bitmap shader In June 2015, Rob Clark started updating the tgsi utility code to emit SVIEW declarations in various shaders (for polygon stipple, blitting, etc). These patches do the same for the Mesa state tracker. The VMware driver will use this. v2: support both TGSI_TEXTURE_2D and _RECT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	72eb5a3cfe	st/mesa: emit sampler view declarations for ARB vert/frag programs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	eda81fa357	st/mesa: use correct TGSI texture target in drawpix fragment shader Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	83b5b3d66e	st/mesa: use correct TGSI texture target in bitmap fragment shader Depending on the driver's support for NPOT textures, we might use a RECT texture instead of 2D texture. We should propagate that info to the fragment shader's TEX instruction. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Brian Paul	63e020d734	gallium/tgsi: pass TGSI tex target to tgsi_transform_tex_inst() Instead of hard-coded 2D tex target in tgsi_transform_tex_2d_inst() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-21 11:59:25 -06:00
Nicolai Hähnle	a8b315b827	st/mesa: use the texture view's format for render-to-texture Aside from the bug below, it fixes a simplistic test I've written locally, and I see no regression in Piglit for radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94595 Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-21 11:28:38 -05:00
Hans de Goede	dcf8a4d281	gallium: Remove unused TGSI_RESOURCE_ defines These magic file-index defines were only ever used in the nouveau code and that no longer uses them. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2016-03-21 12:20:58 +01:00
Hans de Goede	9b4c8f6629	nouveau: codegen: Do not silently fail in handleLOAD / handleSTORE / handleATOM handleLOAD / handleSTORE / handleATOM can only handle TGSI_FILE_BUFFER and TGSI_FILE_MEMORY. Make things fail explicitly when another register-file is used in these functions. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:48 +01:00
Hans de Goede	86e4440361	nouveau: codegen: Disable more old resource handling code Commit `c3083c7082` ("nv50/ir: add support for BUFFER accesses") disabled / commented out some of the old resource handling code, but not all of it. Effectively all of it is dead already, if we ever enter the old code paths in handleLOAD / handleSTORE / handleATOM we will get an exception due to trying to access the now always zero-sized resources vector. Disable all the dead code. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:40 +01:00
Hans de Goede	71e315475c	nouveau: codegen: gk110: Make emitSTORE offset handling identical to emitLOAD Make the store offset handling in CodeEmitterGK110::emitSTORE identical to the one in CodeEmitterGK110::emitLOAD handling. This is just a cleanup, it does not cause any functional changes. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-21 12:20:38 +01:00
Hans de Goede	c783ad0e24	nouveau: codegen: Slightly refactor Source::scanInstruction() dst handling Use the dst temp variable which was used in the TGSI_FILE_OUTPUT case everywhere. This makes the code somewhat easier to read and helps avoid going over 80 chars with upcoming changes. This also brings the dst handling more in line with the src handling. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-21 12:20:32 +01:00
Hans de Goede	54cdde5eff	nouveau: codegen: Add support for clover / OpenCL kernel input parameters Add support for clover / OpenCL kernel input parameters. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:28 +01:00
Hans de Goede	3788e1bf74	tgsi: Add support for global / private / input MEMORY Extend the MEMORY file support to differentiate between global, private and shared memory, as well as "input" memory. "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a special memory type is added for this, since the actual storage of these (e.g. UBO-s) may differ per implementation. The uploading of kernel parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers to use an access mechanism for parameter reads which matches with the upload method. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:24 +01:00
Hans de Goede	43ddec2f43	tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text When support for decl.Atomic and .Shared was added, tgsi_build_declaration was not updated to propagate these properly. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)	2016-03-21 12:20:19 +01:00
Iago Toral Quiroga	8f45691cda	doc: document spilling options accepted by INTEL_DEBUG Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-21 08:16:49 +01:00
Hans de Goede	b72156c8e0	tgsi: Fix return of uninitialized memory in tgsi_*_instruction_memory tgsi_default_instruction_memory / tgsi_build_instruction_memory were returning uninitialized memory for tgsi_instruction_memory.Texture and tgsi_instruction_memory.Format. Note 0 means not set, and thus is a correct default initializer for these. Fixes: `3243b6fc97` ("tgsi: add Texture and Format to tgsi_instruction_memory") Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 18:01:53 -04:00
Ilia Mirkin	bbbdcdcf75	st/mesa: report correct precision information for low/medium/high ints When we have native integers, these have full precision. Whether they're low/medium/high isn't piped through the TGSI yet, but eventually those might have differing precisions. For now they're just 32-bit ints. Fixes the following dEQP tests: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int dEQP-GLES3.functional.state_query.shader.precision_fragment_highp_int which expected highp ints to have full 32-bit precision, not the default 23-bit float precision. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-20 17:51:08 -04:00
Nishanth Peethambaran	eeb117a09d	st/omx/dec: Correct the timestamping Attach the timestamp to the dpb buffer and use that timestamp while pushing buffer from dpb list to the omx client. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 15:01:28 -04:00
Nishanth Peethambaran	46de6bbb77	st/omx: Remove trailing spaces Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nishanth Peethambaran <nishanth.peethambaran@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 15:01:28 -04:00
Ilia Mirkin	7d98bfedd7	nv50/ir: fix indirect texturing for non-array textures on nvc0 If a layer parameter is provided, we want to flip it to position 0 (and combine it with any indirect params). However if the target is not an array, there is no layer, so we have to shift all of the arguments down by one to make room for it. This fixes situations where there were non-coordinate parameters, such as bias, lod, depth compare, explicit derivatives. Instead of adding a new parameter at the front for the indirect reference, we would swap one of those in its place. Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler.uniform.compute.*shadow Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Ilia Mirkin	adb40a7399	st/mesa: only minify depth for 3d targets We make sure that the image depth matches the level's depth before copying it into place. However we should only be minifying the first level's depth for 3d textures - array textures have the same depth for all levels. This fixes tests such as dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.* and I suspect accounts for a number of other odd situations I've run into where level > 0 of array textures was messed up. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Ilia Mirkin	6eeb284e4f	nv50/ir: normalize cube coordinates after derivatives have been computed In "manual" derivative mode (always used on nv50 and sometimes on nvc0 but always for cube), the idea is that using the quadop instruction, we set up the "other" quads to have values such that the derivatives work out, and then run the texture instruction as if nothing were strange. It pulls values from the other lanes, and does its magic. However cube coordinates have to be normalized - one of the 3 coords has to be 1, to determine which is the major axis, to say which face is being sampled. We were normalizing the coordinates first, and then adding the derivatives. This is wrong for two reasons: - the coordinates got normalized by a scaling factor but the derivatives didn't - the result of the addition didn't end up normalized To resolve this, we flip the logic around to normalize after the per-lane coordinates are set up. This fixes a bunch of textureGrad cube dEQP tests. NOTE: nv50 cube arrays with explicit derivatives are still broken, to be resolved at a later date. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-20 14:14:32 -04:00
Marek Olšák	ea2bff1d11	gallium/radeon: remove remnants of R600 TGSI->LLVM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:57:05 +01:00
Marek Olšák	4e5dc69af1	r600g: flatten if (1) statement after removal of TGSI->LLVM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:57:05 +01:00
Marek Olšák	20a09897a6	r600g: remove TGSI->LLVM translation It was useful for testing and as a prototype for radeonsi bringup, but it's not used anymore and doesn't support OpenGL 3.3 even. v2: try to fix OpenCL build Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2016-03-20 00:57:02 +01:00
Marek Olšák	8140154ae9	gallium/radeon: remove old CS tracing Cons: - it was only integrated in r600g - it doesn't work with GPUVM - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact - it lacks an IB parser and user-friendliness A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-20 00:56:35 +01:00
Marek Olšák	a73a657def	radeonsi: process TGSI property NEXT_SHADER This allows compiling the main shader part as ES or LS. If we get the correct hint, non-separable GLSL shaders no longer have to be compiled as VS first, followed by LS or ES compiled on demand. The result is that fewer shaders are compiled by piglit, but it doesn't improve piglit running time. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Marek Olšák	2bdd7a46a9	st/mesa: set TGSI property NEXT_SHADER Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Marek Olšák	fbe6e92899	gallium: add TGSI property NEXT_SHADER Radeonsi needs to know which shader stage will execute after a shader in order to make the best decision about which shader variant to compile first. This is only set for VS and TES, because we don't need it elsewhere. VS has 3 variants: - next shader is FS - next shader is GS - next shader is TCS TES has 2 variants: - next shader is FS - next shader is GS Currently, radeonsi always assumes the next shader is FS, which is suboptimal, since st/mesa always knows which shader is next if the GLSL program is not a "separate shader". By default, ureg always sets "next shader is FS". Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-19 23:20:01 +01:00
Pierre Moreau	9184d9a0bb	nvc0/ir: Use double constant in handleSQRT Fixes: `a100d89d09` (nv50,nvc0: Fix invalid constant.) Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 15:59:52 -04:00
Kenneth Graunke	789e096594	mesa: Disallow GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME on winsys FBO. Fixes: dEQP-GLES3.functional.negative_api.state.get_framebuffer_attachment_parameteriv Apparently, GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is not allowed when GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is GL_FRAMEBUFFER_DEFAULT, and is expected to result in a GL_INVALID_ENUM error. No GL specification actually defines what GL_FRAMEBUFFER_DEFAULT means. It probably means the window system FBO. It also doesn't mention the behavior of any queries for that type. Various ARB folks seem fairly confused about it too. For now, just do something vaguely like what dEQP expects. I think we probably need to check the visual bits against 0 for the attachment, but we haven't been doing that thus far, and given how confusingly this is specified, I can't imagine anyone relying on it. v2: Improve comments, move error condition above the _mesa_get_fb0_attachment call, add forgotten "return" (all suggested/caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-19 12:58:15 -07:00
Ilia Mirkin	d2445b0083	nv50/ir: force-enable derivatives on TXD ops This matters especially in vertex shaders, where derivatives are disabled by default. This fixes textureGrad in vertex shaders on nv50. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-19 13:09:49 -04:00
Ilia Mirkin	d1b85dbffa	nv50: reset TFB bufctx when we no longer hold a reference to the buffers This fix is analogous to commit `ff085d014`. This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-19 13:09:49 -04:00
Samuel Pitoiset	902bbda81b	nvc0: avoid using magic numbers for the uniform_bo offsets Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This seems to be related to textures. Anyway, naming this NVC0_CB_AUX_UNK_INFO and adding a todo should be enough for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 18:01:08 +01:00
Samuel Pitoiset	26cc411db8	nv50/ir: make use of auxCBSlot instead of magic numbers This avoids using magic numbers for the driver constbuf slot which is always 15 except for compute shaders on gk104+ where the slot 0 is used. For gk104+, some special compute-related values like the thread index are uploaded to screen->parm which is currently bound on c0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 18:01:04 +01:00
Samuel Pitoiset	d86933e6f4	nv50,nvc0: replace resInfoCBSlot by auxCBSlot Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 18:00:59 +01:00
Samuel Pitoiset	e05492fd7f	nv50/ir: fix compilation warning in handleSharedATOM() In release build mode only, op may be used uninitialized because the assertion has been removed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 17:01:17 +01:00
Vinson Lee	a100d89d09	nv50,nvc0: Fix invalid constant. Fix clang build error. CXX codegen/nv50_ir_lowering_nvc0.lo codegen/nv50_ir_lowering_nvc0.cpp:1783:42: error: invalid suffix 'd' on floating constant Value *zero = bld.loadImm(NULL, 0.0d); ^ Fixes: `c1e4a6bfbf` ("nv50,nvc0: handle SQRT lowering inside the driver") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-18 20:38:41 -07:00
Kenneth Graunke	46610238e0	mesa: Do proper format error checks for GenerateMipmap in ES 3.x. According to the OpenGL ES 3.2 spec's description of GenerateMipmap: "An INVALID_OPERATION error is generated if the levelbase array was not specified with an unsized internal format from table 8.3 or a sized internal format that is both color-renderable and texture-filterable according to table 8.10." Similar text exists in the ES 3.0 specification as well. Our existing rules are pretty close, but miss a few things. The OpenGL specification actually doesn't have any text about internal format checking - our existing code comes from a Khronos bug report. The ES 3.x spec provides a clearer description. Fixes dEQP-GLES3.functional.negative_api.texture.generatemipmap and dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level_array_compressed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:43:47 -07:00
Kenneth Graunke	f1b0573510	mesa: Add color renderable/texture filterable format info for ES 3.x. OpenGL ES 3.x contains a table of sized internal formats and their required properties. In particular, each format is marked as "Color Renderable" or "Texture Filterable". This patch introduces two functions that can be used to query the information from that table. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:43:23 -07:00
Kenneth Graunke	88d28aa4d9	i965: Stop XY clipping point and line primitives. Wide points and lines are not supposed to be clipped by the viewport. Rather, they should be rendered, and any fragments outside of the viewport should be discarded. The traditional use case for this behavior is rendering moving wide point particles. When the center of the point approaches the viewport edge, clipping would make it pop out of view early. Fixes: - dEQP-GLES2.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:51 -07:00
Kenneth Graunke	0de64ab788	i965: Scissor to the viewport when rendering points/lines. We're about to start allowing wide points/lines whose vertices are outside the viewport past the clipper. This scissoring hack ensures that any fragments generated are still restricted to the viewport. It is not necessary on Gen8+ as those platforms already discard fragments which are outside the viewport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:30 -07:00
Kenneth Graunke	d000a4989f	i965: Include the viewport in the scissor rectangle. We'll need to use scissoring to restrict fragments to the viewport soon. It seems harmless to include it generally, so let's do that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:42:15 -07:00
Kenneth Graunke	47be5a64c7	i965: Introduce an is_drawing_lines() helper. Similar to is_drawing_points(). v2: Account for isoline tessellation output topology. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:41:59 -07:00
Kenneth Graunke	757674e8d0	i965: Move is_drawing_points to brw_state.h. I need to use this in multiple source files. v2: Rebase on TES output domain fix. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 18:41:25 -07:00
Jason Ekstrand	ecfb074276	anv/allocator: Make the bo_pool dynamically sized	2016-03-18 17:25:58 -07:00
Kenneth Graunke	5b2d8c2273	i965: Fix gl_TessLevelOuter[] for isolines. Thanks to James Legg for finding this! From the ARB_tessellation_shader spec: "The number of isolines generated is derived from the first outer tessellation level; the number of segments in each isoline is derived from the second outer tessellation level." According to the PRM, "TF.LineDensity determines # lines" while "TF.LineDetail determines # segments". Line Density is stored at DWord 6, while Line Detail is at DWord 7. So, they're not reversed like they are for triangles and quads. Fixes Piglit's spec/arb_tessellation_shader/execution/isoline, and about 24 dEQP isoline tests (with GL_EXT_tessellation_shader hacked on - it's not normally enabled). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94524 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-18 16:45:23 -07:00
Kenneth Graunke	24298b7e2f	i965: Decode non-normalized coordinates bit in SAMPLER_STATE. We weren't printing this for some reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-18 16:44:51 -07:00
Kenneth Graunke	8679d40dc7	i965: Account for TES in is_drawing_points(). Now that we implement tessellation shaders, the TES might be the last stage enabled. If it's outputting points, then the primitive type reaching the SF is points. We need to account for this. Caught by Ilia Mirkin. v2: Update dirty bit comment above caller (caught by Iago) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-18 16:44:15 -07:00
Pierre Moreau	1282146d4e	nv50: Mark compute states as dirty on context switch Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> [ Samuel Pitoiset: Trivial rebase conflict ] Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-19 00:18:00 +01:00
Samuel Pitoiset	a734c0f8ba	nv50/ir: print SUBFM subops Only 3d subop is currently emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-19 00:09:18 +01:00
Samuel Pitoiset	af0c97fb90	nv50: add a new validation path for compute This makes use of the new state validation interface to be consistent with 3d. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:14 +01:00
Samuel Pitoiset	5ed387675d	nv50: rework nv50_compute_validate_program() Reduce the amount of duplicated code by re-using nv50_program_validate(). While we are at it, change the prototype to return void. We don't check anymore if the translation fails but improving the state validation is a long process. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:09 +01:00
Samuel Pitoiset	a07ebc1993	nv50: rework the validation path for 3D This exposes an interface for state validation that will be also used to rework the compute validation path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:05 +01:00
Samuel Pitoiset	517d2c97e1	nv50: rename 3d binding points to NV50_BIND_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:09:02 +01:00
Samuel Pitoiset	9374fc1e67	nv50: rename 3d dirty flags to NV50_NEW_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:56 +01:00
Samuel Pitoiset	e844aac40b	nv50: rename NV50_COMPUTE to NV50_CP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:52 +01:00
Samuel Pitoiset	dedb46f582	nv50: rename nv50_context::dirty to nv50_context::dirty_3d Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Tested-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-19 00:08:28 +01:00
Jason Ekstrand	b1c5d45872	anv/allocator: Add a size field to bo_pool_alloc	2016-03-18 11:50:53 -07:00
Brian Paul	9211b68ad3	st/mesa: clean up st_translate_texture_target() Reformat code. Improve assertion. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:31 -06:00
Brian Paul	0f73c3ab25	st/mesa: simplify drawpixels shader code with tgsi transform helper functions Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Brian Paul	373910f4e7	st/mesa: simplify bitmap shader code with tgsi transform helper functions Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Brian Paul	e9d5e68d1b	tgsi: add tgsi_transform_op3_inst() function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-18 12:06:30 -06:00
Juan A. Suarez Romero	7a712e64d6	doc: add 'vec4' option in INTEL_DEBUG Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-18 17:30:56 +01:00
Daniel Czarnowski	d4714512e4	egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...) Patch provides a default for a set pbuffer surface size when EGL_LARGEST_PBUFFER is used by the client. MIN2 macro is moved to egldefines so that it can be shared. Fixes the following Piglit test: egl-create-largest-pbuffer-surface From EGL 1.5 spec: "Use EGL_LARGEST_PBUFFER to get the largest available pbuffer when the allocation of the pbuffer would otherwise fail." Currently there exists no API to query the largest available pixmap size using xlib or xcb, so right now this seems the most straightforward way to ensure that we fulfill the above API and also don't attempt to allocate a 'too big' pixmap which might succeed on the server side but not work in practice when the driver starts to use it as a texture. v2: add more explanation about the change (Emil) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-18 07:35:32 +02:00
George Kyriazis	dd63fa28f1	gallium/swr: Cleaned up some context-resource management Removed bound_to_context. We now pick up the context from the screen instead of the resource itself. The resource could be out-of-date and point to a pipe that is already freed. Fixes manywin mesa xdemo. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-03-17 20:57:52 -05:00
Timothy Arceri	952c166170	mesa: remove remaining tabs in prog_parameter.c Acked-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:53 +11:00
Timothy Arceri	ce9c042ab3	mesa: inline _mesa_add_unnamed_constant() Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:43 +11:00
Timothy Arceri	fa9bd6b663	mesa: simplify and inline _mesa_lookup_parameter_index() The function has only one user and strings are always null terminated. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:39 +11:00
Timothy Arceri	350b1ef027	mesa: make _mesa_lookup_parameter_constant static This is not used outside of prog_parameter.c Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:34 +11:00
Timothy Arceri	7794b22a84	mesa: remove unused function Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-18 12:42:30 +11:00
Nicolai Hähnle	a8eea696b8	st/mesa: honour sized internal formats in st_choose_format (v2) The bitcasting which is possible with shader images (and texture views?) requires that when the user specifies a sized internal format for a texture, we really allocate that format. To this end: (1) find_exact_format should ignore sized internal formats and (2) some of the entries in the mapping table corresponding to sized internal formats are reordered to use an RGBA format instead of a BGRA one. This fixes arb_shader_image_load_store-bitcast in the (work in progress) ARB_shader_image_load_store implementation for radeonsi. v2: don't change the mapping of GL_RGB10: the change caused a regression because it preferred a format with an alpha channel, and GL_RGB10 is not among the supported formats for shader images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 19:26:40 -05:00
Dongwon Kim	49eb5e75bd	configure.ac: enable_asm=yes when x-compiling across same X86 arch Currently, configure script is forcing 'enable_asm' to be 'no' whenever cross-compilation is performed on X86 host. This is based on an assumption that target architecture is different from host's (i.e. ARM). But there's always a case that we do cross-compilation for target that is also X86 based just like host in which same ASM codes will be supported. 'enable_asm' should not be forced to be "no" anymore in this case. v2: corrected commit message Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>	2016-03-17 16:53:23 -07:00
Timothy Arceri	d6b9202873	glsl: disable varying packing when it's not safe In GL 4.4+ there is no guarantee that interpolation qualifiers will match between stages so we cannot safely pack varyings using the current packing pass in Mesa. We also disable packing on outward facing interfaces for SSO because in ES we need to retain the unpacked varying information for draw time validation. For desktop GL we could allow packing for SSO in versions < 4.4 but it's just safer not to do so. We do however enable packing on individual arrays, structs, and matrices as these are required by the transform feedback code and it is still safe to do so. Finally we also enable packing when a varying is only used for transform feedback and it's not a SSO. This fixes all remaining rendering issues with the dEQP SSO tests, the only issues remaining with those tests are to do with validation. Note: There is still one remaining SSO bug that this patch doesn't fix. There is a chance that VS -> TCS will have mismatching interfaces because we pack VS output in case it's used by transform feedback but don't pack TCS input for performance reasons. This patch will make the situation better but doesn't fix it. V4: fix out of order function params after rebase, make sure packing still disabled in tess stages. Update comments as to why we disable packing on SSO. V3: ES 3.1 does require interpolation to match so don't disable packing there. Rebased on master rather than on enhanced layouts component packing series. V2: Make is_varying_packing_safe() a function in the varying_matches class, fix spelling (Matt) and make sure to remove the outer array when dealing with Geom and Tess shaders where appropriate. Lastly fix piglit regression in new piglit test and document the undefined behaviour it depends on: arb_separate_shader_objects/execution/vs-gs-linking.shader_test Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-18 10:26:34 +11:00
Timothy Arceri	c0ae6eeb3b	glsl: pass disable_varying_packing bool to the lowering pass This will allow us to choose to ignore the disable which will be useful for more fine grained control over when to enable or disable packing. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-18 10:26:30 +11:00
Marek Olšák	4ab2ac3349	radeonsi: fix Hyper-Z hangs on P2 configs Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 18:30:45 +01:00
Romain Failliot	151724159d	docs: Renormalize older extensions. For older extensions, there is an explanation first and the extension name in brackets, like that: Clamping controls (GL_ARB_color_buffer_float) I inverted that so we have the extension first and then the explanation in brackets, like that: GL_ARB_color_buffer_float (Clamping controls) It will help me later to parse the few extensions that use this syntax: all drivers that support <GL_extension> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:20 -05:00
Romain Failliot	f5d47dd428	docs: Renormalize some extensions. This fixes some exceptions I have to deal with in mesamatrix.net. The extensions GL_ARB_texture_buffer_object had a comment between "DONE" and the brackets. And the extension GL_KHR_robustness (in GL 4.5 and GLES 3.1) was using "90% done" instead of "in progress". The "90% done" is still here though, but as an extension comment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:12 -05:00
Romain Failliot	3671bb3eaf	docs: Realign the "Status" column. The "Status" column was misaligned in some GL sections. This is a lot of diffs, but it's only spaces in the end. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:35:09 -05:00
Romain Failliot	e571f11de8	docs: howto to read and edit GL3.txt Added a small guide on how to read and edit GL3.txt. I think this would help as much the devs as the users reading this file. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-17 11:34:50 -05:00
Brian Paul	84b961dd53	r300g: add missing layer argument to rws->buffer_get_handle() call Fixes compilation error since `5aea0d691`. Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-17 09:52:21 -06:00
Christian König	5aea0d6919	radeon/winsys: add layer support for BO export Add layer support to export individual array layers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:17:06 +01:00
Christian König	04bc082f6a	radeon/winsys: add offset support for BO import/export Add offset support to handle NV12 offsets as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:17:03 +01:00
Christian König	f1e78a48f2	gallium/winsys/drm: add layer to struct winsys_handle For exporting a specific layer of an array texture. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:16:59 +01:00
Christian König	29d26f1522	gallium/winsys/drm: add offset to struct winsys_handle We are going to need this for EGL_EXT_image_dma_buf_import. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-17 14:16:03 +01:00
Connor Abbott	58fe7837b8	nir: propagate bitsize information in nir_search When we replace an expression we have to compute bitsize information for the replacement. We do this in two passes to validate that bitsize information is consistent and correct: first we propagate bitsize from child nodes to parent, then we do it the other way around, starting from the original instruction's destination bitsize. v2 (Iago): - Always use nir_type_bool32 instead of nir_type_bool when generating algebraic optimizations. Before we used nir_type_bool32 with constants and nir_type_bool with variables. - Fix bool comparisons in nir_search.c to account for bitsized types. v3 (Sam): - Unpack the double constant value as unsigned long long (8 bytes) in nir_algebraic.py. v4 (Sam): - Use helpers to get type size and base type from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Connor Abbott	3124ce699b	nir: add a bit_size parameter to nir_ssa_dest_init v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago) v3: Fix tgsi (Samuel) v4: Fix nir_clone.c (Samuel) v5: Fix vc4 and freedreno (Iago) v6 (Sam) - Fix build errors in nir_lower_indirect_derefs - Use helper to get type size from nir_alu_type. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:54:45 +01:00
Iago Toral Quiroga	084b24f558	nir: rename nir_const_value fields to include bitsize information Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	9076c4e289	nir: update opcode definitions for different bit sizes Some opcodes need explicit bitsizes, and sometimes we need to use the double version when constant folding. v2: fix output type for u2f (Iago) v3: do not change vecN opcodes to be float. The next commit will add infrastructure to enable 64-bit integer constant folding so this isn't really necessary. Also, that created problems with source modifiers in some cases (Iago) v4 (Jason): - do not change bcsel to work in terms of floats - leave ldexp generic Squashed changes to handle different bit sizes when constant folding since otherwise we would break the build. v2: - Use the bit-size information from the opcode information if defined (Iago) - Use helpers to get type size and base type of nir_alu_type enum (Sam) - Do not fall back to sized types to guess bit-size information. (Jason) Squashed changes in i965 and gallium/nir drivers to support sized types. These functions should only see sized types, but we can't make that change until we make sure that nir uses the sized versions in all the relevant places. A later commit will address this. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	6700d7e423	nir: add nir_{src,dest}_bit_size() helpers v2: use a ternary (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	e172dbe5d2	nir: Add a bit_size to nir_register and nir_ssa_def This really hacky commit adds a bit size to registers and SSA values. It also adds rules in the validator to validate that they do the right things. It's still an open question as to whether or not we want a bit_size in nir_alu_instr or if we just want to let it inherit from the destination. I'm inclined to just let it inherit from the destination. A similar question needs to be asked about intrinsics. v2 (Connor): - Relax validation: comparisons have explicit destination sizes and implicit source sizes. v3 (Sam): - Use helpers to get size and base types of nir_alu_type enum. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Connor Abbott	3d37de930d	nir/types: add a function to get the bitsize of a base type v2: fix it for GLSL_TYPE_SUBROUTINE (Iago) Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Samuel Iglesias Gonsálvez	c38a25af2f	i965/nir: fix check to resolve booleans to work with sized nir_alu_type As nir_alu_type has now embedded the data size, the check for the instruction's output type (to see if a boolean resolve is required) should ignore the data size part. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jason Ekstrand	78f1919429	nir: Add explicitly sized types v2: Fix size/type mask to properly handle 8-bit types. v3: Add helpers to get the bitsize and base type of a nir_alu_type enum. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-17 11:16:33 +01:00
Jordan Justen	3fd308a357	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-17 01:44:07 -07:00
Jordan Justen	7d021cb15e	i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	b1e7cdfdcf	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	e3cbb9d37c	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	683c359c54	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	3c807607df	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Jordan Justen	26f8262698	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-17 01:23:40 -07:00
Iago Toral Quiroga	5be11d2236	i965: Skip execution size adjustment for instructions of width 4 This code in brw_set_dest adjusts the execution size of any instruction with a dst.width < 8. However, we don't want to do this with instructions operating on doubles, since these will have a width of 4, but still need an execution size of 8 (for SIMD8). Unfortunately, we can't just check the size of the operands involved to detect if we are doing an operation on doubles, because we can have instructions that do operations on double operands interpreted as UD, operating on any of its 2 32-bit components. Previous commits have made it so we never emit instructions with a horizontal width of 4 that don't have the correct execution size set for gen6+, so we can skip it in this case, avoiding the conflicts with fp64 requirements. Expanding the same fix to other hardware generations requires many more changes but since we are not targeting fp64 support on them we don't really care for now. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	22a10dd030	i965/vec4/gen6: fix exec_size for MOV with a width of 4 in generate_gs_ff_sync() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	b91b9e4b00	i965/vec4/gen6: fix exec_size for instructions with destination width of 4 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	30fc3fa24d	i965/vec4/gen6: fix exec_size for instructions with width of 4 in generate_gs_svb_write() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Samuel Iglesias Gonsalvez	2fafc6b98c	i965/gs/gen6: fix execsize for instructions with width of 4 in gen6_sol_program() v2: - Add assert (Topi). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	f6342b5645	i965: set correct execsize for MOVS with a width of 4 in brw_find_live_channel Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	31a8604252	i965/eu: set execution size for SEND message in brw_send_indirect_message Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:25 +01:00
Iago Toral Quiroga	2d6af62a0f	i965/fs: Set exec size for gen7 pull const loads v2 (Topi): - No need to set the execsize for the indirect send message, the next patch will handle that. - Set the execution size explicitly instead of taking it from the width of the dst that we set before. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:24 +01:00
Iago Toral Quiroga	ea45b6e96d	i965/eu: set correct execution size in brw_NOP v2: NOP should have an execsize of 1 (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-17 08:23:24 +01:00
Kenneth Graunke	9c1e01c4a8	meta: Don't use integer handles for shaders or programs. Previously, we gave our internal clear/blit shaders actual GL handles and stored them in the shader/program hash table. We used ordinary GL API entrypoints to work with them. We thought this shouldn't be a problem because GL doesn't allow applications to invent their own names for shaders or programs. GL allocates all names via glCreateShader and glCreateProgram. However, having them in the hash table is a bit risky: if a broken application guesses the name of our shaders or programs, it could alter them, potentially screwing up future meta operations. Also, test cases can observe the programs in the hash table. Running a single dEQP process that executes the following test list: dEQP-GLES3.functional.negative_api.buffer.clear dEQP-GLES3.functional.negative_api.shader.compile_shader dEQP-GLES3.functional.negative_api.shader.delete_shader would result in the last two tests breaking. The compile_shader test calls glCompileShader(9) straight away, and since it hasn't even created any shaders or programs, it expects to get a GL_INVALID_VALUE error because there's no such name. However, because the clear test ran first, it created Meta programs, so an object named "9" did exist. This patch reworks Meta to work with gl_shader and gl_shader_program pointers directly. These internal programs have bogus names, and are never stored in the hash tables, so they're invisible to applications. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94485 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	0fe254168b	mesa: Expose compile_shader() and link_program() beyond the file. This will allow me to use them directly from Meta, bypassing the versions that work with GL integer handles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	7753657cf2	mesa: Make link_program() take a gl_shader_program, not a GLuint. In half the callers, we already have a pointer, and don't need to look it up again. This will also help with upcoming meta work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	a461e0003f	mesa: Make compile_shader() take a gl_shader, not a GLuint. In half the callers, we already have a pointer, and don't need to look it up again. This will also help with upcoming meta work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-16 23:57:11 -07:00
Kenneth Graunke	a7e9b31d5b	meta: Use the _mesa_meta_compile_and_link_program helper more places. Less boilerplate. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-16 23:57:11 -07:00
Eric Anholt	2b9f0dffe0	vc4: Move discard handling to the condition flag. Now that the field exists in the instruction, we can make discards less special. As a bonus, that means that we should be able to merge some more .sf instructions together when we get around to that. This causes some scheduling changes, as it allows tlb_color_reads to be delayed past the discard condition setup. Since the tlb_color_read ends up later, this may mean performance improvements, but I haven't tested. total instructions in shared programs: 78114 -> 78035 (-0.10%) instructions in affected programs: 1922 -> 1843 (-4.11%) total estimated cycles in shared programs: 234318 -> 234329 (0.00%) estimated cycles in affected programs: 8200 -> 8211 (0.13%)	2016-03-16 11:28:47 -07:00
Eric Anholt	7c9fc43915	vc4: Don't make a temporary for setting flags. The register allocator doesn't really do anything about the temp, so it doesn't seem like it should matter. However, the scheduler would think that a new def is being created. This doesn't change anything yet, but it avoids a bunch of regressions in the next commit.	2016-03-16 11:28:34 -07:00
Eric Anholt	b4f45f319c	vc4: Add a safety check for setting flags. If a pack was on the src reg, should it be a float, int, or mul unpack? Just complain, instead.	2016-03-16 11:28:34 -07:00
Eric Anholt	a298fb15af	vc4: Reuse list_for_each_entry_safe_rev(). This didn't exist when I wrote the code.	2016-03-16 11:28:34 -07:00
Nanley Chery	5464f0c046	anv/blit: Reduce number of VUE headers being read Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:23 -07:00
Nanley Chery	f33866ae0a	anv/blit: Remove completed finishme for VkFilter This task was finished as of: `d9079648d0`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:19 -07:00
Nanley Chery	5647de8ba5	anv/blit2d: Only use one extent in meta_emit_blit2d Since scaling isn't involved, we don't need multiple extents. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:14 -07:00
Nanley Chery	92fb65f117	anv/blit2d: Remove sampler from pipeline Since we're using texelFetch with a sampled image, a sampler is no longer needed. This agrees with the Vulkan Spec section 13.2.4 Descriptor Set Updates: sampler is a sampler handle, and is used in descriptor updates for types VK_DESCRIPTOR_TYPE_SAMPLER and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER if the binding being updated does not use immutable samplers. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:57:00 -07:00
Nanley Chery	f8f9886915	anv/blit2d: Use texel fetch in frag shader The texelFetch operation requires that the sampled texture coordinates be unnormalized integers. This will simplify the copy shader for w-tiled images (stencil buffers). v2 (Jason): Use f2i for texel coords Fix num_components indirectly Use float inputs for interpolation Nest tex_pos functions Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:51 -07:00
Nanley Chery	b487acc489	Revert "anv/meta: Make meta_emit_blit() public" This reverts commit `f391683922`. Some conflicts had to be resolved in order for this revert to be successful. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:46 -07:00
Nanley Chery	1a0c63b880	Revert "anv/meta: Prefix anv_ to meta_emit_blit()" This reverts commit `514c055717`. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:41 -07:00
Nanley Chery	997a873f0c	anv/blit2d: Customize meta blit structs and functions for blit2d API * Add fields in meta struct * Add support in meta init/teardown * Switch to custom meta_emit_blit2d() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:22 -07:00
Nanley Chery	2d8c632117	anv/blit2d: Copy anv_meta_blit.c functions These will be customized for blit2d operations. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-16 10:56:10 -07:00
Kenneth Graunke	b566317e7e	meta: Use ARB_explicit_attrib_location in the rest of the meta shaders. This is cleaner than using glBindAttribLocation(). Not all drivers support the extension, but I don't think those drivers use GLSL in the first place. Apparently some Meta shaders already use GL_ARB_explicit_attrib_location, so I think it should be okay. Honestly, I'm not sure how the old code worked anyway - we bound the attribute location for "texcoords", while all the shaders capitalized or spelled it differently. v2: Convert another instance in brw_meta_fast_clear.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-16 00:09:56 -07:00
Plamena Manolova	9d9965c06f	mesa: Ignore glPointSize when GL_POINT_SIZE_ARRAY_OES is enabled When a user defines a point size array and enables it, the point size value set via glPointSize should be ignored. To achieve this, we can simply toggle ctx->VertexProgram.PointSizeEnabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42187 Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-15 15:49:48 -07:00
Jason Ekstrand	abaa3bed22	anv/device: Flush the fence batch rather than the start of the BO	2016-03-15 15:24:24 -07:00
Jason Ekstrand	7f6a0cb29c	Merge remote-tracking branch 'public/master' into vulkan	2016-03-15 14:09:50 -07:00
Varad Gautam	e103b52aec	vc4: Coalesce instructions using VPM reads into the VPM read. This is done instead of copy propagating the VPM reads into the instructions using them, because VPM reads have to stay in order. shader-db results: total instructions in shared programs: 78509 -> 78114 (-0.50%) instructions in affected programs: 5203 -> 4808 (-7.59%) total estimated cycles in shared programs: 234670 -> 234318 (-0.15%) estimated cycles in affected programs: 5345 -> 4993 (-6.59%) Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Rhys Kidd <rhyskidd@gmail.com>	2016-03-15 13:09:24 -07:00
Varad Gautam	00bdbb22a9	vc4: rename file to group vpm optimizations together This file will contain optimization passes for both vpm reads and writes. Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-03-15 12:49:37 -07:00
Eric Anholt	1c4b077409	vc4: Fix failures with nir_extract_* since the addition of the opcodes.	2016-03-15 12:49:37 -07:00
Roland Scheidegger	bb2c5e657b	llvmpipe: fix lp_rast_plane alignment on 32bit Some rasterization code relies (for sse) on the first and third planes (but not the second for now) being 128bit aligned, and we didn't get that on 32bit - I mistakenly thought the 64bit number in the struct would get the thing aligned to 64bit even on 32bit archs. Stephane Marchesin really figured this out. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 19:42:15 +01:00
Roland Scheidegger	12a4f0bed6	draw: fix line stippling The logic was comparing actual ints, not true/false values. This meant that it was emitting always multiple line segments instead of just one even if the stipple test had the same result, which looks inefficient, and the segments also overlapped thus breaking line aa as well. (In practice, with the no-op default line stipple pattern, for a 10-pixel long line from 0-9 it was emitting 10 segments, with the individual segments ranging from 0-1, 0-2, 0-3 and so on.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94193 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 19:41:34 +01:00
Roland Scheidegger	4b249ed4cd	softpipe: fix misleading TGSI_QUAD_SIZE usage All these img filter loops iterate through NUM_CHANNELS, not QUAD_SIZE. In practice both are of course the same unchangeable value (4), but it makes the code look a bit confusing. Moreover, some of the functions were actually given an array of 4 values according to the declaration, yet the code was addressing values 0/4/8/12 out of it, so fix this by just saying it's a pointer to floats like the other functions. While here, also add comment about not quite correct filtering. There's no actual code difference. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-15 19:37:59 +01:00
Roland Scheidegger	9e9d69979c	softpipe: fix anisotropic filtering crash The filt_args->offset wasn't assigned but was always used later leading to a crash (as far as I can tell, texel offsets don't actually make much sense with anisotropic filtering, but because there's no explicit setting if offsets are enabled there the array is always accessed). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> CC: <mesa-stable@lists.freedesktop.org>	2016-03-15 16:40:05 +01:00
Nicolai Hähnle	4de25fa7b0	radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:59 -05:00
Nicolai Hähnle	0ffcc318e6	tgsi: add tgsi_full_src_register_from_dst helper function Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:49 -05:00
Nicolai Hähnle	c02d73af0b	gallium/u_inlines: add util_copy_image_view Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:46 -05:00
Nicolai Hähnle	f6dc4f5558	st/mesa: set image access flags in st_bind_images Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:43 -05:00
Nicolai Hähnle	71a1b54b33	gallium: add access field to pipe_image_view This allows drivers to make smarter decisions e.g. about whether the image has to be decompressed. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:40 -05:00
Nicolai Hähnle	8c497b8fb5	st/glsl_to_tgsi: set FS_EARLY_DEPTH_STENCIL when required Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:37 -05:00
Nicolai Hähnle	e526f930aa	tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:33 -05:00
Nicolai Hähnle	1c0cee8764	st/glsl_to_tgsi: set memory access type on image intrinsics This is required to preserve the image variable's coherent/restrict/volatile qualifiers in TGSI. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:30 -05:00
Nicolai Hähnle	dfcf420412	st/glsl_to_tgsi: provide Texture and Format information for image ops Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:26 -05:00
Nicolai Hähnle	3243b6fc97	tgsi: add Texture and Format to tgsi_instruction_memory Frontends should have this information readily available, and it simplifies image LOAD/STORE/ATOM* handling especially with indirect image access. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-14 17:24:02 -05:00
Nicolai Hähnle	9b68bdf6f8	get: reconcile aliasing enums for MaxCombinedShaderOutputResources The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only appear once. Noticed while implementing ARB_shader_image_load_store without previously implementing SSBO. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-14 17:19:14 -05:00
Francisco Jerez	b054605722	i965/fs: Restrict inequality that can only hold equal in saturate propagation. Should have no functional change. The IP value of an instruction that reads src_var cannot possibly be after the end of the live interval of the variable it's reading from, by the definition of live interval. Might save future readers a momentary WTF while trying to understand this code. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:58:19 -07:00
Francisco Jerez	7d7990cf65	i965/vec4: Consider removal of no-op MOVs as progress during register coalesce. Bug found by the liveness analysis validation pass that will be introduced in a later commit. The no-op MOV check in opt_register_coalesce() was removing instructions which makes the cached liveness analysis calculation inconsistent with the shader IR. We were failing to set progress to true in that case though, which means that invalidate_live_intervals() wouldn't necessarily be called at the end of the function. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:58:11 -07:00
Francisco Jerez	93be4158ae	i965/fs: Add missing analysis invalidation in fixup_3src_null_dest(). Bug found by the liveness analysis validation pass that will be introduced in a later commit. fixup_3src_null_dest() was allocating registers which makes the cached liveness analysis calculation incomplete, so it must be invalidated. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:57:58 -07:00
Francisco Jerez	6691c03fd3	i965/fs: Add missing analysis invalidation in opt_sampler_eot(). Bug found by the liveness analysis validation pass that will be introduced in a later commit. opt_sampler_eot() was allocating registers and inserting and removing instructions, which makes the cached liveness analysis calculation inconsistent with the shader IR, so it must be invalidated. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-14 14:56:02 -07:00
Hans de Goede	4d02e91e49	clover: Fix pipe_grid_info.indirect not being initialized. After pipe_grid_info.indirect was introduced, clover was not modified to set it causing it to pass uninitialized memory for it to launch_grid. This commit fixes this by zero-ing the entire pipe_grid_info struct when declaring it, to avoid similar problems popping-up in the future. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [ Francisco Jerez: Trivial codestyle fix. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-14 14:12:42 -07:00
Sarah Sharp	af06190760	mesa: docs: Intel i965 hardware limits. This should help the next person working on hardware enabling figure out where in the Intel PRMs to find the magic platform hardware values. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>	2016-03-14 14:00:29 -07:00
Sarah Sharp	0f5bfc7f01	mesa: docs: i965: Use correct doxygen groupings syntax When reading the source code, it's useful to indicate that a group of fields in a struct are related in some way. There were several places where people tried to group related structure members with the @{ syntax, without realizing they also needed to add the \name syntax in order to generate correct doxygen html. There are several files with groupings that look like this: struct foo { /** * Related fields description * @{ */ int bar; char baz; /** @} */ long qux; } However, the doxygen syntax for grouping is: struct foo { /** * \name Related fields description * @{ */ int bar; char baz; /** @} */ long qux; } https://www.stack.nl/~dimitri/doxygen/manual/grouping.html Without the group name definition, the fields don't get properly grouped. Instead, the group description is applied to the first field. Fix the Intel hardware information structure, brw_device_info to properly group the GPU hardware limitations and hardware quirks fields. Once you've run `cd doxygen; make clean; make all`, updated documentation can be found at mesa/doxygen/i965/structbrw__device__info.html Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>	2016-03-14 14:00:29 -07:00
Bruce Cherniak	e9d68cc3da	gallium/swr: Resource management Better tracking of resource state and synchronization. A follow on commit will clean up resource functions into a new swr_resource.cpp file. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2016-03-14 14:07:48 -05:00
Marek Olšák	7a2333e4ef	configure.ac: require libdrm 2.4.66 for drmGetDevice since `737b6ed13e` src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c no longer compiles: error: unknown type name ‘drmDevicePtr’	2016-03-14 16:42:41 +01:00
Francisco Jerez	63250d8178	i965: Remove useless IR self-destruct backend_shader method. From the point it's constructed the CFG contains the only existing copy of the program IR, and it never becomes invalid. Calling backend_shader::invalidate_cfg would have destroyed the program structure irrecoverably -- We weren't calling it at all for a good reason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-13 18:07:53 -07:00
Pierre Moreau	8c7acd87af	nv50,nvc0: Set only NEW_CP_GLOBALS upon binding Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 22:34:50 +01:00
Rob Clark	e73ac84b93	freedreno/ir3: lower extract_byte/word The following commits broke things by starting to feed us unhandled extract_u16/extract_u8 opcodes: commit `905ff86198` Author: Matt Turner <mattst88@gmail.com> AuthorDate: Wed Feb 3 14:28:31 2016 -0800 Commit: Matt Turner <mattst88@gmail.com> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u16. commit `76289fbfa8` Author: Matt Turner <mattst88@gmail.com> AuthorDate: Thu Jan 21 09:09:48 2016 -0800 Commit: Matt Turner <mattst88@gmail.com> CommitDate: Fri Mar 4 11:52:34 2016 -0800 nir: Recognize open-coded extract_u8. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 14:10:57 -04:00
Ilia Mirkin	c1e4a6bfbf	nv50,nvc0: handle SQRT lowering inside the driver First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to find out whether the input is less than 0). Secondly the current approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced instead of inf. Instead we switch to the less accurate rcp(rsq(x)) method - this behaves nicely for all valid inputs. We still don't do this for DSQRT since the RSQ/RCP ops are really inaccurate, and don't even have Newton-Raphson steps right now. Eventually we should have a separate library function for DSQRT that does it more precisely (and perhaps move this lowering to the post-opt phase). This fixes a number of dEQP precision tests that were expecting better behavior for infinite inputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 13:17:24 -04:00
Ilia Mirkin	b3e7fb5234	nv50/ir: avoid folding mul + add if the mul has a dnz Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-13 13:17:24 -04:00
Ilia Mirkin	a651bc027d	nvc0: fix blit triangle size to fully cover FB's > 8192x8192 The idea is that a single triangle will cover the whole area being drawn, allowing the blit shader to do its work. However the max fb size is 16384x16384, which means that the triangle we draw needs to be twice that in order to cover the whole area fully. Increase the size of the triangle to 32768x32768. This fixes a number of dEQP tests that were failing because a blit was involved which would miss some of the resulting texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-13 13:17:24 -04:00
Rob Clark	01b071d530	freedreno: OUT_RELOC vs OUT_RELOCW fixes Make sure we use OUT_RELOCW() in cases where the buffer is written to. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	f68c6951b8	freedreno/a4xx: hw binning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	b3fe196e21	freedreno/a4xx: use generated headers for draw initiator No need to open-code this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	2224ba5976	freedreno/a4xx: remove RB_RENDER_CONTROL patching Bitfields were shuffled around for the better on a4xx, so we don't need any patching on this one. It appears to be something we set entirely in the gmem code so no conflict between tiling and render state like we had in a3xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	8824a765a2	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	476551a21f	freedreno/a3xx: move where we deal w/ binning FS Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	dd9135c452	freedreno/a4xx: move where we deal w/ binning FS Move where we pick dummy FS for binning pass, so the whole driver sees the same dummy/no-op FS stage. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	09b3447344	freedreno/a3xx: constify the shader variants Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:41 -04:00
Rob Clark	5b955f09f7	freedreno/a4xx: constify the shader variants Most of the driver just needs read-only access, so constify.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:40 -04:00
Rob Clark	d9395e4ed8	freedreno/a3xx: remove duplicate mark of end of binning cmds Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-13 12:23:40 -04:00
Nicolai Hähnle	28d2a7e67c	radeonsi: avoid crash when a sampler state is bound for a buffer texture Sampler states don't really make sense with buffer textures, but they can be set anyway, so we need to be defensive here. This bug was lurking for a while and was finally noticed due to PBO uploads setting sampler states. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Tested-by: Shawn Starr <shawn.starr@rogers.com>	2016-03-13 09:37:23 -05:00
Matt Turner	61b10b4eb7	i965: Use foreach_in_list_reverse_safe() macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-12 19:23:50 -08:00
Jason Ekstrand	98d58e7320	nir/clone: Add support for cloning a single function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	036b209484	nir/validate: Better function validation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	f86f3c90aa	nir/print: Better function argument printing Since we aren't going to put the function parameters or the return variable in the list of locals, it won't get a proper declaration. This changes nir_print to print the type along with each parameter or return variable. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	13969565f9	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	e4bebe8a02	nir: Create function parameters in function_impl_create Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	066d3c115e	nir: Add a helper for creating a "bare" nir_function_impl Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	2ef4754a20	nir: Add a new "param" variable mode for parameters and return variables Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jason Ekstrand	41ae553fda	nir/glsl: Remove dead function parameter handling code NIR has never been used on IR where we haven't already done function inlining so this code has been dead from the beginning. Let's just get rid of it for now. We can always put it back in if we decide to use NIR for function inlining at some point in the future. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 15:48:36 -08:00
Jordan Justen	b83785d86d	anv/gen7: Add stall and flushes before switching pipelines This is a port of `18c76551ee` from OpenGL to Vulkan. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 13:13:37 -08:00
Jordan Justen	c8ec65a1f5	anv: Add flush_pipeline_before_pipeline_select flush_pipeline_before_pipeline_select adds workarounds required before switching the pipeline. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 13:13:37 -08:00
Jordan Justen	1b126305de	anv/genX: Add flush_pipeline_select_gpgpu Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-12 12:43:46 -08:00
Jason Ekstrand	41af9b2e51	HACK: Don't re-configure L3$ in render stages pre-BDW This fixes a "regression" on Haswell and prior caused by merging the gen7 and gen8 flush_state functions. Haswell should still work just fine if you're on a 4.4 kernel, but we really should make it detect the command parser version and do something intelligent.	2016-03-12 08:57:16 -08:00
Boyuan Zhang	6cf120ec77	st/va: add HEVC main 10 profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Boyuan Zhang	06c862d67d	radeon/video: enable HEVC main 10 decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Boyuan Zhang	8be9efcce7	radeon/uvd: handle HEVC main 10 decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-03-11 22:33:56 -05:00
Jason Ekstrand	753ebe4457	anv/x11: Reset the SHM fence before presenting the pixmap This seems to fix the flicker issue that I was seeing with dota2	2016-03-11 17:22:46 -08:00
Kristian Høgsberg Kristensen	9bff5266be	anv/x11: Add present support The old DRI3 implementation just used CopyArea instead of present. We still don't support all the MST fanciness, but it should at least avoid some copies and allow for. v2 (Jason Ekstrand): - Better object cleanup and destruction - Handle the CONFIGURE_NOTIFY event and return OUT_OF_DATE when needed - Track dirtiness via IDLE_NOTIFY rather than iterating through the images sequentially	2016-03-11 16:54:17 -08:00
Jason Ekstrand	e920b184e9	anv/x11: Split image creation into a helper function This lets us clean up error handling and make it correct.	2016-03-11 12:28:34 -08:00
Jason Ekstrand	41a147904a	anv/wsi: Throttle rendering to no more than 2 frames ahead Right now, Vulkan apps can pretty easily DOS the GPU by simply submitting a lot of batches. This commit makes us wait until the rendering for earlier frames is complete before continuing. By waiting 2 frames out, we can still keep the pipe reasonably full but without taking the entire system down. This is similar to what the GL driver does today.	2016-03-11 11:31:13 -08:00
Jason Ekstrand	132f079a8c	anv/gem: Use C99-style struct initializers for DRM structs This is more consistent with the way the rest of the driver works and ensures that all structs we pass into the kernel are zero'd out except for the fields we actually want to fill. We were previously doing this when building with valgrind to keep valgrind from complaining. However, we need to start doing this unconditionally as recent kernels have been getting touchier about this. In particular, as of kernel commit b31e51360e88 from Chris Wilson, context creation and destroy fail if the padding bits are not set to 0.	2016-03-11 11:31:03 -08:00
Ben Widawsky	d1ab544bb8	i965/chv: Display proper branding "Braswell" is a Cherryview based thing. It unfortunately requires extra information to determine its marketing name. Unlike all previous products, and hopefully all future ones, there is no unique 1:1 mapping of PCI device ID to brand string. I put up a fight about adding any complexity to our GL renderer string code for a very long time. However, a wise man made a comment to me that I couldn't argue with: if a user installs Windows on their hardware, the brand string should be the same as what we display in Linux. The Windows driver apparently does this check, so we should too. Note that I did manage to find a good use for this info anyway in the compute shader thread counts. v2: memcpy instead of strncpy, and some minor changes (Matt) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	5e6a43a001	i965/chv: Update lower min for CS threads We have better information now, and 28 was not a valid thing to support. 6 EUs per subslice with 7 threads per EU is the minimum supported config. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	3dc3dbc8d8	i965/chv: Check that compute threads are above threshold The way we are organizing this code, the statically configured max_cs_threads should always be the minimum value we actually support (ie. are aware of). As a result, we can fall back to that if we get invalid numbers from the kernel (ie. when the query succeeds, but the result is lower than expected). I was originally planning to use an assert, but there is no reason to be so mean. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	9dd20b715a	i965/chv: Use kernel provided info for max_cs_threads With the previous patches, the code can find out the actual number of available compute threads. It is enabled only for Cherryview since that is the only platform I know for a fact has shipped devices which can benefit from this. It seems like other platforms /might/ benefit from this because of fused configurations which /might/ have shipped. Fallback code is still there. v2: Some minor adjustments from Matt Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Ben Widawsky	38eb606884	i965: Query and store GPU properties from kernel Certain products are not uniquely identifiable based on device id alone. The kernel exports an interface to help deal with this. This patch merely introduces the consumer of the interface and makes sure nothing breaks. It is also possible to use these values for programming GPGPU mode, and I plan to do that as well. The interface was introduced in libdrm 2.4.60, which is already required, so it should all be fine. v2: Some minor changes recommended by Matt Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-11 11:17:28 -08:00
Nicolai Hähnle	9908b13af6	st/mesa: check that the image unit is valid in st_bind_images Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-11 11:53:40 -05:00
Bas Nieuwenhuizen	417b6721a0	radeonsi: Lazily re-set sampler views after disabling DCC Clear DCC flags if necessary when binding a new sampler view. v2: Do not reset DCC flags of bound sampler views. v3: Check that we have a real texture (Nicolai) Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-11 11:51:15 -05:00
Marek Olšák	af3454cad5	st/mesa: remove ST_NEW_MESA flag (v2) Only used indirectly when checking dirty.st != 0 v2: also update st_cb_compute.c Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-11 16:07:18 +01:00
Nicolai Hähnle	e502801d98	r600g: clear compressed_depthtex/colortex_mask when binding buffer texture Found by inspection of the source based on a bisected bug report. This bug has been in the code for a long time, but the more recent PBO upload feature exposed it because it leads to more uses of buffer textures. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94388 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-11 08:00:15 -05:00
Ilia Mirkin	f8ea98e4ec	st/mesa: add GL_ARB_shader_atomic_counter_ops support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-10 22:36:17 -05:00
Ilia Mirkin	075a5742bf	mesa: add GL_ARB_shader_atomic_counter_ops support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-10 22:34:46 -05:00
Ilia Mirkin	a8819fb1ff	nvc0: add support for TGSI FMA ops This will allow the nouveau backend to not try and split up ops that are fused in GLSL. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-10 22:34:28 -05:00
Nicolai Hähnle	59c5508b9a	radeonsi: update compressed_colortex_masks when a cmask is created or disabled Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:52 -05:00
Nicolai Hähnle	da68a9b215	radeonsi: move si_decompress_textures to si_blit.c Since it is all about calling into blitter functions, it makes more sense here. This change also reduces the size of the interfaces between .c files. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:49 -05:00
Nicolai Hähnle	f03c9e5692	r600g: update compressed_colortex_masks when a cmask is created or disabled Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:46 -05:00
Nicolai Hähnle	784269aa40	gallium/radeon: notify all contexts when cmasks are enabled/disabled There is an annoying corner case that I stumbled across while looking into piglit's arb_shader_image_load_store/execution/load-from-cleared-image.shader_test (which can be easily adapted to demonstrate the bug without the ARB_shader_image_load_store extension) When we bind a texture and then clear it using glClear (by attaching it to the current framebuffer) for the first time, we allocate a separate cmask for the texture to do fast clear, but the corresponding bit in compressed_colortex_mask is not set. Subsequent rendering will use incorrect data. Conversely, when a currently bound texture with an existing cmask is exported leading to that cmask being disabled, the compressed_colortex_mask bit will remain set, leading to an assertion later on in debug builds. Since iterating through all contexts and/or remembering where every texture is bound would be costly, and cmask enable/disable should be rare, we will maintain a global counter to signal contexts that they must update their compressed_colortex_masks. This patch introduces the global counter, and subsequent patches will do the mask update. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-10 18:22:00 -05:00
Kenneth Graunke	9ea00c6f6b	i965: Set a proper _BaseFormat for window system renderbuffers in ES. intel_alloc_private_renderbuffer_storage did: rb->_BaseFormat = _mesa_base_fbo_format(ctx, internalFormat); Unfortunately, internalFormat was usually an unsized format (such as GL_DEPTH_COMPONENT). In OpenGL ES, _mesa_base_fbo_format() refuses to accept unsized formats, and returns 0 rather than a real base format. This meant that we ended up with a completely bogus rb->_BaseFormat for window system buffers on OpenGL ES. All other renderbuffer allocation functions in intel_fbo.c instead use the mesa_format, and do: rb->_BaseFormat = _mesa_get_format_base_format(...); We can do likewise, using rb->Format. This appears to work just fine. dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial failed, as it tried to perform a GL_FRAMEBUFFER_ATTACHMENT_DEPTH_SIZE query on the window system depth buffer. That query relies on a proper rb->_BaseFormat being set, so it broke because rb->_BaseFormat was 0 due to the above bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94458 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-10 11:23:52 -08:00
Kenneth Graunke	e032e4ad5a	glcpp: Fix locations when encountering "#<NEWLINE>". We were failing to reset our location tracking when encountering a NEWLINE in the <HASH> state. Rip the code from the <*>{NEWLINE} rule, which handles this properly. Also, update 146-version-first-hash.c to have proper expectations. When I introduced the test, I didn't verify that the line/column numbers were correct, and it turns out they varied based on the type of newline ending. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94447 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-10 11:23:26 -08:00
Jason Ekstrand	1f3d582cba	isl/surface_state: Set the clear color	2016-03-10 10:41:52 -08:00
Jason Ekstrand	8c819b8c2b	genxml/gen75: Add the clear color bits to RENDER_SURFACE_STATE	2016-03-10 10:41:52 -08:00
Jason Ekstrand	6f47ed28b4	isl: Add more helpers for determining if a format is an integer format	2016-03-10 10:41:52 -08:00
Jason Ekstrand	b0e423cc4f	isl: Remove redundant check The green channel was checked twice.	2016-03-10 10:41:52 -08:00
Tim Rowley	84f857bef7	gallium/swr: remove use of BYTE from swr driver Remove use of a win32-style type leaked from the swr rasterizer. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-03-10 11:20:58 -06:00
Samuel Pitoiset	dad3e5f4ef	nvc0: expose SM35 perf counters to AMD_performance_monitor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:40 +01:00
Samuel Pitoiset	0e511400de	nvc0: add driver metrics for SM35 (GK110) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:38 +01:00
Samuel Pitoiset	bf840aa523	nvc0: add MP performance counters for SM35 (GK110) Because compute support is not enabled by default for these chipsets, NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable performance counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:35 +01:00
Samuel Pitoiset	f289e99dee	nvc0: explode config of Kepler hardware SM events This is really verbose but most of the configuration will be reused for SM35 (GK110). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:32 +01:00
Samuel Pitoiset	a0ce8536b3	nvc0: rework the driver metrics infrastructure This follows the same design as MP perf counters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:29 +01:00
Samuel Pitoiset	41fb87249a	nvc0: rework the MP counters infrastructure This mainly improves how we define the different list of queries. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-10 18:20:26 +01:00
Marek Olšák	7b29188a3f	egl: clean up typedef madness in the backend API let's use the dd.h format Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-10 18:03:14 +01:00
Iago Toral Quiroga	3e3de9ec0a	glsl: report correct number of allowed vertex inputs and fragment outputs Before we would always report 16 for both and we would only fail if either one exceeded 16. Now we fail if the maximum for each is exceeded, even if it is smaller than 16 and we report the correct maximum. Also, expand the size of to_assign[] to 32. There is code at the top of the function handling max_index up to 32, so this just makes the code more consistent. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-10 08:48:53 +01:00
Vinson Lee	d46feee697	nouveau: Fix clang reserved-user-defined-literal error. CXX codegen/nv50_ir.lo In file included from codegen/nv50_ir.cpp:28: ./nouveau_debug.h:19:30: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal] fprintf(stderr, "%s:%d - "fmt, __FUNCTION__, __LINE__, ##args) ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-09 23:00:45 -08:00
Kenneth Graunke	3823b53ff8	mesa: Make glGetInteger64v convert float/doubles to 32-bit integers. According to the GL 4.4 core specification, section 2.2.2 ("Data Conversions For State Query Commands"): "If a command returning integer data is called, such as GetIntegerv or GetInteger64v, a boolean value of TRUE or FALSE is interpreted as one or zero, respectively. A floating-point value is rounded to the nearest integer, unless the value is an RGBA color component, a DepthRange value, or a depth buffer clear value. In these cases, the query command converts the floating-point value to an integer according to the INT entry of table 18.2; a value not in [−1, 1] converts to an undefined value." The INT entry of table 18.2 shows that b = 32, meaning the expectation is to convert it to a 32-bit integer value. Fixes: dEQP-GLES3.functional.state_query.floats.blend_color_getinteger64 dEQP-GLES3.functional.state_query.floats.color_clear_value_getinteger64 dEQP-GLES3.functional.state_query.floats.depth_clear_value_getinteger64 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94456 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-09 19:44:18 -08:00
Nanley Chery	7fbbad0170	anv/blit2d: Use the tiling enum for simplicity Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	514c055717	anv/meta: Prefix anv_ to meta_emit_blit() Follow the convention for non-static functions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	627728cce5	anv/meta: Split anv_meta_blit.c into three files The new organization is as follows: * anv_meta_blit.c: Blit and state setup/teardown commands * anv_meta_copy.c: Copy and update commands * anv_meta_blit2d.c: 2D Blitter API commands Also, change the formatting to contain most lines within 80 columns. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	f391683922	anv/meta: Make meta_emit_blit() public This can be reverted if the only other consumer, anv_meta_blit2d(), uses a different method. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	ddbc645846	anv/meta: Store src and dst usage flags in a variable Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Nanley Chery	7ebbc3946a	anv/meta: Minimize height of images used for copies In addition to demystifying the value being added to the height, this future-proofs the code for new tiling modes and keeps the image height as small as possible. v2: Actually use the smallest height possible. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-09 10:57:47 -08:00
Emil Velikov	3dc2630e45	gallium/radeon: use explicit drm_major, drm_minor check Just like everywhere else in the radeon codebase. v2: Don't forget about drm_major == 3 (Alex) Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 17:25:22 +00:00
Emil Velikov	b9c5c4af6d	egl/x11: check the return value of xcb_dri2_get_buffers_reply() ... before using it. The function can return NULL, which we should check prior to referencing it in the next function(s). Cc: Fabian Vogt <fvogt@suse.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93667 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-09 17:25:22 +00:00
Emil Velikov	373f118c6c	gallium: do not wrap header inclusion in extern "C" Add one missing extern "C" guard within include/pipe/p_video_enums.h, and remove the wrapping throughout gallium. On Haiku one could even use the gallium debug_printf() although that's another topic. v2: Leave dbghelp.h as is (Jose) Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-09 17:21:39 +00:00
Dieter Nützel	69d389c52f	opencl: fix .gitignore for .install-gallium-links Fixes: `0b6157e971` "install-gallium-links: port changes from install-lib-links" v2: move this to the top level .gitignore and added Fixes: like Emil Velikov <emil.l.velikov@gmail.com> suggested Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:52 +00:00
Emil Velikov	f3e23ead53	egl: remove remnants of MESA_drm_display Last set in st/egl, unused in mesa-demos and superseded by EGL_KHR_platform_gbm. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	2295a4b1e1	egl: remove final pieces of KHR_vg_parent_image Similar to previous commit - unused/unset for a long time. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	c85544a10c	glapi: remove the final function offset tags A commit earlier this year reworked our python scripts to use a separate file for these. Followed by removing support from the parser, and removing all of the offset tags. Seems like we either missed a few, or people added them by mistake. Either way let's nuke the ones that are still around. Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	3ffab9a89c	winsys/amdgpu/addrlib: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	a07192bd63	mesa/main: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	5351dc1522	i915: limit extern "C" hack only for libdrm headers Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:51 +00:00
Emil Velikov	cf215d92f6	xmesa: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	2af3a0ca6f	util/sha: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	d426c17550	egl/wayland: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
Emil Velikov	750da80b34	gbm: do not wrap header inclusion in extern "C" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-09 17:16:50 +00:00
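The `extern "C"` commits above all rely on the same idea: a header that carries its own linkage guard does not need to be wrapped at the include site. A minimal sketch of that pattern follows; the header name, enum, and function are hypothetical and are not taken from the Mesa tree.

```c
/* example_enums.h (hypothetical): the linkage guard lives inside the header,
 * so callers can include it directly from either C or C++ sources. */
#ifndef EXAMPLE_ENUMS_H
#define EXAMPLE_ENUMS_H

#ifdef __cplusplus
extern "C" {
#endif

enum example_codec {
   EXAMPLE_CODEC_MPEG2,
   EXAMPLE_CODEC_H264
};

/* Gets C linkage whether this header is compiled as C or as C++. */
const char *example_codec_name(enum example_codec codec);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_ENUMS_H */

/* example_enums.c (hypothetical): the matching implementation. */
const char *example_codec_name(enum example_codec codec)
{
   switch (codec) {
   case EXAMPLE_CODEC_MPEG2: return "mpeg2";
   case EXAMPLE_CODEC_H264:  return "h264";
   }
   return "unknown";
}
```

With the guard inside the header, a C++ file can simply `#include "example_enums.h"`, and system headers pulled in by that header are not accidentally forced into C linkage.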
Nicolai Hähnle	9f06e7f5c1	st/mesa: shader image atoms must be before framebuffer update The reason is that the shader image atoms call st_finalize_texture, which may set ST_NEW_FRAMEBUFFER. This fixes an assertion triggered by a subtest of piglit's arb_shader_image_load_store-invalid. v2: add comment explaining order constraints (suggested by Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:40:06 -05:00
Nicolai Hähnle	4eb416bd9d	gallivm: special case TGSI_OPCODE_STORE This instruction has the resource (buffer or image) as a destination to represent the writemask for SSBO writes. However, this is obviously not a "real" destination for the purpose of emitting LLVM IR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:39:55 -05:00
Nicolai Hähnle	10b2b584ee	tgsi: set correct output mode for RESQ Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 11:39:43 -05:00
Marek Olšák	dcb2b77823	gallium: add CAPs returning PCI device location Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-09 15:02:28 +01:00
Marek Olšák	737b6ed13e	winsys/amdgpu: get PCI info This will be queried by the OpenCL stack using an interop call. I have tested that the values match lspci. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:28 +01:00
Marek Olšák	ec74deeb24	radeonsi: set amdgpu metadata before exporting a texture Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:28 +01:00
Nicolai Hähnle	ff7e9412be	radeonsi: extract the texture descriptor computation into its own function This will allow this code to be re-used for shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Nicolai Hähnle	1197c69bdd	radeonsi: extract the buffer descriptor computation into its own function This will allow it to be re-used for shader image descriptors. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Nicolai Hähnle	2bf8ee34b8	radeonsi: remove resource field from si_sampler_view view->resource is redundant with view->base.texture, so get rid of it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	2dec5e09e1	radeonsi: accept pipe_resource in si_sampler_view_add_buffer and rename .._buffers -> .._buffer Based loosely on Nicolai's patch. This will make it easier to cherry-pick Nicolai's patches from his image support branch. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	f18fc70d6f	radeonsi: disable DCC on handle export if expecting write access This should be okay except that sampler views and images are not re-set. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Bas Nieuwenhuizen	1e48ec7571	radeonsi: add DCC decompression (v2) This is currently not needed but will be necessary when we have features that do not work with DCC enabled, such as image stores and sharing non-scanout surfaces. v2: Marek: rebase, remove decompression from si_flush_resource (not needed) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	b744ac9f44	radeonsi: allocate DCC in the same backing buffer as the texture To allow sharing textures with DCC enabled. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	60c08aa90b	gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2) v2: remove the list of all contexts Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	970b979da1	gallium/radeon: eliminate fast color clear before sharing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	abac6bf67a	gallium/radeon: don't use fast color clear if sharing doesn't allow it Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	d4e847ea33	gallium/radeon: disallow handle export for MSAA & depth textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:27 +01:00
Marek Olšák	d95f593758	gallium/radeon: remember that texture_from_handle was called and its flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	c034d3dde0	gallium/radeon: check that handle usage doesn't change for a resource Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	6b187bbd9f	gallium/radeon: disallow reallocation of shared buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	ecbd3aba17	gallium/radeon: if we can't discard a whole resource, discard the range instead Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	afdaffcbdb	gallium/radeon: buffer valid range tracking only works with unshared buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	be73d35829	gallium/radeon: don't set texture metadata for buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	f914779c75	gallium/radeon: set texture metadata only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	69d8b75114	gallium/radeon: clean up r600_texture_get_handle Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	e3cee38e13	gallium/radeon: move code initializing texture metadata to its own function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	f4aa3256ef	winsys/amdgpu: allow drivers to set/get opaque metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	bd1feb2827	gallium/radeon: rename winsys buffer_get/set_tiling to buffer_get/set_metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:26 +01:00
Marek Olšák	6011d7cf25	gallium/radeon: remove rcs parameter from radeon_winsys::buffer_set_tiling This was needed for DRM < 2.12.0 where the kernel was rewriting tiling flags in IBs. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:25 +01:00
Marek Olšák	260ef9c9be	gallium/radeon: use a structure for passing tiling flags from/to winsys and call it radeon_bo_metadata Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-03-09 15:02:25 +01:00
Marek Olšák	82db518f15	gallium: add external usage flags to resource_from(get)_handle (v2) This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-03-09 15:02:25 +01:00
Axel Davy	d943ac432d	dri: add backbuffer use flag This will be used by the next commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-09 15:02:25 +01:00
Timothy Arceri	2188c77a0e	glsl: don't allow undefined array sizes in ES This applies the rule to empty declarations. Fixes: dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-09 20:30:42 +11:00
Jason Ekstrand	248ab61740	anv/cmd_buffer: Pull the core of flush_state into genX_cmd_buffer	2016-03-08 17:10:05 -08:00
Jason Ekstrand	28cbc45b3c	anv/cmd_buffer: Split flush_state into two functions	2016-03-08 16:54:07 -08:00
Jason Ekstrand	42b4c0fa6e	anv: Pull all of the genX_foo functions into anv_genX.h This way we only have to declare them each once and we get it for all gens at a single go.	2016-03-08 16:49:08 -08:00
Tamil velan	353a4f844f	radeon/uvd: increase max height to 4096 for VI and newer Without this fix, 'mpv --hwdec=vdpau --vo=vdpau <stream>' fails for vdpau decode if the stream height is 4096. Vdpau decode of heights up to 4096 is a necessary use case on the amdgpu driver for VI and newer platforms. The fix is in the driver-specific implementation of the "Decoder Query Capabilities" API, which now returns 4096 for VI and newer platforms. With this fix vdpauinfo reports height support as 4096 and mpv vdpau decode works fine for 4096 height streams. Signed-off-by: Tamil velan <Tamil-Velan.Jayakumar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-08 19:01:19 -05:00
Bas Nieuwenhuizen	6373845d98	winsys/amdgpu: enlarge buffer_indices_hashlist Enlarge the buffer hashlist to prevent large numbers of misses due to adding more buffers than can be cached in the hashlist. The game I tested had CS's with up to 1500 buffers and the overhead of amdgpu_lookup_buffer for various sizes was: 4096 1.97% (new value) 2048 4.37% 1024 6.92% 512 9.47% (old value) (percentage of CPU usage in render thread as determined by perf) The time spent in amdgpu_add_buffer self is ~4.2% in all cases and for 4096 the time needed to clear the hashlist is still < 0.10%, so I am not expecting significant regressions. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 00:52:07 +01:00
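To see why a larger hashlist reduces lookup cost, here is a simplified sketch of a buffer list with a one-entry-per-bucket index cache. It is not the winsys code: the struct, the function names, and the use of plain integer IDs instead of buffer objects are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define HASHLIST_SIZE 4096   /* power of two, so "& (SIZE - 1)" acts as a modulo */
#define MAX_BUFFERS   2048

struct cs_sketch {
   unsigned buffers[MAX_BUFFERS];   /* buffer IDs in the order they were added */
   int num_buffers;
   int hashlist[HASHLIST_SIZE];     /* last index seen per bucket, -1 if empty */
};

static void cs_init(struct cs_sketch *cs)
{
   cs->num_buffers = 0;
   memset(cs->hashlist, -1, sizeof(cs->hashlist));   /* all bytes 0xff -> -1 */
}

/* Returns the index of "id" in cs->buffers, or -1 if it was never added. */
static int cs_lookup_buffer(struct cs_sketch *cs, unsigned id)
{
   unsigned hash = id & (HASHLIST_SIZE - 1);
   int i = cs->hashlist[hash];

   /* Fast path: the bucket is empty (never added) or caches the right index. */
   if (i < 0 || cs->buffers[i] == id)
      return i;

   /* Bucket collision: fall back to a linear scan and refresh the cache. */
   for (i = cs->num_buffers - 1; i >= 0; i--) {
      if (cs->buffers[i] == id) {
         cs->hashlist[hash] = i;
         return i;
      }
   }
   return -1;
}

/* Adds "id" if it is not present yet and returns its index. */
static int cs_add_buffer(struct cs_sketch *cs, unsigned id)
{
   int i = cs_lookup_buffer(cs, id);
   if (i >= 0)
      return i;

   assert(cs->num_buffers < MAX_BUFFERS);
   i = cs->num_buffers++;
   cs->buffers[i] = id;
   cs->hashlist[id & (HASHLIST_SIZE - 1)] = i;
   return i;
}
```

In this sketch every collision costs a linear scan over the buffer list, so once the number of buffers approaches or exceeds the number of buckets, most lookups take the slow path. More buckets means fewer collisions, at the price of a larger array to clear in `cs_init`.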
Jason Ekstrand	bbbdd32c19	anv/meta_clear: Use repclear again	2016-03-08 15:40:11 -08:00
Jason Ekstrand	dc504a51fb	anv/pipeline: Unconditionally emit PS_BLEND on gen8+ Special-casing the PS_BLEND packet wasn't really gaining us anything. It's defined to be more-or-less the contents of blend state entry 0 only without the indirection. We can just copy-and-paste the contents. If there are no valid color targets, then blend state 0 will be 0-initialized anyway so it's basically the same as the special case we had before.	2016-03-08 15:40:11 -08:00
Jason Ekstrand	cce65471b8	anv: Compact render targets Previously, we would always emit all of the render targets in the subpass. This commit changes it so that we compact render targets just like we do with other resources. Render targets are represented in the surface map by using a descriptor set index of UINT16_MAX.	2016-03-08 15:40:11 -08:00
Samuel Pitoiset	32e848b016	nvc0: add a new validation path for compute This makes use of the new state validation interface to be consistent with 3d. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-09 00:19:21 +01:00
Samuel Pitoiset	db9b41d302	nvc0: rework the validation path for 3D This exposes an interface for state validation that will be also used to rework the compute validation path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-09 00:19:16 +01:00
Jordan Justen	a100a57e30	i965/hsw: Initialize SLM index in state register For Haswell, we need to initialize the SLM index in the state register. This can be copied out of the CS header dword 0. v2: * Use UW move to avoid changing upper 16-bits of sr0.1 (mattst88) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94081 Fixes: piglit arb_compute_shader/execution/shared-atomics.shader_test Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Jordan Justen	d8347f12ea	i965/compute: Skip SIMD8 generation if it can't be used If the local workgroup size is sufficiently large, then the SIMD8 program can't be used. In this case we can skip generating the SIMD8 program. For complex programs this can save a significant amount of time. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Jordan Justen	e1d54b1ba5	i965/fs: Allow spilling for SIMD16 compute shaders For fragment shaders, we can always use a SIMD8 program. Therefore, if we detect spilling with a SIMD16 program, then it is better to skip generating a SIMD16 program to only rely on a SIMD8 program. Unfortunately, this doesn't work for compute shaders. For a compute shader, we may be required to use SIMD16 if the local workgroup size is bigger than a certain size. For example, on gen7, if the local workgroup size is larger than 512, then a SIMD16 program is required. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-08 14:27:18 -08:00
Timothy Arceri	91630d7453	glsl: don't always reject shaders with mismatching ifc blocks Since we store some member qualifiers in the interface type we need to be more careful about rejecting shaders just because the pointer doesn't match. It's perfectly valid for some qualifiers such as precision to not match across shader interfaces. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:42 +11:00
Timothy Arceri	3026b3565a	glsl: make interstage_match() static Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:36 +11:00
Timothy Arceri	ebc419fcbd	glsl: don't validate ifc blocks using validation meant for variables Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-09 09:21:31 +11:00
Kenneth Graunke	19f13b2096	mesa: Fix error code for GetFramebufferAttachmentParameter in ES 3.0+. The ES 3.0+ specifications contain the exact same text as the OpenGL specification, which says that we should return GL_INVALID_OPERATION. ES 2.0 contains different text saying we should return GL_INVALID_ENUM. Previously, Mesa chose the error code based on API (GL vs. ES). This patch makes ES 3.0+ follow the GL behavior. ES 2 remains as is. Fixes dEQP-GLES3.functional.fbo.api.attachment_query_empty_fbo. However, breaks the dEQP-GLES2 variant of the same test for drivers which silently promote to ES 3.0. This can be worked around by exporting MESA_GLES_VERSION_OVERRIDE=2.0, but is a bug in dEQP-GLES2. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-08 12:46:28 -08:00
Kenneth Graunke	8b3496f378	mesa: Add GL_RED and GL_RG to ES3 effective internal format mapping. The dEQP-GLES3.functional.fbo.completeness.renderable.texture. {color0,depth,stencil}.{red,rg}_unsigned_byte tests appear to expect GL_RED/GL_RG and GL_UNSIGNED_BYTE to map to GL_R8/GL_RG8, rather than returning an INVALID_OPERATION error. This makes perfect sense. However, RED and RG are strangely missing from the ES 3.0/3.1/3.2 spec's "Effective internal format corresponding to external format and type" tables. It may be worth filing a spec bug. Fixes the 6 dEQP tests mentioned above. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-03-08 12:46:28 -08:00
Samuel Pitoiset	752769e053	nv50,nvc0: make sure to destroy the mutex used for blits This mutex is initialized when the blitter is created, but it is never destroyed. This doesn't hurt anything but it makes sense to destroy it at blitter deletion. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-08 21:24:46 +01:00
Marek Olšák	3146014d5f	gallium/radeon: don't use temporary buffers for persistent mappings Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-08 20:08:52 +01:00
Jason Ekstrand	14b18aba89	nir: Add a pass to lower indirect variable dereferences This new pass lowers load/store_var intrinsics that act on indirect derefs to an if-ladder of direct load/store_var intrinsics. The if-ladders perform a simple binary search on the indirect. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-03-08 10:41:54 -08:00
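The if-ladder idea in the commit above can be modelled in plain C. This is only an illustration of the binary-search shape, not the NIR pass itself: the function name is hypothetical, and the recursion here runs at run time, whereas a compiler pass would unroll it into nested `if` statements when it emits code.

```c
#include <assert.h>

/* Returns arr[index] for index in [start, end) without ever using "index"
 * as an array subscript: each leaf reads one fixed element, and the
 * comparisons on "index" pick the leaf via binary search. */
static int load_via_if_ladder(const int *arr, int start, int end, int index)
{
   assert(start < end);

   if (end - start == 1)
      return arr[start];   /* leaf: one "direct" load */

   int mid = start + (end - start) / 2;
   if (index < mid)
      return load_via_if_ladder(arr, start, mid, index);
   else
      return load_via_if_ladder(arr, mid, end, index);
}
```

For an array of N elements the ladder has N leaves but only about log2(N) comparisons on any one path, which is the advantage of a binary search over a linear chain of `if (index == k)` tests.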
Alejandro Piñeiro	ef76ea4ba9	i965/fs/nir: "surface_access::" prefix not needed "using namespace brw::surface_access" is already present at the top of the source file. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-08 17:55:28 +01:00
Brian Paul	6857420e79	mesa: fix malformed assertion in _image_format_class_to_glenum() Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2016-03-08 08:42:56 -07:00
Brian Paul	3ed8729f7b	program: minor whitespace clean-ups in program_parse_extra.c	2016-03-08 08:42:56 -07:00
Christian König	37402aa4c6	st/mesa: conditionally enable GL_NV_vdpau_interop Only enable it when we compile the state tracker as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-08 13:00:04 +01:00
Christian König	e148a3b6e9	radeon/uvd: disable MPEG1 The hardware simply doesn't support that correctly. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-08 12:57:08 +01:00
Alejandro Piñeiro	0548844e86	i965/vec4/nir: no need to use surface_access:: to call emit_untyped_atomic Now that brw_vec4_visitor::emit_untyped_atomic was removed, there is no need to explicitly set it. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Alejandro Piñeiro	d3a89a7c49	i965/vec4/nir: remove emit_untyped_surface_read and emit_untyped_atomic at brw_vec4_visitor surface_access emit_untyped_read and emit_untyped_atomic provides the same functionality. v2: surface parameter of emit_untyped_atomic is a const, no need to specify default predicate on emit_untyped_atomic, use retype (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Alejandro Piñeiro	0c5c2e2c93	i965/vec4: pass the correct src_sz to emit_send at emit_untyped_atomic If the src is invalid, so the src size is zero, the src_sz passed to emit_send should be zero too, instead of a default 1 if we are in a simd4x2 case. This can happen if using emit_untyped_atomic for an atomic dec/inc. v2: use the proper src_sz when calling emit_send, instead of just avoiding loading src at emit_send if BAD_FILE (Francisco Jerez) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-08 08:22:26 +01:00
Kenneth Graunke	ea9fa5ff05	glcpp: Remove empty mid-rule action which changes test behavior. Apparently this causes a slight difference in the parser's token expectations, leading to a different error message. It seems harmless, but I wanted to be cautious and separate it out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:05 -08:00
Kenneth Graunke	e816c8b54a	glcpp: Clean up most empty mid-rule actions left by previous commit. I didn't want to pollute the previous patch with all the $4 -> $3 changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:03 -08:00
Kenneth Graunke	639bbe3cb4	glcpp: Delete unnecessary implicit version resolves. We now have a bigger hammer. The HASH_TOKEN NEWLINE rule still needs to exist to ensure the 146-version-hash-first.c test still passes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:02:01 -08:00
Kenneth Graunke	07ec67d85c	glcpp: Implicitly resolve version after the first non-space/hash token. We resolved the implicit version directive when processing control lines, such as #ifdef, to ensure any built-in macros exist. However, we failed to resolve it when handling ordinary text. For example, int x = __VERSION__; should resolve __VERSION__ to 110, but since we never resolved the implicit version, none of the built-in macros exist, so it was left as is. This also meant we allowed the following shader to slop through: 123 #version 120 Nothing would cause the implicit version to take effect, so when we saw the #version directive, we thought everything was peachy. This patch makes the lexer's per-token action resolve the implicit version on the first non-space/newline/hash token that isn't part of a #version directive, fulfilling the GLSL language spec: "The #version directive must occur in a shader before anything else, except for comments and white space." Because we emit #version as HASH_TOKEN then VERSION_TOKEN, we have to allow HASH_TOKEN to slop through as well, so we don't resolve the implicit version as soon as we see the # character. However, this is fine, because the parser's HASH_TOKEN NEWLINE rule does resolve the version, disallowing cases like: # #version 120 This patch also adds the above shaders as new glcpp tests. Fixes dEQP-GLES2.functional.shaders.preprocessor.predefined_macros. {gl_es_1_vertex,gl_es_1_fragment}. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-07 23:01:43 -08:00
Jason Ekstrand	75af420cb1	anv/pipeline: Move binding table setup to its own helper	2016-03-07 22:24:31 -08:00
Jason Ekstrand	2308891ede	anv: Store CPU-side fence information in the BO This reduces the number of allocations a bit and cuts back on memory usage. Kind-of a micro-optimization but it also makes the error handling a bit simpler so it seems like a win.	2016-03-07 22:23:44 -08:00
Jason Ekstrand	f61d40adc2	anv/allocator: Better casting in PFL macros We cast the constant 0xfff values to a uintptr_t before applying a bitwise negate to ensure that they are actually 64-bit when needed. Also, the count variable doesn't need to be explicitly cast, it will get upcast as needed by the "\|" operation.	2016-03-07 22:23:44 -08:00
Jason Ekstrand	3d4f2b0927	anv/allocator: Move the alignment assert for the pointer free list Previously we asserted every time you tried to pack a pointer and a counter together. However, this wasn't really correct. In the case where you try to grab the last element of the list, the "next element" value you get may be bogus if someone else got there first. This was leading to assertion failures even though the allocator would safely fall through to the failure case below.	2016-03-07 22:23:44 -08:00
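The casting point in the PFL commit above can be shown with a small sketch of packing a 12-bit counter into the low bits of a 4096-byte-aligned pointer. The macro bodies and the helper below are written for illustration and should be read as assumptions, not as the allocator's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Low 12 bits hold a counter; the remaining bits hold a 4096-aligned pointer.
 * The mask is built as ~(uintptr_t)0xfff so it is pointer-width. Written as
 * ~0xfffu it would be a 32-bit value that zero-extends and clears the upper
 * half of a 64-bit pointer. */
#define PFL_COUNT(x) ((uintptr_t)(x) & 0xfff)
#define PFL_PTR(x)   ((void *)((uintptr_t)(x) & ~(uintptr_t)0xfff))
#define PFL_PACK(ptr, count) \
   ((void *)(((uintptr_t)(ptr) & ~(uintptr_t)0xfff) | ((count) & 0xfff)))

/* Hypothetical helper: install a new head pointer and bump the counter,
 * which wraps around after 4096 updates. */
static void *pfl_bump(void *packed, void *new_head)
{
   return PFL_PACK(new_head, PFL_COUNT(packed) + 1);
}
```

The counter argument needs no cast of its own: the `|` with a `uintptr_t` operand converts a narrower unsigned count to pointer width before the bits are merged.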
Jason Ekstrand	8c2b9d1529	anv/bo_pool: Allow freeing BOs where the anv_bo is in the BO itself	2016-03-07 22:23:44 -08:00
Tim Rowley	90f9df3210	gallium/swr: fix issues preventing a 32-bit build Not a currently tested configuration, but these couple of small changes allow a 32-bit build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94383 Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-07 17:22:24 -06:00
Nanley Chery	181b142fbd	anv/device: Up device limits for 3D and array texture dimensions The limit for these textures is 2048 not 1024. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-07 15:21:50 -08:00
Tim Rowley	035d39b539	gallium/swr: remove use of UINT64 from swr_fence Remove use of a win32-style type leaked from the swr rasterizer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-07 16:58:48 -06:00
Jason Ekstrand	428ffc9c13	anv/device: Actually free the CPU-side fence struct again In `23de78768`, when we switched from allocating individual BOs to using the pool for fences, we accidentally deleted the free.	2016-03-07 14:50:52 -08:00
Kenneth Graunke	af41c0b7e0	glsl: Add function parameters to the parser symbol table. In a shader such as: struct S { float f; } float identity(float S) { return S; } we would think that "S" in "return S" referred to a structure, even though it's shadowed by the "float S" parameter in the inner struct. This led to the parser's grammar seeing TYPE_IDENTIFIER and getting confused. Fixes dEQP-GLES2.functional.shaders.scoping.valid. function_parameter_hides_struct_type_{vertex,fragment}. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-07 14:09:55 -08:00
Kenneth Graunke	c4960068d5	glsl: Add single declaration variables to the symbol table too. The lexer/parser use a symbol table to classify identifiers as variables, functions, or structure types. For some reason, we neglected to add variables in simple declarations such as int x = 5; but did add subsequent variables in multi-declarations: int x = 5, y = 6; // y gets added, but not x, for some reason Fixes four dEQP-GLES2.functional.shaders.scoping.valid subcases: - local_int_variable_hides_struct_type_vertex - local_int_variable_hides_struct_type_fragment - local_struct_variable_hides_struct_type_vertex - local_struct_variable_hides_struct_type_fragment Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-07 14:09:31 -08:00
Kenneth Graunke	1107e48b9a	mesa: Change GLboolean to bool in GenerateMipmap target checker. This is not API facing, so just use bool. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 14:01:34 -08:00
Kenneth Graunke	2f8a43586e	mesa: Make GenerateMipmap check the target before finding an object. If glGenerateMipmap was called with a bogus target, then it would pass that to _mesa_get_current_tex_object(), which would raise a _mesa_problem() telling people to file bugs. We'd then do the proper error checking, raise an error, and bail. Doing the check first avoids the _mesa_problem(). The DSA variant doesn't take a target parameter, so we leave the target validation exactly as it was in that case. Fixes one dEQP GLES2 test: dEQP-GLES2.functional.negative_api.texture.generatemipmap.invalid_target. v2: Rebase on Antia's recent patch to this area. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 14:01:22 -08:00
Samuel Pitoiset	8f99c1bbce	gm107/ir: add emission for ATOMS This allows to perform atomic operations on shared memory. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 22:13:14 +01:00
Samuel Pitoiset	7f8565f0b2	tgsi: fix parsing of shared memory declarations The SHARED TGSI keyword is only allowed with TGSI_FILE_MEMORY and not with TGSI_FILE_BUFFER. I have found this by using the nouveau_compiler from command line. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-03-07 22:13:08 +01:00
Samuel Pitoiset	c82086f7e9	gm107/ir: add emission for BAR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:50 +01:00
Samuel Pitoiset	8a109c0375	gk110/ir: add missing src predicate emission for BAR.RED Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:48 +01:00
Samuel Pitoiset	f4d2d49152	gk110/ir: allow to emit immediates for BAR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:46 +01:00
Samuel Pitoiset	cba89fdaa1	gk110/ir: fix wrong emission of BAR.SYNC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:43 +01:00
Samuel Pitoiset	5777e87bed	nvc0/ir: make sure that thread count immediate for BAR fit The limit of the thread count immediate value is 12 bits. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-07 18:39:41 +01:00
Brian Paul	3af78b426e	svga: add new surface-write-flushes HUD query To know when we're flushing the command buffer because we need to write to surface in the command buffer. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Brian Paul	7e8cf34546	svga: add new flush-time HUD query To measure the time spent flushing the command buffer. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Brian Paul	903afc370f	svga: also dump SVGA3D_BUFFER surfaces in svga_screen_cache_dump() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-07 09:33:15 -07:00
Kristian Høgsberg Kristensen	32aa01663f	anv: Quiet pTessellationState warning Some applications pass a dummy for pTessellationState which results in a lot of noise. Only warn if we're actually given tessellation shader stages.	2016-03-06 22:06:24 -08:00
Ilia Mirkin	0941ef3dd5	mesa: flip current tf object back to default if current is being deleted In the rather unusual case of Bind + Delete, we need to make sure that we unbind the current tf object. Fixes dEQP-GLES3.functional.lifetime.delete_bound.transform_feedback Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-07 00:36:08 -05:00
Ilia Mirkin	f6827e20d1	glsl: avoid stack smashing when there are too many attributes This fixes a crash in dEQP-GLES3.functional.transform_feedback.array_element.separate.points.lowp_mat3x2 and likely others. The vertex shader has > 16 input variables (without explicit locations), which causes us to index outside of the to_assign array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-07 00:36:08 -05:00
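The overflow described in f6827e20d1 comes from appending to a fixed-size array without a bounds check. A minimal sketch of the guarded pattern follows; `MAX_ATTRIBS`, `struct attrib_list` and `attrib_list_push` are hypothetical names, and the capacity of 16 is taken only from the "> 16 input variables" wording of the commit message. It is not the linker's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTRIBS 16  /* assumed capacity, for illustration only */

struct attrib_list {
    int slots[MAX_ATTRIBS];
    size_t count;
};

/* Append a slot, returning false instead of writing past the end of
 * slots[] when the list is already full. */
static bool attrib_list_push(struct attrib_list *list, int slot)
{
    if (list->count >= MAX_ATTRIBS)
        return false;
    list->slots[list->count++] = slot;
    return true;
}
```

A caller would treat a `false` return as an error to report rather than continuing to index the array.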
Jason Ekstrand	23de78768b	anv: Create fences from the batch BO pool Applications may create a lot of fences, perhaps as much as one per vkQueueSubmit. Really, they're supposed to use ResetFence, but it's easy enough for us to make them crazy-cheap so we might as well.	2016-03-06 14:26:52 -08:00
Francisco Jerez	3dd0441f6c	i965/vec4: Propagate swizzles correctly during copy propagation. This simplifies the code that iterates over the per-component values found in the matching copy_entry struct and checks whether the register regions that were copied to each component are similar enough to be treated as a single (reswizzled) value which can be propagated into the current instruction. Aside from being scattered between opt_copy_propagation(), try_copy_propagate(), and try_constant_propagate(), what I found terribly confusing about the preexisting logic was that opt_copy_propagation() tried to reorder the array of values according to the swizzle of the instruction source, which meant one would have had to invert the reordering applied at the top level in order to find out which component to take from each value (we were just taking the i-th component from the i-th value, which is not correct in general). The saturate mask was also being swizzled incorrectly. This consolidates the logic for matching multiple components of a copy_entry into a single function which returns the result as a regular src_reg on success, as if the copy had been performed with a single MOV instruction copying all components of the src_reg into the destination. Fixes several ARB_vertex_program MOV test-cases from: https://cgit.freedesktop.org/~kwg/piglit/log/?h=arb_program Acked-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	c70b7c80e3	i965: Don't try copy propagation if constant propagation succeeded. It cannot get any better. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	dcf5e19e65	i965/vec4: Use swizzle() to swizzle immediates during constant propagation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-06 12:22:40 -08:00
Francisco Jerez	ff7a2b489e	i965: Add support for swizzling arbitrary immediates to (brw_)swizzle(). Scalar immediates used to be handled correctly by swizzle() (as the identity) but since commit `58fa9d47b5` it will corrupt the contents of the immediate. Vector immediates were never handled correctly, but we had ad-hoc code to swizzle VF immediates in the vec4 copy propagation pass. This takes care of swizzling V and UV in addition. v2: Don't implement swizzling of V/UV immediates (Matt). If you need to swizzle an integer vector immediate in the future apply the following diff to go back to v1: --- a/src/mesa/drivers/dri/i965/brw_eu.c +++ b/src/mesa/drivers/dri/i965/brw_eu.c @@ -119,11 +119,10 @@ brw_swap_cmod(uint32_t cmod) static unsigned imm_shift(enum brw_reg_type type, unsigned i) { - assert(type != BRW_REGISTER_TYPE_UV && type != BRW_REGISTER_TYPE_V && - "Not implemented."); - if (type == BRW_REGISTER_TYPE_VF) return 8 * (i & 3); + else if (type == BRW_REGISTER_TYPE_UV || type == BRW_REGISTER_TYPE_V) + return 4 * (i & 7); else return 0; } Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-06 12:22:40 -08:00
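The VF case of `imm_shift()` quoted in ff7a2b489e (8 bits per component, four components) suggests how a packed vector immediate can be swizzled. The sketch below is an illustration under stated assumptions, not the driver's code: `swz_get`, `vf_shift` and `swizzle_vf` are hypothetical names, and the swizzle is assumed to be packed as four 2-bit selectors with channel 0 in the lowest bits.

```c
#include <stdint.h>

/* Two-bit source selector (x=0 .. w=3) for destination channel i.
 * The packing of four 2-bit selectors is an assumption of this sketch. */
static unsigned swz_get(unsigned swz, unsigned i)
{
    return (swz >> (2 * i)) & 3;
}

/* Bit offset of component i in a packed 4 x 8-bit immediate, mirroring
 * the VF branch of imm_shift() quoted in the commit message. */
static unsigned vf_shift(unsigned i)
{
    return 8 * (i & 3);
}

/* Build a new immediate whose component i is the source component
 * selected by the swizzle for channel i. */
static uint32_t swizzle_vf(uint32_t imm, unsigned swz)
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint32_t comp = (imm >> vf_shift(swz_get(swz, i))) & 0xff;
        out |= comp << vf_shift(i);
    }
    return out;
}
```

With this encoding, the identity swizzle is `0xE4` (x, y, z, w) and `0x1B` reverses the four components.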
Francisco Jerez	537d3df974	i965: Pass symbolic swizzle to brw_swizzle() as a single argument. And replace brw_swizzle1() with brw_swizzle(). Seems slightly cleaner and will allow reusing brw_swizzle() in the vec4 back-end more easily. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-03-06 12:22:39 -08:00
Ilia Mirkin	ff085d014e	nvc0: reset TFB bufctx when we no longer hold a reference to the buffers This fixes some use-after-free situations in dEQP when an xfb state is removed, and then a clear is triggered, which only does a partial validation. It would attempt to read the no-longer-valid buffers, resulting in crashes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-03-06 10:14:52 -05:00
Jason Ekstrand	21ee5fd326	anv: Emit null render targets v2 (Francisco Jerez): Add the state_offset to the surface state offset	2016-03-05 20:47:10 -08:00
Ilia Mirkin	fa43c4bd99	nv50/ir: using sampleid/pos shouldn't force per-sample interpolation See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-05 23:26:03 -05:00
Ilia Mirkin	313205cb8f	st/mesa: don't force per-sample interp if only sampleid/pos are used The OES extensions clarify this behaviour to differentiate between per-sample invocation and per-sample interpolation. Using sampleid/pos will force per-sample invocation but not per-sample interpolation. See https://www.khronos.org/bugzilla/show_bug.cgi?id=1462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-05 23:26:03 -05:00
Ilia Mirkin	dcbf8377be	swrast: fix GL_ANY_SAMPLES_PASSED values in Result Since commit `922be4eab`, the expectation is that the query result contains the correct value. Unfortunately swrast does not distinguish between GL_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED. As a result, we must fix up the query result in a post-draw fixup. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.2" <mesa-stable@lists.freedesktop.org>	2016-03-05 23:25:52 -05:00
Jason Ekstrand	8502794c12	anv/pipeline: Handle null wm_prog_data in 3DSTATE_CLIP	2016-03-05 14:42:16 -08:00
Kristian Høgsberg Kristensen	7b348ab8a0	anv: Fix rebase error	2016-03-05 14:33:50 -08:00
Kristian Høgsberg Kristensen	34326f46df	anv: Turn pipeline cache on by default Move the environment variable check to cache creation time so we block both lookups and uploads if it's turned off.	2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen	f2b37132cb	anv: Check if shader is present before uploading to cache Between the initial check that returns NO_KERNEL and compiling the shader, other threads may have added the shader to the cache. Before uploading the kernel, check again (under the mutex) that the compiled shader still isn't present.	2016-03-05 13:54:24 -08:00
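The re-check under the mutex described in f2b37132cb can be sketched as follows. Everything here is hypothetical (`struct toy_cache`, `toy_search_locked`, `toy_upload`, the value chosen for `NO_KERNEL`, the fixed slot count); it shows only the pattern of searching again after taking the lock, not the anv pipeline cache itself.

```c
#include <pthread.h>
#include <stdint.h>

#define CACHE_SLOTS 8             /* assumed size, illustration only */
#define NO_KERNEL   UINT32_MAX    /* assumed sentinel value */

struct toy_cache {
    pthread_mutex_t mutex;
    uint32_t keys[CACHE_SLOTS];
    uint32_t kernels[CACHE_SLOTS];
    unsigned count;
};

/* Linear search; the caller must hold cache->mutex. */
static uint32_t toy_search_locked(const struct toy_cache *cache, uint32_t key)
{
    for (unsigned i = 0; i < cache->count; i++)
        if (cache->keys[i] == key)
            return cache->kernels[i];
    return NO_KERNEL;
}

/* Upload a freshly compiled kernel unless another thread added the same
 * key between the first lookup and now. Returns the kernel that ends up
 * in the cache, or NO_KERNEL if the key is absent and the cache is full. */
static uint32_t toy_upload(struct toy_cache *cache, uint32_t key,
                           uint32_t kernel)
{
    pthread_mutex_lock(&cache->mutex);
    uint32_t found = toy_search_locked(cache, key);
    if (found == NO_KERNEL && cache->count < CACHE_SLOTS) {
        cache->keys[cache->count] = key;
        cache->kernels[cache->count] = kernel;
        cache->count++;
        found = kernel;
    }
    pthread_mutex_unlock(&cache->mutex);
    return found;
}
```

A second upload with the same key returns the kernel stored first and leaves the entry count unchanged.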
Kristian Høgsberg Kristensen	30bbe28b7e	anv: Always use point size from the shader There is no API for setting the point size and the shader is always required to set it. Section 24.4: "If the value written to PointSize is less than or equal to zero, or if no value was written to PointSize, results are undefined." As such, we can just always program PointWidthSource to Vertex. This simplifies anv_pipeline a bit and avoids trouble when we enable the pipeline cache and don't have writes_point_size in the prog_data.	2016-03-05 13:54:24 -08:00
Kristian Høgsberg Kristensen	6139fe9a77	anv: Also cache the struct anv_pipeline_binding maps This is state that we generate when compiling the shaders and we need it for mapping resources from descriptor sets to binding table indices.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	584f39c65e	anv: Don't re-upload shaders when merging Using anv_pipeline_cache_upload_kernel() will re-upload the kernel and prog_data when we merge caches. Since the kernel and prog_data are already in the program_stream, use anv_pipeline_cache_add_entry() instead to only add the entry to the hash table.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	626559ed37	anv: Add anv_pipeline_cache_add_entry() This function will grow the cache to make room and then add the entry.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	07441c344c	anv: Rename anv_pipeline_cache_add_entry() to 'set' This function is a helper that unconditionally sets a hash table entry and expects the cache to have enough room. Calling it 'add_entry' suggests it will grow the cache as needed.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	87967a2c85	anv: Simplify pipeline cache control flow a bit No functional change, but the control flow around searching the cache and falling back to compiling is a bit simpler.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	2b29342fae	anv: Store prog data in pipeline cache stream We have to keep it there for the cache to work, so let's not have an extra copy in struct anv_pipeline too.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	37c5e70253	anv: Rename 'table' to 'hash_table' in anv_pipeline_cache A little less ambiguous.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	c028ffea70	anv: Serialize as much pipeline cache as we can We can serialize as much as the application asks for and just stop once we run out of memory. This lets applications use a fixed amount of space for caching and still get some benefit.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	cd812f086e	anv: Use 1.0 pipeline cache header The final version of the pipeline cache header adds a few more fields.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	26ed943eb9	anv: Fix shader key hashing This was copied from inline code to a helper and wasn't updated to hash a pointer instead.	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	3baf8af947	anv: Remove excess whitespace	2016-03-05 13:50:07 -08:00
Kristian Høgsberg Kristensen	ab36eae5e7	anv: Remove left-over bits of sparse-descriptor code	2016-03-05 13:50:07 -08:00
Jason Ekstrand	1afdfc3e6e	anv/pipeline: Implement the depth compare EQUAL workaround on gen8+	2016-03-05 09:59:28 -08:00
Jason Ekstrand	7c1660aa14	anv: Don't allow D16_UNORM to be combined with stencil Among other things, this can cause the depth or stencil test to spuriously fail when the fragment shader uses discard.	2016-03-05 09:59:28 -08:00
Jason Ekstrand	9a90176d48	anv/pipeline: Calculate the correct max_source_attr for 3DSTATE_SBE	2016-03-05 09:59:28 -08:00
Brian Paul	a4678311be	st/mesa: 78-column wrapping in st_extensions.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:21:05 -07:00
Brian Paul	9e6a6bd575	gallium/util: add new comments, assertions in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:34 -07:00
Brian Paul	b6a607b221	gallium/util: update comments and URL in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:28 -07:00
Brian Paul	cbca6964e2	gallium/util: make stream variable static in u_debug_refcnt.c Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:23 -07:00
Brian Paul	fb0abedce7	gallium/util: re-indent u_debug_refcnt.[ch] Wrap comments to 78 columns, etc. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-03-05 09:20:14 -07:00
Brian Paul	a7ba29f6d8	gallium/tests: silence warning in compute.c compute.c: In function ‘launch_grid’: compute.c:435:20: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default] info.input = input; ^ Maybe the pipe_grid_info::input field should be const void *? Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-03-05 09:15:44 -07:00
Timothy Arceri	31943e6ba5	glsl: replace remaining tabs in link_varyings.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-05 20:50:10 +11:00
Timothy Arceri	e2415e8467	glsl: replace remaining tabs in link_uniforms.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-05 20:50:05 +11:00
Jordan Justen	81f30e2f50	anv/hsw: Move query code to genX file for Haswell This fixes many CTS cases, but will require an update to the kernel command parser register whitelist. (The CS GPRs and TIMESTAMP registers need to be whitelisted.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-05 01:08:07 -08:00
Timothy Arceri	3322cb7b8d	docs: mark align layout qualifier as DONE Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:13 +11:00
Timothy Arceri	037f68d81e	glsl: apply align layout qualifier rules to block offsets From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the OpenGL 4.50 spec: "The align qualifier makes the start of each block member have a minimum byte alignment. It does not affect the internal layout within each member, which will still follow the std140 or std430 rules. The specified alignment must be a power of 2, or a compile-time error results. The actual alignment of a member will be the greater of the specified align alignment and the standard (e.g., std140) base alignment for the member's type. The actual offset of a member is computed as follows: If offset was declared, start with that offset, otherwise start with the next available offset. If the resulting offset is not a multiple of the actual alignment, increase it to the first offset that is a multiple of the actual alignment. This results in the actual offset the member will have. When align is applied to an array, it affects only the start of the array, not the array's internal stride. Both an offset and an align qualifier can be specified on a declaration. The align qualifier, when used on a block, has the same effect as qualifying each member with the same align value as declared on the block, and gets the same compile-time results and errors as if this had been done. As described in general earlier, an individual member can specify its own align, which overrides the block-level align, but just for that member." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:07 +11:00
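The offset computation quoted from the spec in 037f68d81e can be written out as a small worked example. This is a sketch of the quoted rule only, with hypothetical names (`is_pow2`, `member_offset`); it is not the GLSL compiler's implementation. The `start` argument stands for the declared offset if there is one, otherwise the next available offset, and a member without an align qualifier is represented by `align_qual == 1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The spec requires the specified alignment to be a power of 2. */
static bool is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Actual offset of a block member:
 *   actual alignment = max(specified align, base alignment of the type)
 *   actual offset    = start rounded up to a multiple of that alignment */
static uint32_t member_offset(uint32_t start, uint32_t align_qual,
                              uint32_t base_align)
{
    assert(is_pow2(align_qual));
    assert(base_align != 0);
    uint32_t actual = align_qual > base_align ? align_qual : base_align;
    return (start + actual - 1) / actual * actual;
}
```

For example, a member whose type has a base alignment of 16, qualified with `align = 32` and starting at byte 20, lands at offset 32.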
Timothy Arceri	5a27fefffe	glsl: parse align layout qualifier Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:39:01 +11:00
Timothy Arceri	22b0082b9d	docs: mark explicit byte offsets as DONE Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:55 +11:00
Timothy Arceri	802262c0af	glsl: use explicit offset when lowering buffer access Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:49 +11:00
Timothy Arceri	96527c3cf2	glsl: copy explicit offset to uniform storage Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:44 +11:00
Timothy Arceri	e12a49ac12	glsl: update comment on offset field The old comment was for the location not the offset, we now use the field for block members so mention that also. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:39 +11:00
Timothy Arceri	9f24f42c49	glsl: add offset to glsl interface type In this patch we also copy the offset value from the ast and implement offset linking rules by adding it to the record_compare() function. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "Two blocks linked together in the same program with the same block name must have the exact same set of members qualified with offset and their integral-constant-expression values must be the same, or a link-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:34 +11:00
Timothy Arceri	8abed7f185	glsl: apply compile-time rules for the offset layout qualifier This implements the rules for the offset qualifier on block members. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "The offset qualifier can only be used on block members of blocks declared with std140 or std430 layouts." ... "It is a compile-time error to specify an offset that is smaller than the offset of the previous member in the block or that lies within the previous member of the block." ... "The specified offset must be a multiple of the base alignment of the type of the block member it qualifies, or a compile-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:30 +11:00
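The compile-time rules quoted in 8abed7f185 reduce to two checks per member. The sketch below uses a hypothetical function name and parameters and covers only the two quoted offset rules (ordering and overlap with the previous member, and base-alignment divisibility); the std140/std430 restriction is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* prev_offset/prev_size describe the previous member of the block;
 * base_align is the base alignment of the member being qualified and is
 * assumed to be nonzero. */
static bool offset_is_valid(uint32_t offset, uint32_t prev_offset,
                            uint32_t prev_size, uint32_t base_align)
{
    /* An offset before the end of the previous member is either smaller
     * than the previous offset or lies within the previous member. */
    if (offset < prev_offset + prev_size)
        return false;
    /* The offset must be a multiple of the member's base alignment. */
    return offset % base_align == 0;
}
```

With a 16-byte previous member at offset 0, an offset of 8 is rejected as overlapping, 20 is rejected for a 16-byte base alignment, and 16 or 32 are accepted.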
Timothy Arceri	6f45484ac7	glsl: enable offset layout qualifier for ARB_enhanced_layouts Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:26 +11:00
Timothy Arceri	1824ff1c2a	glsl: reject invalid input layout qualifiers Global in validation is already handled, this will do the validation for variables, blocks and block members. This fixes some CTS tests for the new enhanced layouts transform feedback qualifiers. V2: add some more valid input flags Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:09 +11:00
Timothy Arceri	bd53cc7b45	glsl: only apply default stream to output blocks This is needed to allow invalid qualifier checks on inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:04 +11:00
Timothy Arceri	78d3098c05	glsl: rework parsing of blocks Previously interface blocks were given the global default flags of uniform blocks. This meant we could not check for invalid qualifiers on interface blocks because they always contained invalid flags. This changes parsing so that interface blocks now get an empty set of layouts. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:00 +11:00
Timothy Arceri	d244986bf2	glsl: don't apply uniform/buffer layouts to interface blocks In the following patch we will stop setting these layouts by default on interface blocks, so we need to do this to avoid hitting the assert. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:06:56 +11:00
Nanley Chery	4e75f9b219	anv: Implement VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS} v2: Subtract the baseMipLevel and baseArrayLayer (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 21:25:23 -08:00
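The v2 note in 4e75f9b219 (subtract the base level or layer) amounts to the following resolution step. The sketch avoids the Vulkan headers: `REMAINING` is a local stand-in with the same value (`~0U`) as `VK_REMAINING_MIP_LEVELS` and `VK_REMAINING_ARRAY_LAYERS`, and `resolve_count` is a hypothetical helper, not the anv function.

```c
#include <stdint.h>

/* Local stand-in for VK_REMAINING_MIP_LEVELS / VK_REMAINING_ARRAY_LAYERS. */
#define REMAINING (~0U)

/* Resolve a requested level or layer count against the image's total,
 * counting from the base level or layer when "remaining" is requested. */
static uint32_t resolve_count(uint32_t requested, uint32_t total,
                              uint32_t base)
{
    return requested == REMAINING ? total - base : requested;
}
```

For an image with 10 mip levels and a base level of 3, requesting the remaining levels resolves to 7.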
Kenneth Graunke	4ba7ad6cc1	i965: Only magnify depth for 3D textures, not array textures. When BaseLevel > 0, we magnify the dimensions to fill out the size of miplevels [0..BaseLevel). In particular, this was magnifying depth, thinking that the depth doubles at each level. This is perfectly reasonable for 3D textures, but dead wrong for array textures. Changing the depth != 1 condition to a target == GL_TEXTURE_3D check should make this only happen in the appropriate cases. Fixes about 32 dEQP tests: - dEQP-GLES31.functional.texture.gather.*.level_{1,2} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-03-04 21:25:08 -08:00
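The distinction made in 4ba7ad6cc1 can be shown with a simplified sketch: when deriving level-0 dimensions from the BaseLevel dimensions, depth is scaled only for 3D textures, because the layer count of an array texture is the same at every mip level. The types and the function `level0_extent` are hypothetical, and the sketch omits the special cases the driver handles (for example dimensions equal to 1).

```c
enum tex_kind { TEX_2D_ARRAY, TEX_3D };

struct extent {
    unsigned w, h, d;
};

/* Magnify the BaseLevel extent to the size level 0 would have. */
static struct extent level0_extent(struct extent base, unsigned base_level,
                                   enum tex_kind kind)
{
    struct extent e = { base.w << base_level, base.h << base_level, base.d };
    /* Only a 3D texture's depth doubles per level; for an array texture
     * the depth field holds the layer count and stays unchanged. */
    if (kind == TEX_3D)
        e.d <<= base_level;
    return e;
}
```

A 16x16 array texture with 6 layers at BaseLevel 2 magnifies to 64x64 with 6 layers, while a 16x16x6 3D texture magnifies to 64x64x24.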
Jason Ekstrand	c1436e80ef	anv/meta_clear: Set the right number of dynamic states	2016-03-04 19:18:20 -08:00
Juan A. Suarez Romero	2f76a9924e	i965/vec4: add opportunistic behaviour to opt_vector_float() opt_vector_float() transforms several scalar MOV operations to a single vectorial MOV. This is done when those MOVs cover all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed into: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Neither the vgrf3 nor the vgrf4 .z components are written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it into: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. total instructions in shared programs: 7124660 -> 7114784 (-0.14%) instructions in affected programs: 443078 -> 433202 (-2.23%) helped: 4998 HURT: 0 total cycles in shared programs: 64757760 -> 64728016 (-0.05%) cycles in affected programs: 1401686 -> 1371942 (-2.12%) helped: 3243 HURT: 38 v2: change vectorize_mov() signature (Matt). v3: take into account predicates (Juan). v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2016-03-04 19:16:52 -08:00
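The integer immediates in the 2f76a9924e example are IEEE-754 bit patterns (1065353216 is `0x3F800000`, i.e. 1.0f), and the partial case works by accumulating a writemask alongside the values. The sketch below illustrates just that bookkeeping with hypothetical names (`bits_to_float`, `struct vec_mov`, `accumulate_mov`); it ignores predicates and whether each value is representable as a VF immediate, both of which the real pass has to consider.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without aliasing problems. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* One combined vector MOV: which channels are written (bit 0 = x ...
 * bit 3 = w) and the float value for each channel. */
struct vec_mov {
    unsigned writemask;
    float imm[4];
};

/* Fold one scalar "mov dst.<mask>:D, <bits>D" into the combined MOV. */
static void accumulate_mov(struct vec_mov *v, unsigned mask, uint32_t bits)
{
    float f = bits_to_float(bits);
    for (unsigned c = 0; c < 4; c++)
        if (mask & (1u << c))
            v->imm[c] = f;
    v->writemask |= mask;
}
```

Folding the two vgrf3 MOVs from the example (`.xy` with 0 and `.w` with 1065353216) gives a writemask of `xyw` and the values [0, 0, 0, 1], matching the transformed instruction in the commit message.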
Jason Ekstrand	cc57efc67a	anv/pipeline: Fix depthBiasEnable on gen7 The first time I tried to fix this, I set the wrong fields.	2016-03-04 17:56:12 -08:00
Jason Ekstrand	653261285e	anv/cmd_buffer: Reset the state streams when resetting the command buffer	2016-03-04 17:54:29 -08:00
Jason Ekstrand	f700d16a89	anv/cmd_buffer: Include Haswell in set_subpass	2016-03-04 17:54:29 -08:00
George Kyriazis	feb71117ae	st/xlib: Don't destroy screen on XCloseDisplay() screen may still be used by other resources that are not yet freed. To correctly fix this there will be a need to account for resources differently, but this quick fix is not any worse than the original code that leaked screens anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 18:14:46 -07:00
Nanley Chery	a6fb62a864	isl: Fix RenderTargetViewExtent for mipmapped 3D surfaces Match the comment stated above the assignment. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:44 -08:00
Nanley Chery	b80c8ebc45	isl: Get rid of isl_surf_fill_state_info::level0_extent_px This field is no longer needed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:03 -08:00
Jason Ekstrand	d154a5ebd6	anv/cmd_buffer: Let the pipeline set StencilBufferWriteEnable on gen9	2016-03-04 12:23:01 -08:00
Jason Ekstrand	f374765ce6	anv/cmd_buffer: Mask stencil reference values	2016-03-04 12:22:32 -08:00
Jason Ekstrand	d61dcec64d	anv/clear: Pull the stencil write mask from the pipeline The stencil write mask wasn't getting set at all so we were using whatever write mask happened to be left over by the application.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	ec18fef88d	anv/pipeline: Set StencilBufferWriteEnable from the pipeline The hardware docs say that StencilBufferWriteEnable should only be set if StencilTestEnable is set. It seems reasonable to set them together.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fcd8e57185	anv/pipeline: More competent gen8 clipping	2016-03-04 12:03:00 -08:00
Jason Ekstrand	a8afd29653	anv/pipeline: Use the right provoking vertex for triangle fans	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fa8539dd6b	anv/pipeline: Respect pRasterizationState->depthBiasEnable	2016-03-04 12:03:00 -08:00
Matt Turner	1f862e923c	i965/fs: Optimize float conversions of byte/word extract. instructions in affected programs: 31535 -> 29966 (-4.98%) helped: 23 cycles in affected programs: 272648 -> 266022 (-2.43%) helped: 14 HURT: 1 The patch decreases the number of instructions in the two Unigine programs by: #1721: 4374 -> 4155 instructions (-5.01%) #1706: 3582 -> 3363 instructions (-6.11%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	905ff86198	nir: Recognize open-coded extract_u16. No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	76289fbfa8	nir: Recognize open-coded extract_u8. Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float: float((val >> 16u) & 0xffu) float((val >> 8u) & 0xffu) float((val >> 0u) & 0xffu) Instead of shifting, masking, and type converting like this: shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD mov(8) g17<1>F g16<8,8,1>UD shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD mov(8) g20<1>F g19<8,8,1>UD and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD mov(8) g22<1>F g21<8,8,1>UD i965 can simply extract a byte and convert to float in a single instruction: mov(8) g17<1>F g25.2<32,8,4>UB mov(8) g20<1>F g25.1<32,8,4>UB mov(8) g22<1>F g25.0<32,8,4>UB This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float. instructions in affected programs: 28568 -> 27450 (-3.91%) helped: 7 cycles in affected programs: 210076 -> 203144 (-3.30%) helped: 7 This patch decreases the number of instructions in the two Unigine programs by: #1721: 4520 -> 4374 instructions (-3.23%) #1706: 3752 -> 3582 instructions (-4.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
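The recognition step described in 76289fbfa8 can be sketched as a pattern check on the shift amount and mask. This is an illustration in plain C with hypothetical names (`match_extract_u8`, `extract_u8`); it is not NIR's algebraic rule syntax, and it only covers the `(val >> shift) & mask` shape shown in the commit message.

```c
#include <stdint.h>

/* If (val >> shift) & mask is a byte extraction, return the byte index
 * (0..3); otherwise return -1. */
static int match_extract_u8(unsigned shift, uint32_t mask)
{
    if (mask != 0xffu || shift % 8 != 0 || shift > 24)
        return -1;
    return (int)(shift / 8);
}

/* Reference semantics of extracting byte `byte` from a 32-bit value. */
static uint32_t extract_u8(uint32_t val, unsigned byte)
{
    return (val >> (8 * byte)) & 0xffu;
}
```

The three expressions from the commit message, with shifts of 16, 8 and 0, match as bytes 2, 1 and 0, which correspond to the `g25.2`, `g25.1` and `g25.0` byte regions in the single-instruction form.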
Kenneth Graunke	9d7faadd8a	anv: Fix backwards shadow comparisons sample_c is backwards from what GL and Vulkan expect. See intel_state.c in i965. v2: Drop unused vk_to_gen_compare_op. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-04 11:35:46 -08:00
George Kyriazis	01e92e7010	st/xlib: Hang off screen destructor off main XCloseDisplay() callback. This resolves some order dependencies between the already existing callback and the newly created one. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:24 -07:00
George Kyriazis	51e562c3ea	st/xlib: Support unlimited number of display connections There is a limit of 10 display connections, which was a problem for apps/tests that were continuously opening/closing display connections. This fix uses XAddExtension() and XESetCloseDisplay() to keep track of the status of the display connections from the X server, freeing mesa-related data as X displays get destroyed by the X server. Poster child is the VTK "TimingTests" Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:09 -07:00
Brian Paul	192ee9adb1	svga: add new command-buffer-size HUD query To plot a graph of the command buffer size. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	1258f907f4	svga: add new svga_winsys_context::get_command_buffer_size() To ask how large the current command buffer is. Will be used for a new GALLIUM_HUD graph. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	6fc8d90fa9	svga: reorder SVGA_QUERY_ switch cases to match declaration order Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Sinclair Yeh	f1410c5b91	svga: Force an RGBA view creation for an RGBA resource glXCreatePixmap() may specify a GLX_TEXTURE_FORMAT_RGB_EXT format for an RGBA resource, causing us to create an RGBX view for an RGBA resource, a combination vgpu10 does not support. When this is detected, change the request to create an RGBA view instead. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Charmaine Lee	8366701f4c	svga: fix an error in svga_texture_generate_mipmap With this patch, make sure the shader resource view is properly created before referencing it in the generate mipmap command. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Thomas Hellstrom	395c7b8fa1	winsys/svga: Increase the fence timeout If running with a software renderer backend, the timeout may be insufficient, and we don't want to release busy buffers too early. In practice, SVGA gpu lockups are extremely rare. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:55:23 +01:00
Thomas Hellstrom	24ad7e16cd	winsys/svga: Fix an uninitialized return value Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:54:38 +01:00
Kenneth Graunke	9ec246796f	i965: Set MaxFramebufferWidth/Height to 16384, not viewport. dEQP-GLES31.functional.fbo.no_attachments.maximums.{all,height,size,width} started hitting assertion failures when emitting SURFACE_STATE, after commit `e8fd60e789` where Samuel increased the maximum viewport size to 32768, from 16384. MaxFramebufferWidth/Height were being set to the maximum viewport size, but are actually limited by the SURFACE_STATE width/height field range, which is 16384 on Gen7+ (where ARB_framebuffer_no_attachments is exposed). So, reduce these to 16384 explicitly. Fixes assert fails in the above mentioned dEQP tests. (Those tests still fail, however.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-03 21:31:22 -08:00
Francisco Jerez	a6046d217d	glsl: Improve the accuracy of the acos() approximation. The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood. Fixes four dEQP subtests: dEQP-GLES31.functional.shaders.builtin_functions.precision.acos. highp_compute.{scalar,vec2,vec3,vec4} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	2795fbcae3	glsl: Parameterize asin_expr() on the fit coefficients. This will allow us to share the implementation while using different polynomials for asin() and acos(). Francisco Jerez did this in the SPIR-V front-end; I'm merely porting his idea to the GLSL world. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	aa37cbdff7	mesa: Allow Get() of several forgotten IsEnabled() pnames. From section 6.2 ("State Tables") of the GL 2.1 specification (the text also appears in the GL 3.0 and ES 3.1 specifications): "However, state variables for which IsEnabled is listed as the query command can also be obtained using GetBooleanv, GetIntegerv, GetFloatv, and GetDoublev." GL_DEBUG_OUTPUT, GL_DEBUG_OUTPUT_SYNCHRONOUS, and GL_FRAGMENT_SHADER_ATI were missing from the glGet() functions. All other IsEnabled() pnames look to be present, as far as I can tell. Fixes 8 dEQP-GLES31.functional.debug.state_query subtests: debug_output[_synchronous]_get{boolean,float,integer,integer64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	b4b50b074b	mesa: Make glGet queries initialize ctx->Debug when necessary. dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_* tries to call glGet on GL_DEBUG_GROUP_STACK_DEPTH right away, before doing any other debug setup. This should return 1. However, because ctx->Debug wasn't allocated, we bailed and returned 0. This patch removes the open-coded locking and switches the two glGet functions to use _mesa_lock_debug_state(), which takes care of allocating and initializing that state on the first time. It also conveniently takes care of unlocking on failure for us, so we don't need to handle that in every caller. Fixes dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_ {getboolean,getfloat,getinteger,getinteger64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	3ed260f54c	hack to make dota 2 menus work	2016-03-03 16:21:09 -08:00
Jason Ekstrand	56ba13c994	isl/surface_state: Set L2 bypass disable for certain BC* formats	2016-03-03 16:16:57 -08:00
Eduardo Lima Mitev	47392011c0	Update docs to advertise new support for ARB_internalformat_query2 Support in Mesa main and i965 has just been added. v2: Include note in 'New Features' of docs/relnotes/11.3.0.html. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-03 22:19:35 +01:00
Kenneth Graunke	623ce595a9	anv: Compile shader stages in pipeline order. Instead of the arbitrary order modules might be specified in. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:36:19 -08:00
Nanley Chery	8dddc3fb1e	anv/meta: Delete unused functions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:44 -08:00
Nanley Chery	d20f6abc85	anv/meta: Use blitter API for state-handling in Buffer Update/Copy Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:42 -08:00
Nanley Chery	318b67d157	anv/meta: Use blitter API in do_buffer_copy() v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:36 -08:00
Nanley Chery	96ff4d0679	anv/meta: Use blitter API in anv_CmdCopyImage() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:35 -08:00
Nanley Chery	9b6c95d46e	anv/meta: Use blitter API for copies between Images and Buffers Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:20 -08:00
Nanley Chery	91640c34c6	anv/meta: Add function which copies between Buffers and Images v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:15 -08:00
Nanley Chery	61ad78d0d1	anv/meta: Add function to create anv_meta_blit2d_surf from anv_image v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:10 -08:00
Nanley Chery	2e9b08b9b8	anv/meta: Implement the blitter API functions Most of the code in anv_meta_blit2d() is borrowed from do_buffer_copy(). Create an image and image view for each rectangle. Note: For tiled RGB images, ISL will align the image's row_pitch up to the nearest tile width. v2 (Jason): Keep pitch in units of bytes Make src_format and dst_format variables s/dest/dst/ in every usage v3: Fix dst_image width Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:25:04 -08:00
Nanley Chery	032bf172b4	anv/meta: Modify blitter API fields Some fields are unnecessary. The variables "pitch" and "bs" are used for consistency with ISL. v2: Keep pitch in units of bytes (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:53 -08:00
Jason Ekstrand	654f79a045	anv/meta: Add the beginnings of a blitter API This API is designed to be an abstraction that sits between the VkCmdCopy commands and the hardware. The idea is that it is simple enough that it should be implementable using the blitter but with enough extra data that we can implement it with the 3-D pipeline efficiently. One design objective is to allow the user to supply enough information that we can handle most blit operations with a single draw call even if they require copying multiple rectangles.	2016-03-03 11:24:45 -08:00
Nanley Chery	d1e48b9945	anv/meta: Remove redundancies in do_buffer_copy() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:42 -08:00
Nanley Chery	cfe7036750	anv/meta: Replace copy_format w/ block size in do_buffer_copy() This is a preparatory commit that will simplify the future usage of this function. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:38 -08:00
Nanley Chery	d50ff250ec	anv/meta: Add missing command to exit meta in anv_CmdUpdateBuffer() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:21 -08:00
Nanley Chery	1d9d90d9a6	anv/image: Create a linear image when requested If a linear image is requested, the only possible result should be a linearly-tiled surface. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:24:17 -08:00
Nanley Chery	091f1da902	isl: Don't filter tiling flags if a specific tiling bit is set If a specific bit is set, the intention to create a surface with a specific tiling format should be respected. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:23:40 -08:00
Nanley Chery	456f5b0314	isl: Add function to get intratile offsets from x/y offsets Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 10:56:15 -08:00
Jason Ekstrand	206414f92e	anv/util: Fix vector resizing It wasn't properly handling the fact that wrap-around in the source may not translate to wrap-around in the destination. This really needs unit tests.	2016-03-03 08:17:36 -08:00
Antia Puentes	4f028bfcc0	i965: Enable the ARB_internalformat_query2 extension Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev	cbbdf8612d	i965/formatquery: Add support for INTERNALFORMAT_PREFERRED query This pname is tricky. The spec states that an internal format should be returned that is compatible with the passed internal format and has at least the same precision. There is no clear API to resolve this. The closest we have (and what other drivers, i.e. NVIDIA proprietary, do) is to return the same internal format given as parameter. But we validate first that the passed internal format is supported by i965. To check for support, we have the 'TextureFormatSupported' map. But this map expects a 'mesa_format', which takes a format+type. So, we must first "come up" with a generic type that is suited for this internal format, then get a mesa_format, and then do the validation. The cleanest solution here is to add a method that does exactly what the spec wants: a driver's preferred internal format from a given internal format. But at this point we lack a clear view of what defines this preference, and also there seems to be no API for it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:08 +01:00
Eduardo Lima Mitev	e064f43485	mesa/glformats: Consider DEPTH/STENCIL when resolving a mesa_format _mesa_format_from_format_and_type() is currently not considering DEPTH and STENCIL formats, which are not array formats and are not handled anywhere. This patch adds cases for common combinations of DEPTH/STENCIL format and types. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	ec299602a6	mesa/formatquery: Add (GET_)TEXTURE_IMAGE_TYPE pnames These basically reuse the default implementation of GL_READ_PIXELS_TYPE. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	23f94146c9	mesa/formatquery: Add (GET_)TEXTURE_IMAGE_FORMAT pnames Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	020671f2a3	mesa/formatquery: Add READ_PIXELS_TYPE pname We call the driver to provide its preferred type, but also provide a default implementation that selects a generic type based on the passed internal format. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	bec286f724	mesa/formatquery: Add READ_PIXELS_FORMAT pname Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	09550c16a5	mesa/formatquery: Add support for READ_PIXELS query This has been supported since very early versions of OpenGL, but we still call the driver to give it the opportunity to report caveat support or no support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Alejandro Piñeiro	8d7696f638	mesa/formatquery: added FILTER pname support It rules out the targets and internalformats that, per the spec, do not support filter types other than NEAREST or NEAREST_MIPMAP_NEAREST. Those are: * Texture buffer targets * Multisample targets * Any integer internalformat For multisample targets, the existing method _mesa_target_allows_setting_sampler_parameter is used. This should scale better in the future if new targets appear that don't allow setting sampler parameters. We consider RENDERBUFFER to support LINEAR filters, because although it doesn't support this filter for sampling, you can set LINEAR on a blit operation using glBlitFramebuffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Alejandro Piñeiro	a8736a2567	mesa/texparam: make public target_allows_setting_sampler_parameters So that it can be used by the ARB_internalformat_query2 implementation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	e8ab7727e1	mesa/formatquery: Added framebuffer renderability related queries From the ARB_internalformat_query2 specification: "- FRAMEBUFFER_RENDERABLE: The support for rendering to the resource via framebuffer attachment is returned in <params>. - FRAMEBUFFER_RENDERABLE_LAYERED: The support for layered rendering to the resource via framebuffer attachment is returned in <params>. - FRAMEBUFFER_BLEND: The support for rendering to the resource via framebuffer attachment when blending is enabled is returned in <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is unsupported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	b4ee9f56fd	mesa/formatquery: Added texture gather/shadow related queries From the ARB_internalformat_query2 specification: "- TEXTURE_SHADOW: The support for using the resource with shadow samplers is written to <params>. - TEXTURE_GATHER: The support for using the resource with texture gather operations is written to <params>. - TEXTURE_GATHER_SHADOW: The support for using resource with texture gather operations with shadow samplers is written to <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	557939c08f	mesa/formatquery: Added texture view related queries From the ARB_internalformat_query2 specification: "- TEXTURE_VIEW: The support for using the resource with the TextureView command is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned. - VIEW_COMPATIBILITY_CLASS: The compatibility class of the resource when used as a texture view is returned in <params>. The compatibility class is one of the values from the /Class/ column of Table 3.X.2. If the resource has no other formats that are compatible, the resource does not support views, or if texture views are not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	04e2e0b24a	mesa/textureview: Make _lookup_view_class public It will be used by the ARB_internalformat_query2 implementation to implement the VIEW_COMPATIBILITY_CLASS <pname> query. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	2066c7be61	mesa/formatquery: Added CLEAR_BUFFER <pname> query From the ARB_internalformat_query2 specification: "- CLEAR_BUFFER: The support for using the resource with ClearBuffer*Data commands is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	aed633bb97	mesa/formatquery: Added compressed texture related queries From the ARB_internalformat_query2 specification: "- TEXTURE_COMPRESSED: If <internalformat> is a compressed format that is supported for this type of resource, TRUE is returned in <params>. If the internal format is not compressed, or the type of resource is not supported, FALSE is returned. - TEXTURE_COMPRESSED_BLOCK_WIDTH: If the resource contains a compressed format, the width of a compressed block (in bytes) is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. - TEXTURE_COMPRESSED_BLOCK_HEIGHT: If the resource contains a compressed format, the height of a compressed block (in bytes) is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. - TEXTURE_COMPRESSED_BLOCK_SIZE: If the resource contains a compressed format the number of bytes per block is returned in <params>. If the internal format is not compressed, or the resource is not supported, 0 is returned. (combined with the above, allows the bitrate to be computed, and may be useful in conjunction with ARB_compressed_texture_pixel_storage)." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	467f462c75	mesa/formatquery: Added simultaneous texture and depth/stencil queries From the ARB_internalformat_query2 specification: "- SIMULTANEOUS_TEXTURE_AND_DEPTH_TEST: The support for using the resource both as a source for texture sampling while it is bound as a buffer for depth test is written to <params>. For example, a depth (or stencil) texture could be bound simultaneously for texturing while it is bound as a depth (and/or stencil) buffer without causing a feedback loop, provided that depth writes are disabled. - SIMULTANEOUS_TEXTURE_AND_STENCIL_TEST: The support for using the resource both as a source for texture sampling while it is bound as a buffer for stencil test is written to <params>. For example, a depth (or stencil) texture could be bound simultaneously for texturing while it is bound as a depth (and/or stencil) buffer without causing a feedback loop, provided that stencil writes are disabled. - SIMULTANEOUS_TEXTURE_AND_DEPTH_WRITE: The support for using the resource both as a source for texture sampling while performing depth writes to the resources is written to <params>. For example, a depth-stencil texture could be bound simultaneously for stencil texturing while it is bound as a depth buffer. Feedback loops cannot occur because sampling a stencil texture only returns the stencil portion, and thus writes to the depth buffer do not modify the stencil portions. - SIMULTANEOUS_TEXTURE_AND_STENCIL_WRITE: The support for using the resource both as a source for texture sampling while performing stencil writes to the resources is written to <params>. For example, a depth-stencil texture could be bound simultaneously for depth-texturing while it is bound as a stencil buffer. Feedback loops cannot occur because sampling a depth texture only returns the depth portion, and thus writes to the stencil buffer could not modify the depth portions." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	bd45fb3de4	mesa/formatquery: Added queries related to image textures From the ARB_internalformat_query2 specification: "- IMAGE_TEXEL_SIZE: The size of a texel when the resource when used as an image texture is returned in <params>. This is the value from the /Size/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, zero is returned. - IMAGE_COMPATIBILITY_CLASS: The compatibility class of the resource when used as an image texture is returned in <params>. This corresponds to the value from the /Class/ column in Table 3.22. The possible values returned are IMAGE_CLASS_4_X_32, IMAGE_CLASS_2_X_32, IMAGE_CLASS_1_X_32, IMAGE_CLASS_4_X_16, IMAGE_CLASS_2_X_16, IMAGE_CLASS_1_X_16, IMAGE_CLASS_4_X_8, IMAGE_CLASS_2_X_8, IMAGE_CLASS_1_X_8, IMAGE_CLASS_11_11_10, and IMAGE_CLASS_10_10_10_2, which correspond to the 4x32, 2x32, 1x32, 4x16, 2x16, 1x16, 4x8, 2x8, 1x8, the class (a) 11/11/10 packed floating-point format, and the class (b) 10/10/10/2 packed formats, respectively. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned. - IMAGE_PIXEL_FORMAT: The pixel format of the resource when used as an image texture is returned in <params>. This is the value from the /Pixel format/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned. - IMAGE_PIXEL_TYPE: The pixel type of the resource when used as an image texture is returned in <params>. This is the value from the /Pixel type/ column in Table 3.22. If the resource is not supported for image textures, or if image textures are not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	990a7200e0	mesa/shaderimage: Added func to get the GL_IMAGE_CLASS from the format It will be used by the ARB_internalformat_query2 implementation to implement the IMAGE_COMPATIBILITY_CLASS <pname> query. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	52c3692324	mesa/formatquery: Added SHADER_IMAGE_{LOAD,STORE,ATOMIC} <pname> queries From the ARB_internalformat_query2 specification: "- SHADER_IMAGE_LOAD: The support for using the resource with image load operations in shaders is written to <params>. In this case the <internalformat> is the value of the <format> parameter that would be passed to BindImageTexture. - SHADER_IMAGE_STORE: The support for using the resource with image store operations in shaders is written to <params>. In this case the <internalformat> is the value of the <format> parameter that is passed to BindImageTexture. - SHADER_IMAGE_ATOMIC: The support for using the resource with atomic memory operations from shaders is written to <params>." For all of them: "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	876f7a7c08	mesa/shaderimage: Make is_image_format_supported public It will be used by the ARB_internalformat_query2 implementation to implement queries related to the ARB_shader_image_load_store extension. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	fae2b10ff9	mesa/formatquery: Added queries related to texture sampling in shaders From the ARB_internalformat_query2 specification: "- VERTEX_TEXTURE: The support for using the resource as a source for texture sampling in a vertex shader is written to <params>. - TESS_CONTROL_TEXTURE: The support for using the resource as a source for texture sampling in a tessellation control shader is written to <params>. - TESS_EVALUATION_TEXTURE: The support for using the resource as a source for texture sampling in a tessellation evaluation shader is written to <params>. - GEOMETRY_TEXTURE: The support for using the resource as a source for texture sampling in a geometry shader is written to <params>. - FRAGMENT_TEXTURE: The support for using the resource as a source for texture sampling in a fragment shader is written to <params>. - COMPUTE_TEXTURE: The support for using the resource as a source for texture sampling in a compute shader is written to <params>." For all of them, "Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	aeb759c7d6	mesa/formatquery: Added SRGB_DECODE_ARB <pname> query From the ARB_internalformat_query2 specification: "- SRGB_DECODE_ARB: The support for toggling whether sRGB decode happens at sampling time (see EXT/ARB_texture_sRGB_decode) for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	bcb2f9cdb9	mesa/formatquery: Added SRGB_{READ,WRITE} <pname> queries From the ARB_internalformat_query2 specification: "- SRGB_READ: The support for converting from sRGB colorspace on read operations (see section 3.9.18) from the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned. - SRGB_WRITE: The support for converting to sRGB colorspace on write operations to the resource is returned in <params>. This indicates that writing to framebuffers with this internalformat will encode to sRGB color spaces when FRAMEBUFFER_SRGB is enabled (see section 4.1.8). Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource or operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	e88cbb7a51	mesa/formatquery: Added COLOR_ENCODING <pname> query. From the ARB_internalformat_query2 specification: "- COLOR_ENCODING: The color encoding for the resource is returned in <params>. Possible values for color buffers are LINEAR or SRGB, for linear or sRGB-encoded color components, respectively. For non-color formats (such as depth or stencil), or for unsupported resources, the value NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Eduardo Lima Mitev	b1755535ec	mesa/glformats: Add a helper function _mesa_is_srgb_format() Returns true if the passed format is an sRGB format, false otherwise. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	87b2de3998	mesa/formatquery: Added mipmap related <pname> queries Specifically MIPMAP, MANUAL_GENERATE_MIPMAP and AUTO_GENERATE_MIPMAP <pname> queries. From the ARB_internalformat_query2 specification: "- MIPMAP: If the resource supports mipmaps, TRUE is returned in <params>. If the resource is not supported, or if mipmaps are not supported for this type of resource, FALSE is returned. - MANUAL_GENERATE_MIPMAP: The support for manually generating mipmaps for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is not supported, or if the operation is not supported, NONE is returned. - AUTO_GENERATE_MIPMAP: The support for automatic generation of mipmaps for the resource is returned in <params>. Possible values returned are FULL_SUPPORT, CAVEAT_SUPPORT, or NONE. If the resource is not supported, or if the operation is not supported, NONE is returned." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:07 +01:00
Antia Puentes	079d99b830	mesa/genmipmap: Added a function to validate the internalformat It will be used by the ARB_internalformat_query2 implementation to implement mipmap related queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	06852f4b7a	mesa/genmipmap: Added a function to check if the target is valid It will be used by the ARB_internalformat_query2 implementation to implement mipmap related queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	df3a37311d	mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_RENDERABLE <pname> queries Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	c22ceb08bb	mesa/formatquery: Added {COLOR,DEPTH,STENCIL}_COMPONENTS <pname> queries Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	e976a30db8	mesa/formatquery: support for MAX_COMBINED_DIMENSIONS It is implemented by combining the values returned by calls to the 32-bit query _mesa_GetInternalformati32v. The main reason is simplicity. The other option would be to copy and paste how we implemented the support of GL_MAX_{WIDTH/HEIGHT/DEPTH} and GL_SAMPLES. Additionally, doing it this way, we avoid adding checks to the code, as they are already done by the call to the query itself. MAX_COMBINED_DIMENSIONS is the only pname that the spec points out as needing a 64-bit query. We handle that possibility by packing the returned value into the first two 32-bit integers of params. This works on the 32-bit query as long as the value is not greater than INT_MAX. On the 64-bit query wrapper we unpack those values in order to get the final value. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Eduardo Lima Mitev	c5cf16a4fc	mesa/teximage: add _mesa_is_cube_map_texture utility method Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	4e33278b39	main/formatquery: support for MAX_{WIDTH/HEIGHT/DEPTH/LAYERS} Implemented by calling GetIntegerv with the equivalent pname and handling individually the exceptions related to dimensions. All those pnames are used to get the maximum value for each dimension of the given target. The only difference between these calls and calling GetInteger with pnames like GL_MAX_TEXTURE_SIZE, GL_MAX_3D_TEXTURE_SIZE, etc. is that GetInternalformat allows specifying an internalformat. But at this moment, there is no reason to think that the values would be different based on the internalformat. The spec already takes that into account, using these specific pnames as an example in Issue 7 of the arb_internalformat_query2 spec. So this seems like a hook to allow returning different values based on the internalformat in the future. It is worth noting that the piglit test associated with those pnames checks the returned values of GetInternalformat against the values returned by GetInteger, and the test passes with NVIDIA proprietary drivers. v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	b750144b0a	mesa/formatquery: support for IMAGE_FORMAT_COMPATIBILITY_TYPE From arb_internalformat_query2 spec: "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the resource when used as an image textures is returned in <params>. This is equivalent to calling GetTexParameter with <value> set to IMAGE_FORMAT_COMPATIBILITY_TYPE." Current implementation of GetTexParameter for this case returns a field of a texture object, so the support of this pname was implemented creating a temporal texture object and returning that value. It is worth to mention that right now that field is not reassigned after initialization. So it is somehow hardcoded. An alternative option would be return that value. That doesn't seems really scalable though. v2: use _mesa_has## instead of direct ctx->Extensions access (Nanley Chery) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	e98a3c799f	mesa/formatquery: handle unmodified buffer for SAMPLES on the 64-bit query From arb_internalformat_query2 spec: " If <internalformat> is not color-renderable, depth-renderable, or stencil-renderable (as defined in section 4.4.4), or if <target> does not support multiple samples (ie other than TEXTURE_2D_MULTISAMPLE, TEXTURE_2D_MULTISAMPLE_ARRAY, or RENDERBUFFER), <params> is not modified." So there are cases where the buffer should not be modified. As the 64-bit query is a wrapper over the 32-bit query, we can't just copy the values to the equivalent 32-bit buffer, as that would fail if the original params contained values greater than INT_MAX. So we need to copy-back only the values that got modified by the 32-bit query. We do that by filling the temporal buffer by negatives, as the 32-bit query should not return negative values ever. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	580816b747	mesa/formatquery: initial implementation for GetInternalformati64v It just does a wrapping on the existing 32-bit GetInternalformativ. We will maintain the 32-bit query as default as it is likely that it would be the one most used. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	7241e1b5f4	mesa/formatquery: Added INTERNALFORMAT_{X}_{SIZE,TYPE} <pname> queries From the ARB_internalformat_query2 spec: "- INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_SHARED_SIZE For uncompressed internal formats, queries of these values return the actual resolutions that would be used for storing image array components for the resource. For compressed internal formats, the resolutions returned specify the component resolution of an uncompressed internal format that produces an image of roughly the same quality as the compressed algorithm. For textures this query will return the same information as querying GetTexLevelParameter{if}v for TEXTURE__SIZE would return. If the internal format is unsupported, or if a particular component is not present in the format, 0 is written to <params>. - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE For uncompressed internal formats, queries for these values return the data type used to store the component. For compressed internal formats the types returned specify how components are interpreted after decompression. For textures this query returns the same information as querying GetTexLevelParameter{if}v for TEXTURE_TYPE would return. Possible values return include, NONE, SIGNED_NORMALIZED, UNSIGNED_NORMALIZED, FLOAT, INT, UNSIGNED_INT, representing missing, signed normalized fixed point, unsigned normalized fixed point, floating-point, signed unnormalized integer and unsigned unnormalized integer components. NONE is returned for all component types if the format is unsupported." Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	675418182b	mesa/main: Extend _mesa_get_format_bits to accept new pnames The new pnames accepted by the function are: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE It will be used by the ARB_internalformat_query2 implementation to implement those pnames. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	4a8dae6247	mesa/main: Extend _mesa_base_format_has_channel to accept new pnames The new pnames accepted by the function are: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE It will be used by the ARB_internalformat_query2 implementation to implement those pnames. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	f1c789fa00	mesa/main: Make legal_get_tex_level_parameter_target public It will be used by the ARB_internalformat_query2 implementation to check if the target is valid for those <pnames> that are said in the spec that should return the same values than the 'glGetTexLevelParameter{if}v' function: - INTERNALFORMAT_RED_SIZE - INTERNALFORMAT_GREEN_SIZE - INTERNALFORMAT_BLUE_SIZE - INTERNALFORMAT_ALPHA_SIZE - INTERNALFORMAT_DEPTH_SIZE - INTERNALFORMAT_STENCIL_SIZE - INTERNALFORMAT_SHARED_SIZE - INTERNALFORMAT_RED_TYPE - INTERNALFORMAT_GREEN_TYPE - INTERNALFORMAT_BLUE_TYPE - INTERNALFORMAT_ALPHA_TYPE - INTERNALFORMAT_DEPTH_TYPE - INTERNALFORMAT_STENCIL_TYPE - IMAGE_FORMAT_COMPATIBILITY_TYPE Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Eduardo Lima Mitev	eacb2c971e	mesa/formatquery: Added INTERNALFORMAT_PREFERRED pname Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	56ec2dfcb1	mesa/formatquery: Added the INTERNALFORMAT_SUPPORTED <pname> query Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	4722abc630	mesa/formatquery: Added a func to check <internalformat> supported From the ARB_internalformat_query2 specification: "The INTERNALFORMAT_SUPPORTED <pname> can be used to determine if the internal format is supported, and the other <pnames> are defined in terms of whether or not the format is supported." v2: Consider also FBO base formats when checking if the internalformat is supported. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	5f6e3a0370	mesa/formatquery: Added func to check if the 'resource' is supported Checks that the 'resource', as defined by the ARB_internalformat_query2 specification, is supported by the implementation for those 'pnames' that require this check. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	95392cfa9d	mesa/main: not fill mesa_error on _mesa_legal_texture_base_format_for_target This would allow to use this method if you are just querying if it is allowed, like for arb_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	aaf5ad513b	mesa/teximage: Make _mesa_format_no_online_compression public It will be used by the ARB_internalformat_query2 implementation to check if a certain compressed 'internalformat' is supported by texture 'targets'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	5eef355823	mesa/teximage: make public is_renderable_texture_format It will be used by the ARB_internalformat_query2 implementation to check if the 'internalformat' passed is supported by texture MULTISAMPLE 'targets'. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	b5d27bc5dd	mesa/main: Added empty skeleton of glGetInternalformati64v Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Alejandro Piñeiro	2453bba504	mesa: Add dispatch and extension XML for GL_ARB_internalformat_query2 Equivalent to commit bda540 (that added GL_ARB_internalformat_query) v2: include the new xml to to API_XML list at Makefile.am (Emil Velikov) Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:06 +01:00
Antia Puentes	d432337e2d	mesa/formatquery: Added boilerplate code to extend GetInternalformativ The goal is to extend the GetInternalformativ query to implement the ARB_internalformat_query2 specification, keeping the behaviour defined by the ARB_internalformat_query if ARB_internalformat_query2 is not supported. v2: Don't require ARB_internalformat_query when profile is GLES3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	806bc2bf22	mesa/formatquery: Added a func to check if the <target> is supported From the ARB_internalformat_query2 spec: "If the particular <target> and <internalformat> combination do not make sense, or if a particular type of <target> is not supported by the implementation the "unsupported" answer should be given. This is not an error." This function checks if the <target> is supported by the implementation. v2: Allow RENDERBUFFER targets also on GLES 3 profiles. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	4af3e5e9f1	mesa/formatquery: Added function to set 'unsupported' responses The ARB_internalformat_query2 specification defines which is the response best representing "not supported" or "not applicable" for each <pname>. Queries for unsupported features, targets, internalformats, combinations of: target and internalformat, target and pname, pname and internalformat, do not return an error but the corresponding 'unsupported' response. We will use that response as the default answer. For SAMPLES the 'unsupported' response is to not modify the 'params' buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	a6434f41cc	mesa/formatquery: Added function to validate parameters Handles the cases where an error should be returned according to the ARB_internalformat_query and ARB_internalformat_query2 specifications. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Antia Puentes	b89463cdfd	mesa/main: Add extension tracking bit for ARB_internalformat_query2 Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	a347a0f53f	mesa: Completely remove QuerySamplesForFormat from driver func table At this point, all uses have been replaced by the more general hook QueryInternalFormat, introduced by ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	993d7345b7	mesa/formatquery: Use new driver hook QueryInternalFormat Implements SAMPLES and NUM_SAMPLE_COUNTS queries using the new generic driver call QueryInternalFormat, which is being introduced as replacement of QuerySamplesForFormat to support ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	25ee5c60dc	mesa/formatquery: Remove tracking of number of elements in the response Currently, the number of integers returned in the response to GetInternalFormativ is being tracked by a 'count' variable. This is so only the modified elements from the temporary buffer are copied into the original user buffer. However, with the introduction of ARB_internalformat_query2, keeping track of 'count' would complicate the code a lot, considering the high number of queries. So, we propose to forget about tracking count, and move all the 16 elements in the temporary buffer, back to the user buffer (clamped to user buffer size of course). This is basically a trade-off between performance and code clarity. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	1f0b2ce8ec	mesa/multisample: Check sample count using the new driver hook Use QueryInternalFormat instead of QuerySamplesForFormat to obtain the highest supported sample. QuerySamplesForFormat is to be removed. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	ee31b0b1d0	st/format: Replace QuerySamplesForFormat by new QueryInternalFormat hook The previous code for SAMPLES and NUM_SAMPLE_COUNTS is reused as a private function. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	82be7735f3	i965/formatquery: Respond queries SAMPLES and NUM_SAMPLE_COUNTS This effectively disables old QuerySamplesForFormat driver hook, since it is never called by Mesa anymore. v2: Call brw_query_samples_for_format() with a dummy buffer to calculate num samples, to avoid modifying the original buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	2dabff9068	i965: Move brw_query_samples_for_format() to brw_queryformat.c Now that there is a dedicated source file for internal format queries, this function belongs there. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	28144c4476	i965: Add boilerplate function for QueryInternalFormat driver hook By default, we call back the driver's hook fallback function that has generic implementations for the all the queries. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	45054f9702	mesa: Add a default QueryInternalFormat() function for drivers This is a fallback function for drivers not implementing ARB_internalformat_query2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Eduardo Lima Mitev	93d30c3de9	mesa: Add QueryInternalFormat to device driver virtual table This new function queries different driver parameters for a particular target and texture format. It is basically a driver hook to support ARB_internalformat_query2. Since ARB_internalformat_query2 introduced several new query parameters over ARB_internalformat_query, having one driver hook for each parameter is no longer feasible. So this is the generic entry-point for calls to glGetInternalFormativ and glGetInternalFormati64v. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-03 15:14:05 +01:00
Iago Toral Quiroga	283c8372cb	glsl/opt_array_splitting: Fix indentation Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-03 09:12:41 +01:00
Iago Toral Quiroga	4a60002424	glsl/opt_array_splitting: Fix crash when doing array indexing into other arrays When we find indirect indexing into an array, the current implementation of the array splitting optimization pass does not look further into the expression tree. However, if the variable expression involves variable indexing into other arrays, we can miss that these other arrays also have variable indexing. If that happens, the pass will crash later on after hitting an assertion put there to ensure that split arrays are in fact always indexed via constants: shader_runner: opt_array_splitting.cpp:296: void ir_array_splitting_visitor::split_deref(ir_dereference**): Assertion `constant' failed. This patch fixes the problem by letting the pass step into the variable index expression to identify these cases properly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89607 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-03-03 09:02:30 +01:00
Oded Gabbay	914d4967d7	radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING. This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values. This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-03 09:20:08 +02:00
Oded Gabbay	ef5183faea	r600g: Do colorformat endian swap for PIPE_USAGE_STAGING There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING. This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values. This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian. v2: removed duplicate call to r600_colorformat_endian_swap() inside evergreen_init_color_surface_rat() Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-03-03 09:20:08 +02:00
Tim Rowley	7bb193d28c	mesa/build: add OpenSWR to build Tested on Linux (centos, ubuntu, and suse variants) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:42 -06:00
Tim Rowley	d003be2a30	gallium/docs - add OpenSWR documentation Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	da4f95d168	gallium/target-helpers: add OpenSWR driver Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	ea37602273	gallium/auxilary: more __cplusplus exports swr driver which is written in C++ needs access to some more gallium utility functions than are currently exposed. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	c6e67f5a93	gallium/swr: add OpenSWR rasterizer Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Tim Rowley	2b2d3680bf	gallium/swr: add OpenSWR driver OpenSWR is a new software rasterizer for x86 processors designed for high performance and high scalability on visualization workloads. Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-03-02 18:38:41 -06:00
Timothy Arceri	2eec41f6f1	glsl: replace remaining tabs in ir_builder.cpp Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2016-03-03 11:25:57 +11:00
Anuj Phogat	7026f27e33	mesa: Update comment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	6ccead5b48	mesa: Fix function description Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	de61849994	meta: Remove the 'allocate_storage' parameter in _mesa_meta_pbo_GetTexSubImage() Texture is already allocated before calling this meta function. So, the value of 'allocate_storage' passed to the function is always false. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-02 15:06:46 -08:00
Anuj Phogat	6d4ebbe9e5	meta: Fix the pbo usage in meta for GLES{1,2} contexts OpenGL ES 1.0 doesn't support using GL_STREAM_DRAW and both ES 1.0 and 2.0 don't support GL_STREAM_READ in glBufferData(). So, handle it correctly by calling the _mesa_meta_begin() before create_texture_for_pbo(). V2: Remove the changes related to allocate_storage. (Ian) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-03-02 15:06:45 -08:00
Matt Turner	0d047d10f1	program: Clean up after condition code removal.	2016-03-02 12:15:58 -08:00
Matt Turner	961ead6746	program: Remove variable used only in assert().	2016-03-02 12:15:58 -08:00
Matt Turner	de2ef0401b	program: Drop GL_FRAGMENT_PROGRAM_NV from switch statement.	2016-03-02 12:15:58 -08:00
Jordan Justen	98cdce1ce4	anv/gen7: Use predicated rendering for indirect compute For OpenGL, see commit `9a939ebb47`. Fixes: * dEQP-VK.compute.indirect_dispatch.upload_buffer.empty_command * dEQP-VK.compute.indirect_dispatch.gen_in_compute.empty_command Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-02 12:03:05 -08:00
Jordan Justen	da4745104c	anv: Save batch to local variable for indirect compute Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-03-02 12:03:05 -08:00
Jason Ekstrand	b0867ca4b2	anv: Fix make check	2016-03-02 11:45:29 -08:00
Samuel Pitoiset	b94a46aa8e	gk110/ir: fix wrong emission of NOT modifier for VOTE Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-02 20:36:18 +01:00
Jason Ekstrand	2168082a48	isl: Fix make check	2016-03-02 11:31:22 -08:00
Jason Ekstrand	8f5a64e44f	gen8/cmd_buffer: Properly return flushed push constant stages This is required on SKL so that we can properly re-emit binding table pointers commands.	2016-03-02 10:48:40 -08:00
Thomas Hindoe Paaboel Andersen	535002f4da	gallium/cso: fix indentation Only one of these were recently introduced. However, since we keep copy/pasting the same wrong indentation we should probably just fix it. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-02 08:55:20 -07:00
Thomas Hindoe Paaboel Andersen	37cfc51b13	st/mesa: move dereference after null check We should not dereference shader before we have done the null check. Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-02 08:55:20 -07:00
Matt Turner	ad17511302	i965/gen6/gs: Replace V-immediate with VF-immediate. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-02 07:28:52 -08:00
Marek Olšák	43f74ac67c	gallium: fix PIPE_BIND_QUERY_BUFFER - PIPE_BIND_SCANOUT overlap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-03-02 15:32:52 +01:00
Samuel Iglesias Gonsálvez	e8fd60e789	i965: set ctx->Const.MaxViewport{Width,Height} to 32k From ARB_viewport_array spec: " * On GL3-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-16384, 16383]. * On GL4-capable hardware the VIEWPORT_BOUNDS_RANGE should be at least [-32768, 32767]." This range is set using ctx->Const.MaxViewportWidth value, so just bump those constants to 32k for gen7+ which can support OpenGL 4.0. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez	add57b3fa8	main: remove MAX_VIEWPORT_WIDTH and MAX_VIEWPORT_HEIGHT constants Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Samuel Iglesias Gonsálvez	aa849d97a0	main: call invalidate_framebuffer_storage() with driver's viewport limits Don't use hardcoded ones because the driver can set different ones. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-03-02 07:19:01 +01:00
Jason Ekstrand	5b70aa11ee	anv/meta_blit: Use unorm formats for 8 and 16-bit RGB and RGBA values While Broadwell is very good about UINT formats, HSW is more restrictive. Neither R8G8B8_UINT nor R16G16B16_UINT really exist on HSW. It should be safe to just use the unorm formats.	2016-03-01 21:45:20 -08:00
Kenneth Graunke	89e421369c	Merge remote-tracking branch 'origin/master' into vulkan	2016-03-01 17:11:29 -08:00
Rob Clark	c4ae047cab	freedreno/ir3: enable shareable shaders Now that we are no longer using the pctx reference in the shader, drop it and turn on shareable shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:21:45 -05:00
Rob Clark	c3f2f8cbe4	freedreno/ir3: pass ctx to constant-emit code Rather than fishing it out of the shader. This removes the other big user of shader->pctx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:20:44 -05:00
Rob Clark	5fd152bae8	freedreno/ir3: add dev ptr to ir3_compiler And use this for allocating bo's to hold the shader binary, rather than accessing the dev via ctx ptr. One step towards making shaders sharable across contexts. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-03-01 19:20:33 -05:00
Jason Ekstrand	e941fd8470	genxml: Make the border color pointer consistent across gens	2016-03-01 14:43:05 -08:00
Jason Ekstrand	eecd1f8001	gen7/pipeline: Add competent blending This is mostly a copy-and-paste from gen8. Blending still isn't 100% but it fixes about 1100 CTS blend tests on HSW.	2016-03-01 13:51:58 -08:00
Jason Ekstrand	8b091deb5e	anv: Unify gen7 and gen8 state Now that we've pulled surface state setup into ISL, there's not much to do here.	2016-03-01 12:17:23 -08:00
Matt Turner	1be953797e	mesa: Remove NV_fragment_program remnants from dlist.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:30 -08:00
Matt Turner	89abb22a85	mesa: Remove NV_fragment_program_option enable bit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:30 -08:00
Matt Turner	ed72a1c118	program: Remove NV_fragment_program opcode parsing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	5429554f09	program: Remove NV_fragment_program scalar suffix parsing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	409c24f9cc	program: Remove NV_fragment_program_option parsing support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	fe2d2c7ad8	program: Remove NV_fragment_program Abs support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	0d1f6c752f	program: Remove incorrect comment about OPCODE_TXD. The table in prog_instruction.h is correct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	624d06708d	program: Remove OPCODE_TXP_NV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	aaef6cf4e3	program: Clean up after previous commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	7b50b0457d	program: Remove condition-code and precision support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	9e11ff7e11	program: Remove OPCODE_KIL_NV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	a0c3650ad3	program: Remove RelAddr2 support. Looks like more never-used crap from the first geometry shader attempt. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	6b1fb4862e	program: Mark table const. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	fc61b41a95	mesa: Remove EmitCondCodes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	7fe206da28	docs: Remove descriptions of long dead Emit* fields. Dead since commit `d8a366200` in 2010. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2016-03-01 11:41:29 -08:00
Matt Turner	f3b68fc5fc	glsl: Initialize gl_shader_program::EmptyUniformLocations. Commit `65dfb30` added exec_list EmptyUniformLocations, but only initialized the list if ARB_explicit_uniform_location was enabled, leading to crashes if the extension was not available. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-03-01 11:41:29 -08:00
Ian Romanick	1a80ca22fe	i965/meta: Don't pollute the framebuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	8f1b1878a0	i965/meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	3071da3032	meta: Don't pollute the framebuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Fixes piglit tests: - object-namespace-pollution glGetTexImage-compressed framebuffer - object-namespace-pollution glGenerateMipmap framebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	91e5825b8a	meta/decompress: Track framebuffer using gl_framebuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	3ed44fab18	meta/generate_mipmap: Track framebuffer using gl_framebuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	ec5757f9c9	meta: Use _mesa_bind_framebuffers instead of _mesa_BindFramebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	7c254f0200	meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers This enables later patches that will stop calling _mesa_GenFramebuffers or _mesa_CreateFramebuffers which pollute the framebuffer namespace. For framebuffers, the Bind call is still necessary. sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \ src/mesa/drivers/common/*.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:20 -08:00
Ian Romanick	6b70c9ea98	i965/meta: Use _mesa_CreateFramebuffers instead of _mesa_GenFramebuffers This enables later patches that will stop calling _mesa_GenFramebuffers or _mesa_CreateFramebuffers which pollute the framebuffer namespace. For framebuffers, the Bind call is still necessary. sed -i -e 's/_mesa_GenFramebuffers/_mesa_CreateFramebuffers/' \ src/mesa/drivers/dri/i965/*.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	f76462cb6f	meta: Save and restore the framebuffer using gl_framebuffer instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. This also fixes another latent bug in meta. In a multithreaded, multicontext application, one thread can delete an object that is bound in another thread. That object continues to exist until it is unbound (i.e., its refcount drops to zero). Meta unbinds objects all over the place. As a result, the rebind in _mesa_meta_end could fail because the object vanished! See https://bugs.freedesktop.org/show_bug.cgi?id=92363#c8. Using _mesa_reference_<object type> to save and restore the objects prevents the refcount from going to zero. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	fed9b0ed5a	mesa: Refactor bind_framebuffer to make _mesa_bind_framebuffers Fixing dd_function_table::BindFramebuffer will come later because that change is probably not suitable for stable. v2: Fix whitespace issue noticed by Topi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	64aff35f84	meta: Use _mesa_check_framebuffer_status instead of _mesa_CheckFramebufferStatus sed -i -e 's/_mesa_CheckFramebufferStatus(GL_DRAW_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \ -e 's/_mesa_CheckFramebufferStatus(GL_FRAMEBUFFER[^)]*/_mesa_check_framebuffer_status(ctx, ctx->DrawBuffer/' \ -e 's/_mesa_CheckFramebufferStatus(GL_READ_FRAMEBUFFER/_mesa_check_framebuffer_status(ctx, ctx->ReadBuffer/' \ $(grep -rl _mesa_CheckFramebufferStatus src/mesa/drivers) The second expression catches both GL_FRAMEBUFFER and GL_FRAMEBUFFER_EXT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	92266ff7a3	meta: Obvious refactor of _mesa_meta_framebuffer_texture_image Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Ian Romanick	f69c743069	meta: Convert _mesa_meta_bind_fbo_image to take a gl_framebuffer instead of a GL API handle Also change the name of the function to _mesa_meta_framebuffer_texture_image. The function is basically a wrapper around _mesa_framebuffer_texture (which is used to implement glFramebufferTexture1D and friends), so it makes sense for its name to be similar to that. The next patch will clean _mesa_meta_framebuffer_texture_image up considerably. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-03-01 11:07:19 -08:00
Jason Ekstrand	6e20c1e058	anv/cmd_buffer: Look at both sides for stencil enable Now it's all consistent with gen9	2016-03-01 11:03:29 -08:00
Jason Ekstrand	4cfdd16500	anv/cmd_buffer: Clean up stencil state setup on gen7	2016-03-01 11:02:21 -08:00
Jason Ekstrand	bb08d86efe	anv/cmd_buffer: Clean up stencil state setup on gen8	2016-03-01 10:58:43 -08:00
Kristian Høgsberg Kristensen	22d8666d74	anv: Add in image->offset when setting up depth buffer Fix from Neil Roberts. https://bugs.freedesktop.org/show_bug.cgi?id=94348	2016-03-01 09:19:39 -08:00
Jason Ekstrand	38f4c11c2f	anv/pipeline: Pull 3DSTATE_SBE into a shared helper	2016-03-01 08:46:32 -08:00
Dave Airlie	ac222626ad	virgl: add support for passing render condition flags to host. This just passes the extra blit info to fix the render condition tests. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-01 15:50:00 +10:00
Jason Ekstrand	3f8df795c1	genxml: Break output detail of 3DSTATE_SBE on gen7 into a struct This makes it work like 3DSTATE_SBE_SWIZ on gen8+ which is much more convenient.	2016-02-29 16:47:42 -08:00
Kenneth Graunke	24994ae926	i965: Push most TES inputs in vec4 mode. (This is commit `4a1c8a3037` for vec4 mode.) Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 24 vec4 slots (12 registers) should suffice. (I chose this instead of the 32 vec4 slots used in the scalar backend to avoid regressing a few Piglit tests due to the vec4 register allocator being too stupid to figure out what to do. We probably ought to fix that, but it's a separate issue.) Improves performance in GPUTest's tessmark_x64 microbenchmark by 41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU (with Iris Pro 5200). Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 38.3576% +/- 0.759748% (n = 42). v2: Simplify abs/negate handling, as requested by Matt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-29 16:12:50 -08:00
Marek Olšák	c54f38494c	r600g: remove support for DRM < 2.12.0	2016-03-01 00:18:54 +01:00
Marek Olšák	b7da8fa11d	r300g: remove support for DRM < 2.12.0	2016-03-01 00:18:54 +01:00
Marek Olšák	a5e2a173dd	winsys/radeon: drop support for DRM 2.12.0 (kernel < 3.2) in order to make some winsys interface changes easier These distros should use new DRM if they want to use new Mesa (distro: kernel, mesa, EOL): SLES 10: 2.6.16, 6.4.2, 2016-07; SLED 11: 3.0, 9.0.3, 2022-03; RHEL 5: 2.6.18, 6.5.1, 2017-03; RHEL 6: 2.6.32, 10.4.3, 2020-11; Debian 6: 2.6.32, 7.7.1, 2016-02. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	69a8e435ce	radeonsi: also dump shaders on a VM fault Reviewed-by: Christian König <christian.koenig@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	18df72b50b	radeonsi: dump full shader disassemblies into ddebug logs including prolog and epilog disassemblies Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	74b4ce81fb	radeonsi: allow dumping shader disassemblies to a file Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:54 +01:00
Marek Olšák	d0f3b524cd	radeonsi: use re-Z This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage). v2: add comments Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-01 00:18:19 +01:00
Marek Olšák	09bfbd43a0	tgsi/scan: count memory instructions for radeonsi Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-01 00:11:32 +01:00
Jason Ekstrand	097564bb8e	anv/cmd_buffer: Dirty push constants when changing pipelines.	2016-02-29 14:36:24 -08:00
Jason Ekstrand	d29fd1c7cb	anv/cmd_buffer: Re-emit push constants packets for all stages	2016-02-29 14:36:24 -08:00
Jason Ekstrand	9715724015	anv/pipeline: Follow push constant alignment restrictions on BDW+ and HSW gt3	2016-02-29 14:36:24 -08:00
Jason Ekstrand	6986ae35ad	anv/pipeline: Avoid a division by zero	2016-02-29 14:36:24 -08:00
Jason Ekstrand	51b618285d	anv/pipeline: Use dynamic checks for max push constants The GEN_GEN macros aren't available in anv_pipeline since it only gets compiled once for the whole driver.	2016-02-29 14:36:24 -08:00
Dave Airlie	35859d5bbb	mesa/fbobject: propagate Layered when reusing attachments. When reusing a depth attachment as a stencil, we need to propagate the layered bit, otherwise we fail to complete the framebuffer. Discovered running ./bin/fbo-depth-array depth-layered-clear on virgl on haswell. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-01 07:34:37 +10:00
Nanley Chery	74b7b59db5	isl/surface_state: Fix array spacing on Gen7 v2: Don't cast the enum to a boolean (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-29 11:43:33 -08:00
Kristian Høgsberg Kristensen	9d8bae6137	anv: Don't advertise pipelineStatisticsQuery We don't support that just yet. Reported-by: Jacek Konieczny <jajcus@jajcus.net>	2016-02-29 10:55:39 -08:00
Axel Davy	83bc2acfe9	st/nine: Fix second Multithreading issue with MANAGED buffers Here is another threading issue with MANAGED buffers: Thread 1: buffer creation; Thread 1: buffer lock; Thread 2: Draw call; Thread 1: writes data; Thread 1: Unlock. Without this patch, the buffer is initially dirty and in the list of things to upload after its creation. The draw call will then upload the data and unset the dirty flag, and the Unlock won't trigger a second upload. Fixes regression introduced by `cc0114f30b`: "st/nine: Implement Managed vertex/index buffers" Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	44246fe99d	st/nine: Fix Multithreading issue with MANAGED buffers d3d calls are protected by mutexes; however, if an app does the following across two threads: Thread 1: buffer Lock; Thread 2: Draw call; Thread 1: writes data; Thread 1: Unlock, then before this patch the Draw call would begin to upload the buffer. Solve this by moving the moment we add the buffer to the queue of things to upload from Lock time to Unlock time. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	35c858c42c	st/nine: Handle READONLY for buffer MANAGED pool READONLY won't trigger an upload. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	8a8affdfda	st/nine: Use Position input helper for ps3 declared inputs When the semantic is Position (which can happen with index 0 only), use the helper to get Position input. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Axel Davy	f08c990af5	st/nine: Introduce helper for Position shader input Cc: "11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-29 18:55:58 +01:00
Marc-André Lureau	f1d12e7392	virtio_gpu: Add virtio 1.0 PCI ID to driver map Add the virtio-gpu PCI ID for virtio 1.0 (according to the specification, "the PCI Device ID is calculated by adding 0x1040 to the Virtio Device ID") Support for virtio 1.0 was added in qemu 2.4 (same time virtio-gpu landed). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 11:31:36 +00:00
Koop Mast	04bc09fdf9	st/clover: Add libelf cflags to the build Otherwise the build will fail, when the library is in a non default location. v2 [Emil Velikov] - drop the unneeded cflags from targets/opencl. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Fixes: `7f585a6a98` "configure.ac: use pkg-config for libelf" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93524 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 11:30:15 +00:00
Emil Velikov	c212a70cd9	mesa: add get-extra-pick-list.sh script into bin/ This is a very rudimentary script that checks if any of the applied cherry-picks have been referenced (fixed?) by another patch, where the latter is either missing the stable tag or has not yet been picked. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-02-29 11:25:35 +00:00
Emil Velikov	64500f21f3	automake: explicitly set distcheck configure flags Pretty much all of these are enabled by default. Considering the recent updates (see previous commits) one might as well list most/all of these here. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Emil Velikov	325bc6fb4a	automake: add more missing options for make distcheck Namely - opencl, osmesa (only the gallium flavour as it conflicts with the classic one), surfaceless egl platform and a couple gallium drivers (virgl and vc4). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Emil Velikov	0b6157e971	install-gallium-links: port changes from install-lib-links Namely: `b662d5282f` mesa: Add clean-local rule to remove .lib links. `5c1aac17ad` install-lib-links: don't depend on .libs directory `fece147be5` install-lib-links: remove the .install-lib-links file With these in place, make distcheck now passes and a race condition has been avoided. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Rob Herring	51b22bd468	r600: Make enum alu_op_flags unsigned In builds with clang, there are several errors related to the enum alu_op_flags like this: src/gallium/drivers/r600/sb/sb_expr.cpp:887:8: error: case value evaluates to -1610612736, which cannot be narrowed to type 'unsigned int' [-Wc++11-narrowing] These are due to the MSB being set in the enum. Fix these errors by making the enum values unsigned as needed. The flags field that stores this enum also needs to be unsigned. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-29 10:51:45 +00:00
Rob Herring	92dd38df5a	gallium/radeon: Add space between string literal and identifier Fix compiles with clang that have this C++11 error: src/gallium/drivers/radeon/r600_pipe_common.h:662:34: error: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal] Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-29 10:51:45 +00:00
Rob Herring	0156a33aa3	freedreno: drop unnecessary -Wno-packed-bitfield-compat Enabling this warning doesn't generate any warnings with gcc, but is an unknown option for clang, so drop it. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Rob Clark <robdclark@gmail.com> (v1) Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> v2: keep the warning around, commented out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:45 +00:00
Rob Herring	8949edf018	Android: clean-up and fix DRI module path handling MESA_DRI_MODULE_PATH is only getting set for classic DRI drivers and may or may not be set correctly for gallium_dri.so depending on the makefile include ordering. For Android 6 and earlier it is fine, but with build system changes in AOSP master, it is not. Move the path variables to a single place at the top level and introduce MESA_DRI_MODULE_REL_PATH for Android 5 and later which require relative paths. With this, there is a single variable to change. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	0663edf85b	Android: remove headers from LOCAL_SRC_FILES The Android build system now spits out warnings for header files listed in LOCAL_SRC_FILES, so strip them out. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	6dae9176d6	Android: add -Wno-date-time flag for clang clang complains about date/time macros: src/mesa/main/context.c:403:25: error: expansion of date or time macro is not reproducible [-Werror,-Wdate-time] Disable this warning. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	a2f16db19b	Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system The makefile was implicitly picking up YACC_HEADER_SUFFIX from the Android build system, but this variable is now gone. Add it locally to fix the build with AOSP master. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	794221fbb7	Android: remove dependence on .SECONDEXPANSION With the Android build system changes to ninja/kati, the use of .SECONDEXPANSION is no longer supported. Fix this by avoiding rule specific variables and using $(transform-generated-source). Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Rob Herring	574a92b048	Android: fix build break from nir/glsl move to compiler/ Commits `a39a8fbbaa` ("nir: move to compiler/") and `eb63640c1d` ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Cc: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Oded Gabbay	a640ad15e1	gallium/radeon: disable evergreen_do_fast_color_clear for BE This function is currently broken for BE. I assume it's because of util_pack_color(). Until I fix this path, I prefer to disable it so users would be able to see correct colors on their desktop and applications. Together with the two following patches: - gallium/r600: Don't let h/w do endian swap for colorformat - gallium/radeon: remove separate BE path in r600_translate_colorswap it fixes BZ#72877 and BZ#92039 Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	e3dfc0e095	gallium/r600: Don't let h/w do endian swap for colorformat Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianness into account (see the big #if in p_format.h). v2: return ENDIAN_NONE only for four 8-bits components (V_0280A0_COLOR_8_8_8_8) Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	9559071ed6	gallium/radeon: remove separate BE path in r600_translate_colorswap After further testing, it appears there is no need for separate BE path in r600_translate_colorswap() The only fix remaining is the change of the last if statement, in the 4 channels case. Originally, it contained an invalid swizzle configuration that never got hit, in LE or BE. So the fix is relevant for both systems. This patch adds an additional 120 available visuals for LE and BE, as seen in glxinfo v2: Tested for regressions by running piglit gpu.py with CAICOS (r600g) on x86-64 machine. No regressions found. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Samuel Pitoiset	07ed003faf	nv50/ir: emit VOTE instruction Changes from v2: - add missing NOT modifier for GK110/GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 23:58:11 +01:00
Jordan Justen	635c0e92b7	anv: Set CURBEAllocationSize in MEDIA_VFE_STATE Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 11:54:49 -08:00
Jordan Justen	1af5dacd76	anv/gen7: Enable SLM in L3 cache control register Port `1983003` to gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 11:54:49 -08:00
Kristian Høgsberg Kristensen	b00b42d99b	nir/spirv: Use the new bare sampler type	2016-02-28 11:24:05 -08:00
Jordan Justen	72efb68d48	anv/pipeline: Set URB offset to zero if size is zero After `3ecd357d81`, it may be possible for the VS to get assigned all of the URB space. On Ivy Bridge, this will cause the offset for the other stages to be 16, which cannot be packed into the ConstantBufferOffset field of 3DSTATE_PUSH_CONSTANT_ALLOC_*. Instead we can set the offset to zero if the stage size is zero. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:51:38 -08:00
Jordan Justen	ef06ddb08a	anv/pipeline: Set FS URB space to zero if the FS is unused Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:51:38 -08:00
Jordan Justen	45d8ce07a5	anv/pipeline: Set stage URB size to zero if it is unused Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-28 10:49:39 -08:00
Samuel Pitoiset	b3efa0a59e	gk110/ir: add ld lock/st unlock emission Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 19:20:20 +01:00
Ilia Mirkin	aa3b85fd18	nv50,nvc0: bump minimum texture buffer offset alignment It appears that it actually needs to be aligned to the datum size, so it was 1 when testing with R8, but it can be as high as 16 with RGBA32. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-02-27 16:26:34 -05:00
Jason Ekstrand	46b7c242da	anv/gen7: Clean up the dummy PS case Fix whitespace and remove dead comments	2016-02-27 11:24:09 -08:00
Jason Ekstrand	e18a2f037a	anv/gen7: Set MaximumNumberofThreads in the dummy PS packet	2016-02-27 11:23:56 -08:00
Jason Ekstrand	ad50896c87	anv/gen7: Only try to get the depth format if the surface has depth	2016-02-27 11:23:18 -08:00
Jason Ekstrand	4b34f2ccb8	anv/image: Use isl for filling brw_image_param	2016-02-27 10:26:14 -08:00
Jason Ekstrand	bd6470fa6c	isl: Add helpers for filling out brw_image_param	2016-02-27 10:26:14 -08:00
Jason Ekstrand	7363024cbd	anv: Fill out image_param structs at view creation time	2016-02-27 10:26:14 -08:00
Jason Ekstrand	e9d126f23b	anv/image: Add a usage_mask field to image_view_init This allows us to avoid doing some unneeded work on the meta paths where we know that the image view will be used for exactly one thing. The meta paths also sometimes do things that aren't quite valid like setting the array slice on a 3-D texture and we want to limit the number of paths that need to be able to sensibly handle the lies.	2016-02-27 10:26:14 -08:00
Jason Ekstrand	b4c16fd01a	isl: Move isl_image.c to isl_storage_image.c	2016-02-27 10:26:14 -08:00
Jason Ekstrand	eb19d640eb	anv: Use isl to fill buffer surface states	2016-02-27 10:26:14 -08:00
Jason Ekstrand	a0cd20eb7f	isl: Add a helper for filling a buffer surface state	2016-02-27 10:26:14 -08:00
Jason Ekstrand	9d5b8f7709	anv: Remove unneeded fields from anv_image_view	2016-02-27 10:26:14 -08:00
Jason Ekstrand	b70a8d40fa	anv/state: Remove unused fill_surface_state functions	2016-02-27 10:26:14 -08:00
Jason Ekstrand	ded57c3cca	anv: Use ISL to fill out surface states	2016-02-27 10:26:14 -08:00
Jason Ekstrand	4a9b805ce5	anv/device: Store the default MOCS in the device	2016-02-27 10:26:13 -08:00
Jason Ekstrand	d798762cdb	isl: Add a function for filling out a surface state	2016-02-27 10:26:13 -08:00
Jason Ekstrand	6b06072ba8	isl: Create per-gen helper libraries for gens 7, 8, and 9	2016-02-27 10:26:13 -08:00
Jason Ekstrand	82d2db80bb	genxml: Add MOCS fields to RENDER_SURFACE_STATE This allows us to set MOCS as a single uint32_t on all platforms.	2016-02-27 10:26:13 -08:00
Jason Ekstrand	452782f68b	gen/genX_pack: Add genxml to the pack header path If you have an out-of-tree build, gen8_pack.h and friends will not be in the same folder as genX_pack.h so this will be a problem. We fixed out-of-tree earlier by adding the genxml folder to the includes for the vulkan driver. However, this is not a good long-term solution because we want to use it in ISL as well.	2016-02-27 10:26:13 -08:00
Ilia Mirkin	e2dce1a340	mesa: add GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 support The two extensions are identical, and are largely taking bits of already existing desktop functionality. We continue to do a poor job of supporting the 'precise' keyword, just like we do on desktop. This passes the relevant dEQP tests that I could find. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-27 00:08:28 -05:00
Ilia Mirkin	2875183463	mesa: expose GL_EXT_texture_sRGB_decode on GLES 3.0+ Could be exposed on earlier GLES versions if we supported EXT_sRGB, but we don't, for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-26 23:55:45 -05:00
Nanley Chery	265d4c415c	isl: Fix isl_surf_get_image_intratile_offset_el() Consecutive tiles are separated by the size of the tile, not by the logical tile width. v2: Remove extra subtraction (Ville) Add parenthesis (Jason) v3: Update the unit tests for the function Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-26 16:59:36 -08:00
Ian Romanick	585b18f305	i965/cfg: Fix comment list punctuation Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-26 16:51:27 -08:00
Ian Romanick	5bfb302783	i965/cfg: Split out dead control flow paths to simplify both paths v2: Fix some bad indentation. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	2513a20240	i965/cfg: Don't handle fully empty if/else/endif This will now never occur. The empty if-else part would have already been removed leaving an empty if-endif part. No shader-db changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	69bb063ec2	i965/cfg: Eliminate an empty then-branch of an if/else/endif On BDW, total instructions in shared programs: 8448571 -> 8448367 (-0.00%) instructions in affected programs: 21000 -> 20796 (-0.97%) helped: 116 HURT: 0 v2: Remove spurious attempt to combine the if_block with the (removed!) else_block. Suggested by Matt and Curro. Correct the comment describing what the new pass does. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	c7deee69ea	i965/cfg: Track prev_block and prev_inst explicitly in the whole function This provides a trivial simplification now, and it makes some future changes more straightforward. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Ian Romanick	70cf0eb5c7	i965/cfg: Slightly rearrange dead_control_flow_eliminate 'git diff -w' is a bit more illustrative. A couple declarations were moved, the continue was removed, and the code was reindented. This will simplify future changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 16:51:27 -08:00
Thomas Hindoe Paaboel Andersen	6bb6b5c341	anv: remove stray ; after if Both logic and indentation suggest that the ; was not intended here. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-26 16:05:28 -08:00
Jason Ekstrand	b7bc52b5b1	anv/gen8: Emit the 3DSTATE_PS_BLEND packet	2016-02-26 16:04:48 -08:00
Kenneth Graunke	a0294c2cf3	i965: Simplify brw_nir_lower_vue_inputs() slightly. The same code appeared in both branches; pull it above the if statement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	8151003ade	i965: Avoid recalculating the normal VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	15b3639bf1	i965: Avoid recalculating the tessellation VUE map for IO lowering. The caller already computes it. Now that we have stage specific functions, it's really easy to pass this in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	cfbd9831f8	i965: Eliminate brw_nir_lower_{inputs,outputs,io} functions. Now that each stage is directly calling brw_nir_lower_io(), and we have per-stage helper functions, it makes sense to just call the relevant one directly, rather than going through multiple switch statements. This also eliminates stupid function parameters, such as the two that only apply to vertex attributes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	b96ddd2e52	i965: Split brw_nir_lower_inputs/outputs into per-stage functions. These functions are both giant switch statements where most cases don't overlap at all. Let's put the bulk of the work in per-stage helpers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	d33c478bed	i965: Replace catch-all nir_lower_io call with specific cases. Most cases already call nir_lower_io explicitly for input and output lowering. This catch-all isn't very useful anymore - we can just add it to the remaining cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	51f8797993	i965: Move optimizations from brw_nir_lower_io to brw_postprocess_nir. This simplifies things. Every caller of brw_nir_lower_io() immediately calls brw_postprocess_nir(). The only real change this will have is that we get an extra brw_nir_optimize() call when compiling compute shaders, but that seems fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	dcd4a841e9	i965: Always do NIR IO lowering at specialization time. We've now hit literally every case other than geometry shaders (and compute shaders, but those are a no-op). So, let's just move geometry shaders over too and be done with it. The only advantage to doing this at link time was to save the expense of running the pass on recompiles. But we're already running a lot of passes, and the extra code complexity isn't worth it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Kenneth Graunke	fa7135107f	i965: Make an is_scalar boolean in brw_compile_gs(). Shorter than compiler->scalar_stage[MESA_SHADER_GEOMETRY], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Jason Ekstrand	b3cb6e78aa	i965/nir: Do lower_io late for fragment shaders The Vulkan driver wants to be able to delete fragment outputs that are beyond key.nr_color_regions; this is a lot easier if we lower outputs at specialization time rather than link time. (Rationale added to commit message by Ken) Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-26 15:55:59 -08:00
Jordan Justen	7428e6f86a	i965: Set dest type to UW for several send messages Without this, on SIMD 16 the send instruction destination will appear to write more than one destination register, causing the simulator to report an error. Of course, the send instruction can actually write more than one destination register regardless of the type set for the destination, so this is a bit strange. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-26 12:03:56 -08:00
Samuel Pitoiset	aad48f8691	nvc0: rework nvc0_compute_validate_program() Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:27 +01:00
Samuel Pitoiset	e1f5c76047	nvc0: make sure to validate compute global buffers on Fermi There is no reason not to validate those global buffers, and this might avoid failures if someone tries to use the global memory from compute programs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:23 +01:00
Samuel Pitoiset	dcf7938833	nvc0: move nvc0_validate_global_residents() to nvc0_compute.c While we are at it, rename it to nvc0_compute_validate_globals() and update its prototype. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:18 +01:00
Derek Foreman	d085a5dff5	egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage Since commit `d1314de293` we ignore damage passed to SwapBuffersWithDamage. Wayland 1.10 now has functionality that allows us to properly process those damage rectangles, and a way to query if it's available. Now we can use wl_surface.damage_buffer and interpret the incoming damage as being in buffer co-ordinates. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Signed-off-by: Derek Foreman <derekf@osg.samsung.com>	2016-02-26 11:49:09 +00:00
Dave Airlie	840aa52f50	virgl: add missing CAP turned off.	2016-02-26 04:03:09 +00:00
Miklós Máté	847f1cc698	program: Remove extra reference_program() It was already done in get_mesa_program() Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-25 22:02:50 +01:00
Emil Velikov	51c65a4c48	automake: add nine to make distcheck Will allow us to catch/prevent issues, like the one in mesa 11.2.0-rc1. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-25 19:56:07 +00:00
Emil Velikov	b08dbc84fe	st/nine: don't forget to bundle the nine_limits.h file Without this mesa 11.2.0-rc1 ended up busted :-( Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reported-by: Ondřej Súkup <mimi.vx@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-25 19:56:07 +00:00
Matt Turner	4009a9ead4	i965/fs: Allow saturate propagation to propagate negations into MADs. Allows us to transform mad res src0 src1 src2 mov.sat dst -res into mad.sat dst -src0 -src1 src2 instructions in affected programs: 3712 -> 3688 (-0.65%) helped: 24 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:15 -08:00
Matt Turner	65d3217cb0	i965/fs: Allow saturate propagation to propagate negations into ADDs. Allows us to transform add res src0 src1 mov.sat dst -res into add.sat dst -src0 -src1 No shader-db changes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:13 -08:00
Matt Turner	7b6113bc2d	i965/fs: Allow saturate propagation to propagate negations into MULs. Allows us to transform mul res src0 src1 mov.sat dst -res into mul.sat dst src0 -src1 instructions in affected programs: 45246 -> 45054 (-0.42%) helped: 162 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:10 -08:00
Matt Turner	1567da1e28	i965/fs: Don't CSE negated multiplies with saturation. It's not correct to CSE these multiplies mul.sat dst1, -a, b mul.sat dst2, a, b by emitting a negated MOV from dst1 to dst2: mul.sat dst1, -a, b mov dst2, -dst1 Take 2.0*2.0 for example. The first multiply would produce 0.0 and the second would produce 1.0. Fixes bad generated code in 18 to 22 shaders: instructions in affected programs: 432 -> 464 (7.41%) helped: 4 HURT: 18 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-25 10:51:04 -08:00
Matt Turner	3da789f1e9	glsl: Consider ubo_load to be a horizontal operation. Unclear to me whether it actually is a horizontal operation that cannot be vectorized, but the fact that i965 generates the same code in either case makes me less interested in finding out. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94199 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-25 10:50:34 -08:00
Jason Ekstrand	c32273d246	anv/device: Properly handle apiVersion == 0 From the Vulkan 1.0 spec section 3.2: "If apiVersion is 0 the implementation must ignore it"	2016-02-25 08:52:37 -08:00
Andres Gomez	d1509a5848	glsl/ast: Implicit conversion from double to float is not allowed Also, renamed get_conversion_operation to avoid future misunderstandings. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-25 13:10:50 +01:00
Oded Gabbay	439b5b008f	gallium/radeon: return correct values for BE in r600_translate_colorswap Because I changed the swizzle check, I also need to adapt the return values for each check. It's basically almost the same as before, we just cross between STD and STD_REV, and cross between ALT and ALT_REV This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also fixes tri-flat (mesa demos). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-25 09:21:08 +02:00
Oded Gabbay	ff8b41b702	gallium: remove duplicate define from enum pipe_format Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-25 09:21:08 +02:00
Ian Romanick	9d9aeb91b1	glsl: Detect do-while-false loops and unroll them Previously loops like do { // ... } while (false); that did not have any other loop-branch instructions would not be unrolled. This is commonly used to wrap multiline preprocessor macros. This produces IR like (loop ( ... break )) Since limiting_terminator was NULL, the loop unroller would throw up its hands and say, "I don't know how many iterations. How can I unroll this?" We can detect this another way. If there is no limiting_terminator and the only loop-branch is a break as the last IR, there's only one iteration. On my very old checkout of shader-db, this removes a loop from Orbital Explorer, but it does not otherwise affect the shader. The loop removed is the one the compiler inserts surrounding the switch statement. This change does prevent some seriously bad code generation in some patches to meta shaders that I recently sent out for review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-24 18:43:40 -08:00
Nanley Chery	3eb476fa14	i965: Enable tiled mem_copy with sRGB-formatted resources RGBA8 and BGRA8 unorm formats are compatible with the various mem_copy functions. Their sRGB counterparts are also compatible because they're also color-renderable (of importance when the specified resource is a readbuffer) and they share the same physical layout. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-24 14:40:34 -08:00
Kristian Høgsberg Kristensen	59f5728995	Merge remote-tracking branch 'origin/master' into vulkan	2016-02-24 13:04:54 -08:00
Kristian Høgsberg Kristensen	25c2470b24	anv: Set max_hs_threads/max_ds_threads	2016-02-24 12:21:26 -08:00
Kenneth Graunke	3ecd357d81	anv: Allocate more push constant space. Previously we allocated 4kB of push constant space for VS, GS, and PS (for a total of 12kB) no matter what. This works, but doesn't fully utilize the space - we have 16kB or 32kB of space. This makes anv use the same method as brw - divide up the space evenly among all active shader stages. This means HS and DS would get space, if those shader stages existed. In the future, we can probably do better by inspecting how many push constants each shader stage uses, and weight things accordingly. But this is strictly better than the old code, and ideally we'd justify a fancier solution with actual performance data.	2016-02-24 11:22:05 -08:00
Kenneth Graunke	3f11517730	anv: Properly size the push constant L3 area. We were assuming it was 32kB everywhere, reducing the available URB space. It's actually 16kB on Ivybridge, Baytrail, and Haswell GT1-2.	2016-02-24 11:13:08 -08:00
Kenneth Graunke	7f9b03cc8b	anv: Emit 3DSTATE_PUSH_CONSTANT_ALLOC_* via a loop. Now we're emitting HS and DS packets as well.	2016-02-24 11:13:08 -08:00
Kenneth Graunke	1024a66fc4	anv: Emit 3DSTATE_URB_* via a loop. Rather than keeping separate {vs,hs,ds,gs}_start fields, we now store an array indexed by the shader stage (MESA_SHADER_). The 3DSTATE_URB_ commands are also sequentially numbered. This makes it easy to just emit them in a loop. This simplifies the code a little, and also will make it easier to add more credible HS and DS code later.	2016-02-24 11:13:02 -08:00
Brian Paul	c95d5c5f6f	mesa: replace for loop with bitshifting in supported_buffer_bitmask() Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:32:01 -07:00
Brian Paul	ac37d0475c	mesa: updates some comments in buffers.c Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:53 -07:00
Brian Paul	d8412029bb	mesa: make _mesa_draw_buffers() static Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:44 -07:00
Brian Paul	24d8080507	mesa: make _mesa_draw_buffer() static Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:41 -07:00
Brian Paul	ebfcf9de43	mesa: make _mesa_read_buffer() static Not called from any other file. Remove _mesa_ prefix and update comments. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:37 -07:00
Brian Paul	1e41c2e135	mesa: move declaration of buffer var in handle_first_current() Declare the var in the scopes where it's used. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:31 -07:00
Brian Paul	c8fdb42c91	mesa: use gl_buffer_index in a few places Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:28 -07:00
Brian Paul	363019e17a	st/mesa: remove useless break statement Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:23 -07:00
Brian Paul	953cb24e65	st/mesa: rename st_readpixels to st_ReadPixels To match the convention of other device driver functions. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:31:17 -07:00
Brian Paul	83b589301f	st/mesa: fix frontbuffer glReadPixels regressions The change "mesa/readpix: Don't clip in _mesa_readpixels()" caused a few piglit regressions. The failing tests use glReadPixels to read from the front color buffer. The problem is we were trying to read from a non-existent front color buffer. The front color buffer is created on demand in st/mesa. Since the missing buffer bounds were effectively 0 x 0 the glReadPixels was totally clipped and returned early. The fix involves creating the real front color buffer when we're about to try reading from it. Tested with llvmpipe and VMware driver on Linux, Windows. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94253 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94254 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94257 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-24 08:30:07 -07:00
Jason Ekstrand	c9564fd598	nir/spirv: Allow but warn for a few capabilities Unfortunately, glslang gives us cull/clip distance and GS streams even if the shader doesn't use it whenever a shader is declared as version 450. This is a glslang bug, but we can easily enough ignore it for now.	2016-02-23 22:07:25 -08:00
Jason Ekstrand	f0f7cc22f3	anv/descriptor_set: Use the correct size for the descriptor pool The descriptor sizes array gives the total number of each type of descriptor that will ever be allocated from the pool, not the total amount that may be in any particular set. In our case, this simply means that we have to sum a bunch of things up and there we go.	2016-02-23 21:25:37 -08:00
Jason Ekstrand	040355b688	nir/spirv: Add more capabilities	2016-02-23 21:01:00 -08:00
Jason Ekstrand	bd3db3d665	anv/meta: Allocate descriptor pools on-the-fly We can't use a global descriptor pool like we were because it's not thread-safe. For now, we'll allocate them on-the-fly and that should work fine. At some point in the future, we could do something where we stack-allocate them or allocate them out of one of the state streams.	2016-02-23 17:04:19 -08:00
Oded Gabbay	4b7e219e61	gallium/radeon: Correctly translate colorswaps for big endian The current code in r600_translate_colorswap uses the swizzle information to determine which colorswap to use. This works for BE & LE when the nr_channels is <4, but when nr_channels==4 (e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE and LE, because the swizzle info is the same for both of them. As a result, r600g doesn't support 24bit color formats, only 16bit, which forces the user to choose 16bit color in X server. This patch fixes this bug by separating the checks for LE and BE and adapting the swizzle conditions in the BE part of the checks. Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7 Big-Endian Machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: "11.2" "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-23 20:55:40 +02:00
Thomas Hindoe Paaboel Andersen	1807806add	mesa: use sizeof on the correct type Before the luminance stride was based on the size of GL_FLOAT which is just the type constant (0x1406). Change it to use the size of GLfloat. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-23 08:55:35 -07:00
Marek Olšák	190a291b03	tgsi/scan: handle holes between VS inputs, assert-fail in other cases "st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap" added a vertex shader declaring IN[0] and IN[2], but not IN[1]. Drivers relying on tgsi_shader_info can't handle holes in declarations, because tgsi_shader_info doesn't track that. This is just a quick workaround meant for stable that will work for vertex shaders. This fixes radeonsi DrawPixels and CopyPixels crashes. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-23 16:42:16 +01:00
Jason Ekstrand	bfbb238dea	anv/descriptor_set: Set descriptor type for immutable samplers	2016-02-22 21:39:14 -08:00
Jason Ekstrand	64e1c84059	intel/genxml: Update macro documentation	2016-02-22 21:20:04 -08:00
Francisco Jerez	31a0affa28	docs: Mark off GL_OES_shader_image_atomic as done. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:59:56 -08:00
Francisco Jerez	058ed980c6	i965/fs: Return result of image atomic in a register of the expected type. So the result is of float type if we're implementing the float overload of imageAtomicExchange. This is the only back-end change required to support OES_shader_image_atomic AFAICT. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:57:09 -08:00
Francisco Jerez	81c16a2dab	glsl: Implement the required built-in functions when OES_shader_image_atomic is enabled. This is basically just the same atomic functions exposed by ARB_shader_image_load_store, with one exception: "highp float imageAtomicExchange( coherent IMAGE_PARAMS, float data);" There's no float atomic exchange overload in the original ARB_shader_image_load_store or GL 4.2, so this seems like new functionality that requires specific back-end support and a separate availability condition in the built-in function generator. v2: Move image availability predicate logic into a separate static function for clarity. Had to pull out the image_function_flags enum from the builtin_builder class for that to be possible. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:56:54 -08:00
Francisco Jerez	be125af95e	glsl: Add usual extension boilerplate for OES_shader_image_atomic. v2: No need for extension enable bits (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:56:35 -08:00
Francisco Jerez	009bbecf6d	mesa: Add extension table entry for OES_shader_image_atomic. v2: No need for extension enable bits (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 19:55:35 -08:00
Jason Ekstrand	ae619a0355	anv/state: Replace a bunch of ANV_GEN with GEN_GEN	2016-02-22 19:19:00 -08:00
Jason Ekstrand	442dff8cf4	anv/descriptor_set: Stop marking everything as having dynamic offsets	2016-02-22 17:23:29 -08:00
Kristian Høgsberg Kristensen	2570a58bcd	anv: Implement descriptor pools Descriptor pools are an optimization that lets applications allocate descriptor sets through an externally synchronized object (that is, unlocked). In our case it's also plugging a memory leak, since we didn't track all allocated sets and failed to free them in vkResetDescriptorPool() and vkDestroyDescriptorPool().	2016-02-22 17:13:51 -08:00
Kristian Høgsberg Kristensen	353d5bf286	anv/x11: Free swapchain images and memory on destroy	2016-02-22 16:23:47 -08:00
Samuel Pitoiset	2999257e0f	nvc0: rename 3d binding points to NVC0_BIND_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9c6a7bfb40	nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	2c48369f54	nvc0: prefix compute macros with _CP_ instead of _COMPUTE_ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	bbff97ae39	nvc0: rename NVXX_COMPUTE to NVXX_CP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	5330ed959e	nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3d Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	84b9b8f0a3	nvc0/ir: add missing emission of locked load predicate Like unlocked store on shared memory, locked store can fail and the second dest which is a predicate must be emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9f0d059d4b	nvc0/ir: add ld lock/st unlock emission on GK104 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	6526225f88	nv50/ir: restore OP_SELP to be a regular instruction Actually OP_SELP doesn't need to be a compare instruction. Instead we just need to set the NOT modifier when building the instruction. While we are at it, fix the dst register type and use a GPR. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Mark Janes	08b408311c	vulkan: fix out-of-tree builds	2016-02-22 11:31:15 -08:00
Brian Paul	9de3b0273d	svga: unbind index buffer when drawing non-indexed primitives Silences a warning reported by the svga3d device. v2: also null-out the index buffer pointer Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-22 12:14:48 -07:00
Kristian Høgsberg Kristensen	f843aabdd4	intel/genxml: Add README I've had people ask about the design of the pack functions, for example, why aren't we using bitfields. I wrote up a bit of background on why and how we ended up with the current design and we might as well keep that with the code.	2016-02-22 09:14:25 -08:00
Nanley Chery	7b2c63a53c	anv/meta_blit: Handle compressed textures in anv_CmdCopyImage As with anv_CmdCopyBufferToImage, compressed textures require special handling during copies. Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-02-22 09:04:28 -08:00
Ilia Mirkin	571bd9ac42	mesa: add GL_EXT_texture_border_clamp support This extension is identical to GL_OES_texture_border_clamp. But dEQP has tests that want the EXT variant. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-22 10:38:56 -05:00
Ilia Mirkin	b6654831c3	mesa: add GL_OES_texture_border_clamp support Only minor differences to the existing ARB_texture_border_clamp support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-22 10:38:56 -05:00
Ilia Mirkin	af8ad49541	mesa: bump version 11.2 has been branched, we're on 11.3 now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-22 10:38:37 -05:00
Emil Velikov	4cd5e5b48e	nouveau: update the Makefile.sources list Reflect the nv50->g80 change and the new gm107_texture header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-22 11:40:29 +00:00
Jason Ekstrand	f49ba0f7d8	nir/spirv: Add support for multisampled textures	2016-02-21 22:02:38 -08:00
Marek Olšák	ff360a52e6	radeonsi: implement binary shaders & shader cache in memory (v2) v2: handle _mesa_hash_table_insert failure other cosmetic changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1132910e50	gallium/radeon: remove unused radeon_shader_binary_free_* functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	50ac2612d0	radeonsi: make radeon_shader_reloc name string fixed-sized This will simplify implementations of binary shaders. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1fe73d55e3	radeonsi: move some struct si_shader members to new struct si_shader_info This will be part of shader binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	10fa269f4f	radeonsi: use smaller types for some si_shader members in order to decrease the shader size for a shader cache. v2: add & use SI_MAX_VS_OUTPUTS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	9aaf28da62	radeonsi: enable compiling one variant per shader Shader stats from VERDE: Default scheduler: Totals: SGPRS: 491272 -> 488672 (-0.53 %) VGPRS: 289980 -> 311093 (7.28 %) Code Size: 11091656 -> 11219948 (1.16 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave Max Waves: 78063 -> 77352 (-0.91 %) Wait states: 0 -> 0 (0.00 %) Looking at some of the worst regressions, I get: - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there. - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions. SI scheduler: Totals: SGPRS: 374848 -> 374576 (-0.07 %) VGPRS: 284456 -> 307515 (8.11 %) Code Size: 11433068 -> 11535452 (0.90 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 509952 -> 522240 (2.41 %) bytes per wave Max Waves: 79456 -> 78217 (-1.56 %) Wait states: 0 -> 0 (0.00 %) VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	754cf171e9	radeonsi: print full shader name before disassembly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	3c98e0b369	radeonsi: compile non-GS middle parts of shaders immediately if enabled Still disabled. Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process. VS is always compiled as hw VS. TES is always compiled as hw VS. LS and ES stages are always compiled on demand. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e038f8fd49	radeonsi: rework polygon stippling for PS prolog Don't use the pstipple module. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	4636d9be4a	radeonsi: add PS prolog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e79bb746ab	radeonsi: add PS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	eb10919b83	radeonsi: add TCS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e1b21696a3	radeonsi: add VS epilog It only exports the primitive ID. Also used by TES when it's compiled as VS. The VS input location of the primitive ID input is v2. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	70de433dea	radeonsi: add VS prolog This is disabled with use_monolithic_shaders = true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	19a92886a8	radeonsi: first bits for non-monolithic shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	0303886b10	radeonsi: add code for dumping all shader parts together (v2) v2: unify some code into si_get_shader_binary_size Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	17eb99d8b9	radeonsi: add code for combining and uploading shaders from 3 shader parts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	9d5bf1a3ef	radeonsi: fail compilation if non-GS non-CS shaders have rodata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	09408764c1	radeonsi: separate 2 pieces of code from create_function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	292759220c	radeonsi: add samplemask parameter to si_export_mrt_color Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e6aea08b86	radeonsi: add start_instance parameter to get_instance_index_for_fetch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	dc27456194	radeonsi: separate out shader key bits for prologs & epilogs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	d995d4830e	radeonsi: compute how many input VGPRs fragment shaders have Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	fe1b6ede01	radeonsi: compute how many input SGPRs and VGPRs shaders have Prologs (shader binaries inserted before the API shader binary) need to know this, so that they won't change the input registers unintentionally. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	36202182ac	gallium/radeon: add basic code for setting shader return values LLVMBuildInsertValue will be used on return_value. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Samuel Pitoiset	3c9ed2015c	nvc0: enable compute shaders on Fermi Kepler compute support is really different than Fermi and it's not ready yet. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:32 +01:00
Samuel Pitoiset	14a810e9d0	nv50/ir: add atomics support on shared memory for Fermi Changes from v3: - move the previous OP_SELP change to the previous commit Changes from v2: - make sure the op is OP_SELP when emitting the predicate and add one assert - use bld.getSSA() for mkOp2() - add cross edge between tryLockAndSetBB and joinBB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:32 +01:00
Samuel Pitoiset	e0371e63df	nv50/ir: make OP_SELP a compare instruction This OP_SELP insn will be used to handle compare and swap subops. Changes from v2: - fix logic for GK110+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:29 +01:00
Samuel Pitoiset	0c930557bf	nv50/ir: add lock/unlock subops for load/store Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:02 +01:00
Samuel Pitoiset	45e85e16f5	nv50/ir: use s[] addr space for shared buffers Shared memory address space (FILE_MEMORY_SHARED) must be used instead of global memory when a shared memory area is declared. Changes from v2: - oops, do not remove TGSI_FILE_BUFFER in a switch in nv50_ir_from_tgsi.cpp Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:58 +01:00
Samuel Pitoiset	80fc67fba5	nvc0: reduce likelihood of collision for real buffers on Fermi Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:53 +01:00
Samuel Pitoiset	807901b639	nvc0: invalidate compute state when switching pipe contexts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:48 +01:00
Samuel Pitoiset	c6293877f0	nvc0: add support for indirect compute on Fermi When indirect compute is used, the size of the grid (in blocks) is stored as three integers inside a buffer. This requires a macro to set up GRIDDIM_YX and GRIDDIM_Z. Changes from v2: - do not launch the grid if the number of groups for a dimension is 0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:45 +01:00
Samuel Pitoiset	fa7333a742	nvc0: bind textures/samplers for compute on Fermi Textures and samplers don't seem to be aliased between COMPUTE and 3D. Changes from v2: - refactor the code to share (almost) the same logic between 3d and compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:40 +01:00
Samuel Pitoiset	917a5ff6ea	nvc0: bind shader buffers for compute on Fermi This is loosely based on 3D. Shader buffers are bound on c15 (the driver constbuf) at offset 0x200. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:37 +01:00
Samuel Pitoiset	a9b70a86db	nvc0: bind driver constbuf for compute on Fermi Changes from v3: - add new validation state for COMPUTE driver constbuf Changes from v2: - always bind the driver consts even if user params come in via clover Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:32 +01:00
Samuel Pitoiset	527652629d	nvc0: add a new validation state for 3D driver constbuf This will be used to invalidate 3D driver constbuf when using COMPUTE and vice-versa. This is needed because this CB contains a bunch of useful information like the addrs of shader buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:29 +01:00
Samuel Pitoiset	57d4251003	nvc0: bind constant buffers for compute on Fermi Loosely based on 3D. Changes from v3: - invalidate COMPUTE CBs after validating 3D CBs because they are aliased Changes from v2: - get rid of the 's' param to nvc0_cb_bo_push() because it doesn't matter to upload constbufs for compute using the 3d chan Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:25 +01:00
Samuel Pitoiset	53f92bb7f9	nvc0: allocate an area for compute user constbufs For compute shaders, we might need to upload uniforms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:21 +01:00
Jason Ekstrand	f1dddeadc2	anv: Fix a typo in apply_dynamic_offsets shader->num_uniforms is in terms of bytes in i965.	2016-02-20 21:24:31 -08:00
Jason Ekstrand	b5868d2343	anv: Zero out the WSI array when initializing the instance	2016-02-20 19:30:14 -08:00
Jason Ekstrand	bc696f1db6	isl: Stop including mesa/main/imports.h It pulls in all sorts of stuff we don't want.	2016-02-20 10:35:25 -08:00
Samuel Pitoiset	89d25a82e8	nv50: do not advertise about compute shaders Compute shaders are totally unsupported. This prevents Clover from reporting that OpenCL is supported on Tesla, because it's a lie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-20 19:25:12 +01:00
Jason Ekstrand	853fc3e431	genxml: Add more includes in the generated headers	2016-02-20 09:33:20 -08:00
Jason Ekstrand	1f1cf6fcb0	anv: Get rid of GENX_FUNC It was a bad idea.	2016-02-20 09:12:38 -08:00
Jason Ekstrand	371b4a5b33	anv: Switch over to the macros in genxml	2016-02-20 09:09:28 -08:00
Jason Ekstrand	0d76aa9485	intel/genxml: Add a couple of helper headers	2016-02-20 08:35:36 -08:00
Rhys Kidd	a0f55e91cc	docs: Correct typo in LLVMpipe envvar description Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-20 16:15:35 +01:00
Ilia Mirkin	0b10ec1086	st/mesa: force depth mode to GL_RED for sized depth/stencil formats See commit `9db2098d` for the i965 version of this. This fixes depth in a bunch of dEQP EXT_texture_border_clamp tests. And probably other ones as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-19 17:37:39 -05:00
Daniel Czarnowski	e6f1a44d14	egl_dri2: set correct error code if swapbuffers fails A return value of '-1' means that there was an error during the swap with a window drawable; in this case we set the error to EGL_BAD_NATIVE_WINDOW. v2: coding style cleanup, better commit message Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-19 18:23:19 +00:00
Dongwon Kim	d1e1563bb6	egl: move Null check to eglGetSyncAttribKHR to prevent Segfault Null-check on "*value" is currently done in _eglGetSyncAttrib, which is after eglGetSyncAttribKHR dereferences it. Move the check a layer up (in the beginning of eglGetSyncAttribKHR) to avoid segfaults. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Emil Velikov: tweak commit message, add stable tag] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-19 18:23:19 +00:00
Ilia Mirkin	b697400a97	meta/copy_image: use precomputed dst_internal_format to avoid segfault If the destination is a renderbuffer, dst_tex_image will be NULL. This fixes the *to_renderbuffer dEQP copy image tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: mesa-stable@lists.freedesktop.org	2016-02-19 13:10:28 -05:00
Ilia Mirkin	a03d6f2aa3	mesa: add GL_OES_texture_stencil8 support It's basically the same thing as GL_ARB_texture_stencil8 except that glCopyTexImage isn't supported, so add STENCIL_INDEX to the list of invalid GLES formats for glCopyTexImage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 12:37:22 -05:00
Ilia Mirkin	2b938a390c	st/mesa: fix pbo uploads - LOD must be provided in .w for TXF (even for buffer textures) - User buffer must be valid at draw time - Must have a sampler associated with the sampler view This makes PBO uploads work again on nouveau. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-19 11:30:33 -05:00
Ilia Mirkin	68c4af1c19	mesa: check fbo completeness based on internal format, not driver format The base format is a function of the user-requested format, while the driver format is not. So we should use the base format instead. The driver format can be anything. Specifically in the stencil-only case, it might be a depth/stencil format. However we still want to refuse such an attachment when bound to GL_DEPTH. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-19 11:30:33 -05:00
Jason Ekstrand	2b85807458	genxml: Stop using unicode in the pack generator This causes python problems and problems when people don't have a locale set properly in their shell.	2016-02-19 08:05:35 -08:00
Dave Airlie	1375cb3c27	anv: fix warning about unused width variable. We don't use width outside the debug clause here.	2016-02-19 08:01:54 -08:00
Brian Paul	0eb7b5c2a3	mesa: small optimization of _mesa_expand_bitmap() Avoid a per-pixel multiply. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 08:51:51 -07:00
Brian Paul	8a2a1a6bd6	mesa: add special case ubyte[4] / BGRA conversion function This reduces a glTexImage(GL_RGBA, GL_UNSIGNED_BYTE) hot spot when storing the texture as BGRA. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	44f48fead5	st/mesa: implement a simple cache for glDrawPixels Instead of discarding the texture we created, keep it around in case the next glDrawPixels draws the same image again. This is intended to help applications which draw the same image several times in a row, either within a frame or subsequent frames. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	71dcc067a5	llvmpipe: add a few const qualifiers Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	6d551f9ea3	trace: assorted whitespace and formatting fixes Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 08:49:51 -07:00
Brian Paul	e8689d9df3	trace: remove unneeded inline qualifiers Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 08:49:41 -07:00
Iago Toral Quiroga	72794b0bd9	glsl: fix emit_inline_matrix_constructor for doubles Specifically, for the case where we initialize a dmat with a source matrix that has fewer columns/rows. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Iago Toral Quiroga	d1617b4088	glsl: Mark float constants as such So we don't generate double to float conversion code Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Iago Toral Quiroga	ad22886ef1	glsl: fix indentation in emit_inline_matrix_constructor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Rob Clark	04ad05c987	glsl: fix standalone compiler Need to set some non-zero limits for MaxCombinedUniformComponents, otherwise we hit a "Too many <type> shader uniform components" error in the linker. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-19 08:02:02 -05:00
Nicolai Hähnle	d7c4ffd1ee	st/mesa: disable depth/stencil/alpha tests in PBO upload Noticed by Brian Paul. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-18 20:49:12 -05:00
Brian Paul	2f3d06d9f9	svga: allow non-contiguous VS input declarations This fixes a glDrawPixels regression since `b63fe0552b`. The new quad-drawing utility code uses 3 vertex attributes (xyz, rgba, st). For glDrawPixels path we don't use the rgba attribute so there's a gap in the TGSI VS input declarations (INPUT[0] = pos, INPUT[2] = texcoord). The TGSI->VGPU10 translations code did not handle this correctly. I missed this because my VM was configured for HWv11 while testing. Another way to fix this would be to change the tgsi_scan.c code so that the tgsi_shader_info::num_inputs (and num_outputs) included the unused inputs/outputs. These counts would then actually be "max input register index + 1" rather than "number of used inputs". But that change could impact all drivers so put it off for now. No regressions found with piglit or typical GL apps. v2: also update alloc_system_value_index() to use info.file_max[] Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-18 15:46:17 -07:00
Oded Gabbay	a3e3c3e621	gallivm: Check whether to stop disassemble only for x86 Because the if statement that checks whether we have a return statement is valid only on x86, surround it with X86 or X86-64 arch defines Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 00:18:11 +02:00
Oded Gabbay	b3d42934a1	gallivm: use sstream for disassembling Currently, disassemble() directly prints to stdout. This has broken the profiling support for llvmpipe JIT code. This patch redirects the output to an sstream object, which then either gets printed to stdout (for assembly debugging) or gets written to a file in /tmp/ (for profiling support). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 00:18:11 +02:00
Rob Clark	93c62fdee9	trace: fix new gcc6 warnings src/gallium/drivers/trace/tr_context.c:1713:39: warning: ‘rbug_blocker_flags’ defined but not used [-Wunused-const-variable] static const struct debug_named_value rbug_blocker_flags[] = { ^~~~~~~~~~~~~~~~~~ Note that use of rbug_blocker_flags was removed in: commit `5494332128` Author: Jakob Bornecrantz <jakob@vmware.com> Date: Wed May 12 19:26:19 2010 +0100 trace: Remove rbug from trace Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	5051d85b03	gallium/auxiliary: fix new gcc6 warnings src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c: In function ‘mm_bufmgr_create_from_buffer’: src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:288:4: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] if(mm->map) ^~ src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:286:1: note: ...this ‘if’ clause, but it is not if(mm->heap) ^~ Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	bba836ea6a	gallium/hud: fix new gcc6 warnings src/gallium/auxiliary/hud/font.c:234:22: warning: ‘Fixed8x13_Character_159’ defined but not used [-Wunused-const-variable] static const GLubyte Fixed8x13_Character_159[] = { 9, 0, 0, 0, 0, 0, 0,170, 0, 0, 0,130, 0, 0, 0,130, 0, 0, 0,130, 0, 0, 0,170, 0, 0, 0, 0, 0}; ^~~~~~~~~~~~~~~~~~~~~~~ .... many more.. These are simply unused, just #if 0 them out for now, in case someone wants to use them in the future. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	7d5372bfe8	mesa: fix new gcc6 warnings src/mesa/main/texstore.c:92:22: warning: ‘map_1032’ defined but not used [-Wunused-const-variable] static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE }; ^~~~~~~~ src/mesa/main/texstore.c:91:22: warning: ‘map_3210’ defined but not used [-Wunused-const-variable] static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE }; ^~~~~~~~ src/mesa/main/texstore.c:90:22: warning: ‘map_identity’ defined but not used [-Wunused-const-variable] static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE }; ^~~~~~~~~~~~ These appear to be unused since: commit `8ec6534b26` Author: Iago Toral Quiroga <itoral@igalia.com> AuthorDate: Wed Oct 15 13:42:11 2014 +0200 mesa: Use _mesa_format_convert to implement texstore_rgba. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	b01575ec99	glsl: fix new gcc6 warnings src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump)’ defined but not used [-Wunused-function] lower_discard_flow_visitor::visit_enter(ir_loop_jump ir) ^~~~~~~~~~~~~~~~~~~~~~~~~~ The base class method that was intended to be overridden was 'visit(ir_loop_jump *ir)', not visit_enter(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	e93caca071	glsl: fix new gcc6 warnings src/compiler/glsl/ast_to_hir.cpp: In function ‘unsigned int ast_process_struct_or_iface_block_members(exec_list, _mesa_glsl_parse_state, exec_list, glsl_struct_field, bool, glsl_matrix_layout, bool, ir_variable_mode, ast_type_qualifier, unsigned int, unsigned int)’: src/compiler/glsl/ast_to_hir.cpp:6339:52: warning: ‘first_member_has_explicit_location’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (!layout->flags.q.explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ ((first_member_has_explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !qual->flags.q.explicit_location) \|\| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (!first_member_has_explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ qual->flags.q.explicit_location))) { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	e2060aaf57	i965: fix new gcc6 warnings src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp:244:1: warning: ‘void {anonymous}::fs_copy_prop_dataflow::dump_block_data() const’ defined but not used [-Wunused-function] fs_copy_prop_dataflow::dump_block_data() const ^~~~~~~~~~~~~~~~~~~~~ From looking at git history, it looks like this is intended to be unused (ie. just for adding on-demand debug prints) Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	a13442ac67	util: fix new gcc6 warnings src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined but not used [-Wunused-const-variable] static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u; ^~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Jason Ekstrand	698ea54283	anv/pipeline: Fix a typo in the pipeline layout code	2016-02-18 13:55:57 -08:00
Jason Ekstrand	d5bb23156d	anv/allocator: Set is_winsys_bo to false for block pool BOs	2016-02-18 13:55:57 -08:00
Kenneth Graunke	1c694a6c20	glcpp: Disallow "defined" as a macro name. Both GCC and Clang disallow this, and glslang has recently started disallowing it as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94188 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 13:38:50 -08:00
Mark Janes	1b37276467	vulkan: fix out-of-tree build We need to be able to find the generated gen*pack.h headers. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-18 12:30:27 -08:00
Jason Ekstrand	e0565f40ea	anv/pipeline: Use nir's num_images for allocating image_params	2016-02-18 11:44:26 -08:00
Jason Ekstrand	79c0781f44	nir/gather_info: Count textures and images	2016-02-18 11:42:36 -08:00
Samuel Pitoiset	dfc95ad6d1	gallium/cso: only enable compute shaders when TGSI is supported Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94186 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-18 20:41:25 +01:00
Jason Ekstrand	e881c73975	anv/pipeline: Don't leak the binding map	2016-02-18 11:09:30 -08:00
Jason Ekstrand	8c23392c26	anv/formats: Don't use a compound literal to initialize a const array Doing so makes older versions of GCC rather grumpy. Newer GCC fixes this, but using a compound literal isn't really gaining us anything anyway.	2016-02-18 10:44:08 -08:00
Jason Ekstrand	9851c8285f	Move the intel vulkan driver to src/intel/vulkan	2016-02-18 10:37:59 -08:00
Jason Ekstrand	47b8b08612	Move isl to src/intel	2016-02-18 10:34:47 -08:00
Jason Ekstrand	f6d9587688	vulkan: Move XML and generator into src/intel/genxml	2016-02-18 10:30:29 -08:00
Kristian Høgsberg Kristensen	542c38df36	anv/meta: Initialize blend state for the right attachment We were always initializing only RT 0. We need to initialize the RT we're creating the clear pipeline for.	2016-02-18 10:22:50 -08:00
Kristian Høgsberg Kristensen	05f75a3026	anv/meta: Don't use the blit ds layout in resolve code	2016-02-18 10:22:50 -08:00
Rob Herring	5c7f97426d	Android: disable unused-parameter warning Android builds with -Wunused-parameter enabled which results in spewing lots of warnings. Disable it so more meaningful warnings are more visible. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	7efc273df1	Android: enable building on arm64 Use the LOCAL_CFLAGS_{32/64} instead of arch specific variants to define the DEFAULT_DRIVER_DIR. This enables building for arm64. Cc: Chih-Wei Huang <cwhuang@android-x86.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	1f53a57b2f	Android: Fix building secondary arch in mixed 32/64-bit builds TARGET_CC is not defined for the secondary arch on combined 32/64-bit builds. The build system uses 2ND_TARGET_CC instead and it is not meant to be used in module makefiles. LOCAL_CC was used to provide C only flags as -std=c99 is not valid for C++ files. Since Android 4.4, LOCAL_CONLYFLAGS was added to set compiler flags on C files only, so it can be used now instead of LOCAL_CC. This will break on pre-4.4 versions of Android, but it is unlikely anyone is using current Mesa with such an old version of Android. Cc: Chih-Wei Huang <cwhuang@android-x86.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	ba06ea1a37	egl: android: clean-up config attribute setting Pass the additional config attributes to dri2_add_config to set them instead of open coding them. This is in preparation to add more attributes. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Varad Gautam	e35c5af337	egl: android: fix visuals declaration Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	64d2f398f6	Android: fix build break in libmesa_program Commit `5fd848f6c9` ("program: Use _mesa_geometric_samples to calculate gl_NumSamples") broke Android builds. Add the missing include path "main" to framebuffer.h like other includes in prog_statevars.c. Cc: Neil Roberts <neil@linux.intel.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Ilia Mirkin	12e3ad2ae9	mesa: gl_NumSamples should always be at least one From ARB_sample_shading: "gl_NumSamples is the total number of samples in the framebuffer, or one if rendering to a non-multisample framebuffer" So make sure to always pass in at least 1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O`Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2016-02-18 12:35:28 -05:00
Plamena Manolova	65dfb3048e	compiler/glsl: Fix uniform location counting. This patch moves the calculation of current uniforms to link_uniforms, which makes use of UniformRemapTable which stores all the reserved uniform locations. Location assignment for implicit uniforms now tries to use any gaps left in the table after the location assignment for explicit uniforms. This gives us more space to store more uniforms. Patch is based on earlier patch with following changes/additions: 1: Move the counting of explicit locations to check_explicit_uniform_locations and then pass the number to link_assign_uniform_locations. 2: Count the number of empty slots in UniformRemapTable and store them in a list_head. 3: Try to find an empty slot for implicit locations from the list, if that fails resize UniformRemapTable. Fixes following CTS tests: ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696	2016-02-18 11:53:35 +02:00
Jason Ekstrand	40c76d4efa	Delete nir_lower_samplers.cpp Somehow, in one of the merges with mesa master, the old file must have been kept when nir_lower_samplers.cpp was moved to nir_lower_samplers.c.	2016-02-17 20:16:11 -08:00
Roland Scheidegger	d335b6abc0	gallivm, tgsi: provide fake sample_i_ms implementations Just like the rest of the msaa "implementation" it's just fake for now... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-18 05:00:03 +01:00
Brian Paul	06d3b0a006	st/mesa: new st_DrawAtlasBitmaps() function for drawing bitmap text This basically saves the current pipeline state, sets up state for rendering, constructs a set of textured quads, renders, then restores the previous pipeline state. It shouldn't be hard to implement a similar function for non-gallium drivers. With some code refactoring, the vertex definition code could probably be shared. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-17 19:57:48 -07:00
Brian Paul	b26ddda12f	mesa: implement a display list / glBitmap texture atlas This improves the performance of applications which use glXUseXFont() or wglUseFontBitmaps() and glCallLists() to draw bitmap text. Basically, we collect all the glBitmap images from the display lists and put them into a texture atlas. To render the bitmaps for a glCallLists() command, we render a set of textured quads where each quad is textured with one bitmap image. Actually, the rendering part has to be done by the Mesa driver or Mesa/gallium state tracker. Note that GLUT demos that use glutBitmapCharacter() don't benefit from this. v2, per Nicolai Hähnle: - check the max tex rect size is at least 1024. - add comment in dd.h that texture_rectangle is required. - in _mesa_DeleteLists(), try to delete the atlas before the list(s) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-17 19:57:48 -07:00
Ilia Mirkin	6f4a725073	st/mesa: apply DepthMode swizzle to stencil texturing as well Gallium doesn't present these as GL_RED-style. A swizzle is necessary to present the proper data in the unused components. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-17 21:20:24 -05:00
Jason Ekstrand	005b9ac758	anv: Gut anv_pipeline_layout Almost none of the data in anv_pipeline_layout is used anymore thanks to doing real layout in the pipeline itself.	2016-02-17 18:04:40 -08:00
Kristian Høgsberg Kristensen	c2581a9375	anv: Build the real pipeline layout in the pipeline This gives us the chance to pack the binding table down to just what the shaders actually need. Some applications use very large descriptor sets and only ever use a handful of entries. Compacted binding tables should be much more efficient in this case. It comes at the down-side of having to re-emit binding tables every time we switch pipelines, but that's considered an acceptable cost.	2016-02-17 18:04:39 -08:00
Jason Ekstrand	581e4468f9	nir/spirv: Add some more capabilities	2016-02-17 18:04:39 -08:00
Jason Ekstrand	fed8b7f817	anv/pipeline: Delete out-of-bounds fragment shader outputs	2016-02-17 18:04:39 -08:00
Jason Ekstrand	979732fafc	nir: Add a helper for getting the one function from a shader	2016-02-17 18:04:39 -08:00
Jason Ekstrand	8c05b44bbb	nir: Add a nir_foreach_variable_safe helper	2016-02-17 18:04:39 -08:00
Jason Ekstrand	d67d84f5e5	i965/nir: Do lower_io late for fragment shaders	2016-02-17 18:04:39 -08:00
Jason Ekstrand	7c26d8d471	anv/gen7_pipeline: Set WriteDisable = true if we have no color attachments	2016-02-17 18:04:39 -08:00
Jason Ekstrand	9f9cd3de44	anv/gen8_pipeline: Default color attachments to WriteDisable = true	2016-02-17 18:04:39 -08:00
Jason Ekstrand	da9fd74d34	anv: Pull StencilBufferWriteEnable from both sides	2016-02-17 18:04:39 -08:00
Nanley Chery	9963af8bbd	anv: Ignore unused dimensions in vkCreateImage's anv_image We ignore unused dimensions in the isl surface; do the same for the resulting anv_image. Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2016-02-17 17:32:26 -08:00
Ben Widawsky	20e8ee3662	i965/skl: Update Skylake renderer strings Also adds some of the Iris/Pro parts which we previously didn't have named. v2: 0x192d is gt3, not gt4 Adding some 'e' tags for eDRAM parts Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com>	2016-02-17 16:50:59 -08:00
Ben Widawsky	644c8a5151	i965/skl: Add two missing device IDs The Iris part is left unbranded because we did not have these with original SKL. v2: 0x192d is gt3, not gt4 v3: Forgot to update the temporary brand string when I did v2. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com>	2016-02-17 16:50:59 -08:00
Ilia Mirkin	f3cd62a765	mesa: allow multisampled format info to be returned on GLES 3.1 The restriction on multisampled integer texture formats only applies to GLES 3.0, so don't apply it to GLES 3.1 contexts. This fixes a slew of dEQP-GLES31.functional.state_query.internal_format.* tests, which now all pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-17 19:30:40 -05:00
Kristian Høgsberg Kristensen	b8da261dc7	spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse "Result is the same as computing the sum of the absolute values of OpDPdx and OpDPdy on P." We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.	2016-02-17 15:28:52 -08:00
Kristian Høgsberg Kristensen	ae3e249d57	anv: Remove hacky PIPE_CONTROL in vkCmdEndRenderPass() The vkCmdPipelineBarrier() command should work as intended now and we need to pull the plug on this old hack.	2016-02-17 15:19:07 -08:00
Kristian Høgsberg Kristensen	5e92e91c61	anv: Rework vkCmdPipelineBarrier() We don't need to look at the stage flags, as we don't really support any fine-grained, stage-level synchronization. We have to do two PIPE_CONTROLs in case we're both flushing and invalidating. Additionally, if we do end up doing two PIPE_CONTROLs, the first, flushing one also has to stall and wait for the flushing to finish, so we don't re-dirty the caches with in-flight rendering after the second PIPE_CONTROL invalidates.	2016-02-17 15:18:06 -08:00
Ben Widawsky	2bf041d94f	i965: Extract push constant state to a new file Every stage has a corresponding 3DSTATE_CONSTANT_XS packet, so having the code to create and emit push constant buffers in genX_vs_state.c is a little strange. Moving it to a separate file seems more logical. v2 [Ken]: Rebase on master, explain motivation in the commit message. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-17 12:34:23 -08:00
Matt Turner	0e9dc59a58	i965: Make emit_minmax return an instruction*. And use it in brw_fs_nir.cpp.	2016-02-17 12:35:27 -08:00
Matt Turner	2f2c00c727	i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-17 12:35:27 -08:00
Matt Turner	378d98f87e	i965/vec4: Initialize force_writemask_all in vec4_builder(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-17 12:35:27 -08:00
Kristian Høgsberg Kristensen	3b9b908054	anv: Ignore unused dimensions in vkCreateImage We would assert on unused dimensions (eg extent.depth for VK_IMAGE_TYPE_2D) not being 1, but the specification doesn't put any constraints on those. For example, for VK_IMAGE_TYPE_1D: "If imageType is VK_IMAGE_TYPE_1D, the value of extent.width must be less than or equal to the value of VkPhysicalDeviceLimits::maxImageDimension1D, or the value of VkImageFormatProperties::maxExtent.width (as returned by vkGetPhysicalDeviceImageFormatProperties with values of format, type, tiling, usage and flags equal to those in this structure) - whichever is higher" We'll fix up the arguments to isl to keep isl strict in what it expects.	2016-02-17 12:21:51 -08:00
Kristian Høgsberg Kristensen	b63e28c0e1	anv: Set correct write domain on window system BOs We need to make sure GEM understands that we're writing to the BO, in case it needs to synchronize with other rings (blitter use in display server, for example).	2016-02-17 11:19:56 -08:00
Tom Stellard	dc7cf07af3	radeon/llvm: Add TargetLibraryInfo to the pass manager This will prevent optimization passes from introducing unsupported library calls. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Tom Stellard	4f351a6cb1	radeon/llvm: Set the target triple on the module Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Tom Stellard	77f4e1c7ff	gallivm: Add helpers for creating and destroying TargetLibraryInfo This functionality is not exposed via the LLVM C API. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Samuel Pitoiset	cfd1dd0500	nvc0: invalidate all buffers when switching pipe contexts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-17 21:14:24 +01:00
Ilia Mirkin	49c67926c7	st/mesa: fix up result_src.type when doing i2u/u2i conversions Even though it's a no-op, it's important to keep track of the type so that we can pick the properly-signed op later on. This fixes dEQP-GLES3.functional.shaders.precision.uint.highp_div_fragment, which ended up using IDIV instead of UDIV. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-17 13:30:33 -05:00
Brian Paul	5e52df2198	st/mesa: use cso_set_viewport_dims() in try_pbo_upload_common() Note that this results in a different transformation for the viewport's Z axis (depth range), but that doesn't matter for this case. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-17 11:25:02 -07:00
Jordan Justen	9a939ebb47	i965/gen7: Use predicated rendering for indirect compute On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect dispatch is used, but one of the dimensions is 0. Therefore we use predicated rendering on the GPGPU_WALKER command to handle this case. Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size From the ARB_compute_shader spec, under DispatchCompute: "If the work group count in any dimension is zero, no work groups are dispatched." And then for DispatchComputeIndirect: ... "is equivalent (assuming no errors are generated) to calling DispatchCompute with <num_groups_x>, <num_groups_y> and <num_groups_z>" ... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-17 09:25:47 -08:00
Rob Clark	37d540ba70	freedreno: expose time-elapsed query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	ba194630cc	freedreno/a4xx: implement time-elapsed query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	62fa868728	freedreno/a4xx: better occlusion/sample counting This seems to give more reliable results. More similar to what we do on a3xx, although I think it breaks the a3xx theory that the four sets of results map to each MRT (since we appear to still only have four sets on a4xx). The divide-by-two is a bit odd, but seems to be needed for some reason. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	87eb406791	freedreno/query: fix refcnt'ing issue Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	0e91dccf9c	freedreno/query: some queries don't have ->begin_query() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	9d23d7b7cb	freedreno/query: align counter snapshot locations Some hw queries need their sample memory locations to have certain alignment. At the moment that isn't an issue, since the only hw query is occlusion, so all samples have the same size. But when others are added with different sample sizes, this starts to be a problem. All current and immediately upcoming hw queries simply need their sample address aligned to their size, so let's use that for now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	8529e210ec	freedreno/query: add optional enable hook Add enable hook for hw query providers. Some will need to configure perfctr selector registers, which we want to do at the start of the submit. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	45ab5b1c34	freedreno: query max gpu freq This will be needed to support converting from cycle counts to time for performance related queries (initially time-elapsed, but there are some additional performance counters that could be wired up). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	dcb69185a0	freedreno: update generated headers Mostly to pull in perf ctrs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	2a7ceb5957	freedreno/ir3: fix new gcc6 errors src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c: In function ‘emit_tex’: src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c:1368:26: warning: unused variable ‘const_off’ [-Wunused-variable] struct ir3_instruction *const_off[4]; ^~~~~~~~~ unused since: commit `8750299a42` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Tue Feb 9 14:51:28 2016 -0800 nir: Remove the const_offset from nir_tex_instr Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-17 10:41:55 -05:00
Kristian Høgsberg Kristensen	5caa995c32	Revert "anv: Disable snooping for allocator pools again" This reverts commit `c136672c59`. We still have the intermittent missing flush for VkEvent in certain vulkancts cases: piglit.deqp-vk.api.command_buffers.execute_large_primary piglit.deqp-vk.api.command_buffers.submit_count_non_zero, Let's reenable the snooping until we figure out the root cause.	2016-02-16 23:23:49 -08:00
Kristian Høgsberg Kristensen	ecc67f1aac	anv: Make driver and icd file installable Change the name of the .so to libvulkan_intel.so and add an installable icd with the installed paths. Keep the icd file with build-tree paths, but rename to dev_icd.json to make it clear that it's for development purposes.	2016-02-16 23:23:17 -08:00
Kristian Høgsberg Kristensen	4a2d17f606	anv: Revise PhysicalDeviceFeatures and remove FINISHME	2016-02-16 15:43:12 -08:00
Karol Herbst	edf774bb7e	nv50/ir: we can't do the add to mad conversion when the mul saturates Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Karol Herbst	068e9848ba	nv50/ir: optimize neg(and(set, 1)) to set helps shaders in saints row IV, bioshock infinite and shadow warrior total instructions in shared programs : 1914931 -> 1903900 (-0.58%) total gprs used in shared programs : 247920 -> 247785 (-0.05%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes used in shared programs : 17558272 -> 17457320 (-0.57%) local gpr inst bytes helped 0 137 719 719 hurt 0 12 0 0 v2: remove this opt for OP_SLCT and check against float for OP_SET v3: simplified the code Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Ilia Mirkin	ca23c8081f	nv50/ir: fix quadop emission in the presence of predication When there's a predicate, it just goes onto the sources list. If the quadop only has a single regular source, we will end up thinking that the predicate is the second source. Check explicitly for the predSrc so that we don't accidentally emit the wrong thing. This fixes a bunch of dEQP-GLES3.functional.shaders.derivate.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-16 18:20:10 -05:00
Ilia Mirkin	1d1ddfe5f8	nv50,nvc0: enable/disable seamless cubemap texturing as requested In a situation where the seamless setting isn't available on a per-texture basis (G200+ Teslas, and all Fermis), assume that all samplers will have it identically set, and enable accordingly. This fixes arb_seamless_cubemap piglit test on Fermi and Tesla. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Philipp Zabel	ecd1d94d1c	anv: pCreateInfo->pApplicationInfo parameter to vkCreateInstance may be NULL Fix a NULL pointer dereference in anv_CreateInstance in case the pApplicationInfo field of the supplied VkInstanceCreateInfo structure is NULL [1]. [1] https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#VkInstanceCreateInfo Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>	2016-02-16 14:42:26 -08:00
Rob Clark	d49307435a	st/mesa: add missing ETC2 entries to format_map Noticed by Ilia when I was trying to figure out why some app was failing to use ETC2. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:53:43 -05:00
Samuel Pitoiset	3d5f61a262	nvc0: enable compute support on GK110:GM200 with an envvar Without this NVF0_COMPUTE environment variable, compute support is initialized by default and this is not what we want for now because it might break 3D. It will be enabled by default once we are sure it won't break anything. Please note that compute support on GM200+ is not enabled yet because it needs to be double-checked. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Samuel Pitoiset	6d74fa5756	nvc0: add compute support for GM107 Fortunately, compute support on GM107 is very close to GK110, except the GK110_COMPUTE.UNK02C4 which is invalid and should not be used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Samuel Pitoiset	bc331dd838	nvc0: fix compute state initialization on GK110+ Because our firmware doesn't support the GK110_COMPUTE.FIRMWARE[0x6] method, the GPU hangs when it is used. Removing it fixes the issue and allows compute shaders to be launched on GK110+. Tested on GK208 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Timothy Arceri	a61823b584	glsl: remove duplicate interpolation_string() function We already have one in the IR code that can be used everywhere it's needed in the AST code, so remove the one from the AST. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-17 07:26:38 +11:00
Timothy Arceri	e70ece4eea	glsl: remove unused helper Seems to have become unused when i965 moved to NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-17 07:25:10 +11:00
Timothy Arceri	07e6a37332	glsl: set user defined varyings to smooth by default in ES This is usually handled by the backends in order to handle the various interactions with the gl_*Color built-ins. The problem is this means linking will fail if one side of the interface adds the smooth qualifier to the varying and the other side just uses the default, even though they match. This fixes various deqp tests. The spec is not clear what to do for desktop GL, so leave it as is for now. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743	2016-02-17 07:23:49 +11:00
Samuel Pitoiset	f638512890	gm107/ir: add ATOM CAS emission This fixes the following dEQP test and the other compswap variants. dEQP-GLES31.functional.ssbo.atomic.compswap.highp_int Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 20:53:39 +01:00
Samuel Pitoiset	09446cf5f6	st/mesa: do not init limits when compute shaders are not supported When the number of uniform blocks is less than 12, ARB_uniform_buffer_object can't be enabled and the maximum GL version is not even 3.1... This fixes a regression introduced in `7c79c1e` (st/mesa: add compute shader state) if the maximum number of uniform blocks allowed for compute shaders is less than 12. This happens on Kepler but this might also affect other Gallium drivers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-02-16 20:53:35 +01:00
Jordan Justen	f28d80fabf	mesa: Don't call driver when there is no compute work The ARB_compute_shader spec says: "If the work group count in any dimension is zero, no work groups are dispatched." Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 09:25:20 -08:00
Jordan Justen	8514c75a26	i965: Set compute shader shared memory max to 64k See Ivy Bridge PRM, Volume 2, Part 2, 1.8.4 INTERFACE_DESCRIPTOR_DATA: DWORD 5, bits 20:16: "This field indicates how much shared local memory the thread group requires. The amount is specified in 4k blocks, but only powers of 2 are allowed: 0, 4k, 8k, 16k, 32k and 64k per half-slice." For Haswell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA: DWORD 5, bits 20:16: With text identical to the Ivy Bridge PRM. For Broadwell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA: DWORD 6, bits 20:16: With text identical to the Ivy Bridge PRM. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 09:25:20 -08:00
Brian Paul	f90801cd40	st/mesa: use new CSO_BITS_ALL_SHADERS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-16 10:22:32 -07:00
Brian Paul	1bf8fa8277	cso: add CSO_BITS_ALL_SHADERS For saving/restoring all shader stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-16 10:22:32 -07:00
Brian Paul	a0636157c4	st/mesa: simplify st->ctx, ctx->st usage in various places	2016-02-16 10:22:32 -07:00
Brian Paul	5239832cf1	st/mesa: use _mesa_geometric_width/height() in glDrawPixels code Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 10:22:32 -07:00
Brian Paul	b92d48fb6b	st/mesa: rename attr variable in st_DrawTex() Rename to 'tex_attr' to be a bit more clear. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	5ce1f1245d	st/mesa: use 'cso' instead of 'st->cso_context' in st_DrawTex() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	79ffe94c8b	st/mesa: fix whitespace and add comment in st_DrawTex() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	4277618235	st/mesa: use _mesa_num_tex_faces() in st_finalize_texture() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	ffa1a1dd21	cso: make most of the cso_save/restore_x() functions static Users of the CSO save/restore facility all use the new cso_save/restore_state() functions instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	223ffd8a08	postprocess: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	70e8a4f734	gallium/hud: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	66889d8f84	gallium/util: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	38db9a4e26	st/mesa: use cso_save/restore_state() in st_cb_texture.c This simplifies the error handling code too. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	33fc248606	st/mesa: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	9403571755	cso: add new cso_save/restore_state() functions cso_save_state() takes a bitmask of state items to save. Calling cso_restore_state() restores those states. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	017a003f1c	cso: remove comment There's a similar comment just a few lines before.	2016-02-16 10:22:32 -07:00
Brian Paul	347b9418ac	st/mesa: use new cso_set_viewport_dims() helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	f7af12ae85	cso: add new cso_set_viewport_dims() helper To simplify some viewport setting code in the state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	f88c859cd3	st/mesa: use 'cso' local var instead of st->cso_context Just a little cleaner. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	d7d4fe90c4	st/mesa: consolidate quad drawing code The glClear, glBitmap and glDrawPixels code now use a new st_draw_quad() helper function. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	b63fe0552b	st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap Define a new st_util_vertex structure which is a bit smaller (9 floats versus the previous 12 floats per vertex). Clean up the glClear, glDrawPixels and glBitmap code that sets up the vertex data and does the drawing so it's all very similar. This can lead to more consolidation. v2: add assertion that vertex buffer slot == 0 to catch possible future change in cso_get_aux_vertex_buffer_slot() behavior. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:31 -07:00
Brian Paul	2b1535f82f	st/mesa: include u_draw.h, not u_draw_quad.h in st_draw.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:31 -07:00
Jason Ekstrand	48087cfc4e	anv/icd.json: Update the ABI version	2016-02-16 08:02:17 -08:00
Jason Ekstrand	0a3324e66c	anv: Pull Khronos stuff from the README	2016-02-16 07:43:21 -08:00
Jan Vesely	04085afcbf	configure: Bail out on llvm-config component error Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-16 10:09:33 -05:00
Matthew Dawson	0bba5ca468	Handle removal of LLVMAddTargetData in SVN revision 260919 LLVM removed LLVMAddTargetData for the 3.9 release in r260919. For the two places in mesa where this is called, only enable the lines when compiling for less than 3.9. For the radeon driver, I'm not sure how to check if any other LLVM calls need to be adjusted. I think since the target data used is extracted from the LLVMModule, it isn't necessary to pass it back to LLVM again. The code does compile, and at least for radeonsi does run OpenGL games. [ Michel Dänzer: Move #if closer to LLVMAddTargetData in lp_bld_init.c, and add HAVE_LLVM < 0x0309 guards around now unused occurrences of TD and data_layout ] Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-16 16:18:35 +09:00
Topi Pohjolainen	7287cc8440	i965: Expose logic telling if non-msrt mcs is supported Also use the opportunity to mark inputs constant. (Context has to be given as read-write to intel_miptree_supports_non_msrt_fast_clear() to support debug output). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	dd37b6aaa9	i965/gen9: Refactor msrt mcs initialization This will be re-used to initialize auxiliary buffers in lossless compression case. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	2bd58790e2	i965: Add a few assertions on lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	56f29911ec	i965: Add a flag telling color resolve pass to ignore CCS_E v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	97f4ca90b8	i965: Add resolve option for lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	0e79bff957	i965: Allow fast clear to be used with lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression. v3 (Ben): Squash with "i965: Resolve color buffer also in lossless compression case" and clarify simple non-compressed fast clear case. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	4b801116d3	i965: Add helper for detecting lossless compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:23 +02:00
Topi Pohjolainen	36b7c0dad9	Revert "i965: Restore vbo after color resolve during brw_try_draw_prims()" This got pushed accidentally in the first place but wasn't reverted as it didn't regress piglit but instead fixed one newly introduced test exercising a corner case in the i965 driver. However, saving and restoring vertex buffer context is complicated and requires more thought. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94150 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Tapani Palli <tapani.palli@intel.com>	2016-02-16 08:52:14 +02:00
Ben Skeggs	33ace5544e	nvc0: initial support for GM20x GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:16 +10:00
Ben Skeggs	97fc3fd559	nvc0: implement support for maxwell texture headers Adds support for the new TIC layout that's present on Maxwell GPUs, heavily based on the code for the existing layout. This code is required for GM20x support. While GM10x supports the older layout still, this commit switches it to use the updated version instead. Piglit testing shows zero regressions on GM107. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:13 +10:00
Ben Skeggs	7333b0c20c	nvc0: import maxwell texture header definitions from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:10 +10:00
Ben Skeggs	733c8f8c73	nv50-: split tic format specification We previously stored texture format information as it would appear in the TIC. We're about to support the new TIC layout that appeared with Maxwell, so it makes more sense to store the data in a split-out format. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:07 +10:00
Ben Skeggs	a928cbc205	nv50-: remove nv50_texture.xml.h Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:05 +10:00
Ben Skeggs	ff1af29dd9	nvc0: switch nvc0_tex.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:03 +10:00
Ben Skeggs	c999736c18	nvc0: switch nvc0_surface.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:02 +10:00
Ben Skeggs	63880dca12	nv50: switch nv50_tex.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:00 +10:00
Ben Skeggs	a15c08c95c	nv50: switch nv50_surface.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:58 +10:00
Ben Skeggs	59d93ad1be	nv50: switch nv50_state.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:56 +10:00
Ben Skeggs	1a45b7afb6	nv50-: switch nv50_formats.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:54 +10:00
Ben Skeggs	d5ac81295d	nv50: import updated g80_texture.xml.h from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:52 +10:00
Ben Skeggs	7235b6250d	nv50-: remove nv50_defs.xml.h Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:50 +10:00
Ben Skeggs	b04b16754c	nv50-: switch nv50_formats.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:48 +10:00
Ben Skeggs	3444f83077	nv50-: improved macros to handle format specification Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:45 +10:00
Ben Skeggs	346d7a24ea	nv50-: separate vertex formats from surface format descriptions We've previously had identical naming between vertex and texture formats, so it mostly made sense to define these together. However, upcoming patches are going to transition the driver over to using updated texture header definitions using NVIDIA's naming, and this will no longer be the case. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:42 +10:00
Ben Skeggs	3e2dd50d81	nvc0: remove unnecessary includes Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:40 +10:00
Ben Skeggs	e8eda47898	nvc0: switch nvc0_tex.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:38 +10:00
Ben Skeggs	546ccf3f82	nvc0: switch nvc0_surface.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:36 +10:00
Ben Skeggs	0a0d8e4497	nv50: remove unnecessary include Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:33 +10:00
Ben Skeggs	9c4b7748db	nv50: switch nv50_transfer.c to g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:31 +10:00
Ben Skeggs	577eeb7984	nv50: switch nv50_tex.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:29 +10:00
Ben Skeggs	114d41feb2	nv50: switch nv50_surface.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:27 +10:00
Ben Skeggs	413cc25753	nv50: import updated g80_defs.xml.h from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:12 +10:00
Nicolai Hähnle	2de9317d5f	st/mesa: count shader images in MaxCombinedShaderOutputResources Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:34 -05:00
Ilia Mirkin	1edbe0157d	st/mesa: enable GL image extensions when backend supports them This enables ARB_shader_image_load_store and ARB_shader_image_size when the backend claims support for these. It will also implicitly enable the image component of ARB_shader_texture_image_samples. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	2e0a84208b	st/mesa: convert GLSL image intrinsics into TGSI Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	672257dc69	st/mesa: allow st_format.h to be included from C++ files Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Nicolai Hähnle	ef27190a34	st/mesa: set pipe_image_view layers correctly for 3D textures Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:33 -05:00
Nicolai Hähnle	f1b0bda6bc	st/mesa: call st_finalize_texture from image atoms Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	78093167b1	st/mesa: add an image atom for shader images Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	e2a1ec5f0f	tgsi: show textual format representation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	9fbfa1abb2	gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGES Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	bceff68114	gallium: make image views non-persistent objects Make them akin to shader buffers, with no refcounting/etc. Just used to pass data about the bound image in ->set_shader_images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	cfbf25ac8f	st/mesa: empty buffer binding if the buffer's not really there This can happen with 0-sized buffers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-15 22:22:33 -05:00
Kristian Høgsberg Kristensen	a3672a241b	anv/genxml: Include MBO bits for gen7 and gen75	2016-02-15 17:57:03 -08:00
Kristian Høgsberg Kristensen	c2b2ebf1ed	anv: Add missing gen75_cmd_buffer_set_subpass() prototype	2016-02-15 17:40:15 -08:00
Adam Jackson	80ec20351c	anv: Bump to 1.0.3 Probably this should be picked up from <vulkan.h> directly, or we should just assume that any 1.0.x is legal.	2016-02-15 17:38:26 -08:00
Kristian Høgsberg Kristensen	b53edea76c	anv/gen7: Make disabling the FS work We disable the fragment shader for depth/stencil-only pipelines. This commit makes that work for gen7.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	85f67cf16e	anv: Deduplicate render pass code This lets us share the renderpass code and depth/stencil state code between gen 7 and gen 8.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	ac4fd0ed21	anv/gen7: Fix pipeline selection in init_device_state() We need the 3D pipeline for the initial setup, not GPGPU.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	ea694637ac	anv/gen7: Set 3DSTATE_SF depth buffer format correctly We need to pull this from the render pass information at state flush time.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	18dd59538b	anv/gen7: Call flush_pipeline_select_3d() from CmdBeginRenderPass	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	832f73f512	anv: Share flush_pipeline_select_3d() between gen7 and gen8	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	53eaa0a6b8	anv: Fix warning 3DSTATE_VERTEX_ELEMENTS setup This is a little more subtle. If elem_count is 0, nothing else happens in this function, so we return early to avoid warning about uninitialized 'p'.	2016-02-15 17:32:07 -08:00
Kristian Høgsberg Kristensen	5d72d7b12d	anv: Fix misc simple warnings	2016-02-15 17:32:07 -08:00
Rhys Kidd	76e2af3dd4	docs: Document VC4_DEBUG envvar Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Rhys Kidd	aa82cc4b22	vc4: Add missing braces in initializer Silences the following GCC warning: mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c: In function 'qir_schedule_instructions': mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c:578:16: warning: missing braces around initializer [-Wmissing-braces] struct schedule_state state = { 0 }; ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Rhys Kidd	c75ced3623	vc4: Correct typo setting 'handled_qinst_cond' Variable was previously always set to true. Accordingly, the later assert() served no active purpose. Found with GCC warning and code inspection: mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function 'vc4_generate_code': mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable] bool handled_qinst_cond = true; ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Eric Anholt	655fa0f465	vc4: Don't treat conditional MOVs as raw MOV. The two consumers want to know that the destination will be exactly the source, which is not true if we might not set the destination. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Timothy Arceri	00a1bd13b5	glsl: warn in GL as well as ES when varying not written Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93339	2016-02-16 11:15:43 +11:00
Ilia Mirkin	6d39075c06	docs: update GLES 3.1 section for recent nvc0 additions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 17:43:37 -05:00
Jason Ekstrand	08ecd8a8d1	anv/meta_resolve: Set origin_upper_left on gl_FragCoord It's required by the spec and any shaders that don't set it will be broken. I'm not really sure how multisampling was even working before...	2016-02-15 12:45:03 -08:00
Ilia Mirkin	4360ba0caf	mesa: need to check resource and set length even if bufSize is 0 This fixes a number of dEQP tests, such as: dEQP-GLES31.functional.program_interface_query.buffer_limited_query.resource_query It was expecting the length to be set even in the bufSize == 0 case. Also _mesa_get_program_resourceiv does some error checking on the resource which should probably happen even in the bufSize == 0 case as well although there's no dEQP test for that. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-02-15 12:20:25 -05:00
Ben Widawsky	66c790720b	i965/bxt: Production thread counts v2: Forgot to squash in the comment removal Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-15 07:48:09 -08:00
Daniel Czarnowski	5d87a7c894	egl_dri2: NULL check for xcb_dri2_get_buffers_reply() Without the check, unsuccessful xcb_dri2_get_buffers_reply(...) causes segmentation fault in dri2_get_buffers. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-15 07:43:27 +02:00
Edward O'Callaghan	331f963b7e	nv50,nvc0: Remove duplicate logic from nvc0_set_framebuffer_state() We already have this logic in the gallium/util functions so let's reduce some entropy while here. V.2: Apply change to nv50 also as suggested by Samuel Pitoiset. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-14 23:56:54 +01:00
Samuel Pitoiset	cbf24a01dd	nv50: add missing PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-14 22:56:02 +01:00
Kenneth Graunke	8122d21d15	i965: Fix gl_DrawID in the vec4 backend. brw_draw_upload.c uploads VertexID/InstanceID first, then DrawID. So we need to assign the attribute mapping in that order as well. Fixes the following Piglit tests with the vec4 backend: - arb_shader_draw_parameters-drawid vertexid - arb_shader_draw_parameters-drawid-indirect basevertex Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-02-14 13:24:07 -08:00
Brian Paul	816c987b67	mesa: move assertion in _mesa_cube_face_target() Fixes piglit arb_texture_view-sampling-2d-array-as-2d-layer regression. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94134 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-14 09:16:22 -07:00
Serge Martin	a4cff1859e	clover: fix build failure since `bfd695e` Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-14 11:00:29 +01:00
Kenneth Graunke	565aa69970	glsl: Fix overflow of ImageAccess[] array. The ImageAccess array is statically sized to MAX_IMAGE_UNIFORMS: GLenum ImageAccess[MAX_IMAGE_UNIFORMS]; There was no bounds checking ensuring we don't overflow. Passing in a shader with too many uniforms would cause writes to extend into other fields, such as sh->NumImages. Later linker checks already handle reporting an error when there are too many images, so just avoid corrupting structures here. This rearranges the logic a bit to look more like the sampler case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 21:12:18 -08:00
Ilia Mirkin	6411444c36	mesa: default FixedSampleLocations to true when using a dummy image GL_ARB_texture_multisample and GLES 3.1 expect the initial value to be GL_TRUE. This fixes dEQP-GLES31.functional.state_query.texture_level.texture_2d_multisample_array.fixed_sample_locations_integer and a few related tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-13 23:41:28 -05:00
Jason Ekstrand	7410c60988	nir/types: Add more type constructor functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	f05f576803	nir/types: Add a few more glsl_type_is_ functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	914829f766	nir/types: Add helpers for working with sampler and image types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	d140b13fd5	nir/types: Add helpers for function types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	b9e94ad806	glsl/types: Expose glsl_struct_field and glsl_function_param to C Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	954d46184f	glsl/types: Add a helper for getting image types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	95ea9f7708	glsl/types: Add support for function types SPIR-V has a concept of a function type that's used fairly heavily. We could special-case function types in SPIR-V -> NIR but it's easier if we just add support to glsl_types. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	5ec6a65388	glsl/types: Add a bare "sampler" type This is to be used by SPIR-V for representing a sampler that isn't attached to any particular image. In SPIR-V, all of the interesting bits such as dimensionality, sampled type, etc. come from the image, the bare "sampler" type simply uses a sampled type of VOID and 0 values for the rest. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	ac089126b9	glsl/types: Rename sampler_type to sampled_type It's a bit more descriptive since it is the base type that you get when you sample from it. Also, the next commit adds a bare "sampler" type and we need glsl_type::sampler_type available for a public static member. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Vinson Lee	4ed4c1d921	llvmpipe: Do not use barriers if not using threads. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94088 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-13 14:42:05 -08:00
Francisco Jerez	9e30d66b7c	i965: Reupload push and pull constants when we get new shader image unit state. Fixes several of the "dEQP-GLES31.functional.image_load_storeload_storesingle_layer" dEQP tests that use image formats we implement using untyped surface messages. Cc: mesa-stable@lists.freedesktop.org Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-13 14:33:32 -08:00
Samuel Pitoiset	40fcb6b9f9	i965: fix MAX_COMPUTE_SHARED_SIZE constant value MAX_COMPUTE_SHARED_SIZE should be set to 32768. This fixes a regression introduced in `be27f77` (mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94139 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-13 23:13:31 +01:00
Samuel Pitoiset	7f0a19400e	nv50/ir: add missing SV_TID and SV_CTAID sysvals on GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 22:26:38 +01:00
Samuel Pitoiset	d11266aa06	nv50/ir: add MEMBAR emission for GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 22:06:15 +01:00
Alejandro Piñeiro	a150101125	docs: document MESA_GLES_VERSION_OVERRIDE envvar v2: Removed reference to FC not being an allowed suffix (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-13 20:21:06 +01:00
Samuel Pitoiset	b410ed9215	st/mesa: fix pipe_grid_info initializer Fixes MSVC build error which doesn't allow empty initializers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-13 17:08:24 +01:00
Samuel Pitoiset	628b0e8571	trace: add all compute related functions Changes from v3: - dump the TGSI compute program Changes from v2: - remove use of MALLOC() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:02 +01:00
Samuel Pitoiset	fe0b55f39e	st/mesa: implement limits for ARB_compute_shader According to the spec, this also increases the following minimum values: - MAX_COMBINED_TEXTURE_IMAGE_UNITS 96 (616), was 80 - MAX_UNIFORM_BUFFER_BINDINGS 72 (612), was 60 ARB_compute_shader is not enabled by default because images support is still not implemented yet. If you want to use it you need to set MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader. Changes from v2: - make use of the new PIPE_CAP_SHADER_SUPPORTED_IRS cap instead of enabling the extension when PIPE_CAP_COMPUTE is enabled. - query for PIPE_CAP_COMPUTE first - s/shader_supported_irs/compute_supported_irs/ - disable ARB_compute_shader and add a comment which explains why Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:02 +01:00
Samuel Pitoiset	8aa666981b	st/mesa: add compute program dispatch callbacks This state tracker implements DispatchCompute() and DispatchComputeIndirect(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:01 +01:00
Samuel Pitoiset	805d92e540	st/mesa: add state validation for compute shaders This binds atomics, constants, samplers, ssbos, textures and ubos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:01 +01:00
Samuel Pitoiset	61c87cd2c0	st/mesa: add mappings for compute shader sysvals LOCAL_INVOCATION_ID, WORK_GROUP_ID and NUM_WORK_GROUPS are respectively mapped to THREAD_ID, BLOCK_ID and GRID_SIZE. Changes from v2: - add assertions in st_translate_program() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	e8db4e4e0a	st/mesa: keep track of shared memory declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	dfa58f0ff0	st/mesa: add intrinsics for shared variables This adds GLSL intrinsics for load/store and atomic operations. Changes from v2: - use PROGRAM_MEMORY instead of PROGRAM_BUFFER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	44e04dc809	st/mesa: add conversion for compute shaders According to the spec, there are no predefined inputs nor any fixed-function outputs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	7c79c1e3e2	st/mesa: add compute shader states Changes from v2: - use as much common code as possible (eg. st_basic_variant) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:00:54 +01:00
Samuel Pitoiset	08c46025c8	st/mesa: add a second pipeline for compute Compute needs a new and different validation path. Changes from v2: - make use of unreachable() instead of assert() when the pipeline is invalid - move the st_pipeline enumeration to st_context.h instead of st_api.h Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	a8328e3a50	tgsi/ureg: add shared variables support for compute shaders This introduces TGSI_FILE_MEMORY for shared, global and local memory. Only shared memory is currently supported. Changes from v2: - introduce TGSI_FILE_MEMORY Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	5e09ac78e5	gallium: add PIPE_SHADER_CAP_SUPPORTED_IRS This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI. Changes from v2: - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	43f4420fba	gallium: add indirect compute parameters to pipe_grid_info Like indirect draw, we need to store a resource and an offset that needs to be 4 byte aligned. When indirect is used, the size of the grid (in blocks) is stored with three 32-bit integers. Changes from v2: - s/most values/block sizes/ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	bfd695e1d2	gallium: add a new interface for pipe_context::launch_grid() This introduces pipe_grid_info which contains all information to describe a launch_grid call. This will be used to implement indirect compute in the same fashion as indirect draw. Changes from v2: - correctly initialize pipe_grid_info for nv50/nvc0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	61ed09c7ea	gallium/cso: add support for compute shaders Changes from v2: - removed cso_{save,restore}_compute_shader() functions and the compute_shader_saved variable because disabling compute shaders for meta ops is not currently needed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	ffd9c7fd74	mesa: add PROGRAM_MEMORY This will be used for shared, global and local memory areas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	a9eb1327be	mesa: store shared size in gl_compute_program The size of shared variables needs to be stored in gl_compute_program in order to set up pipe_compute_state::req_local_mem. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	be27f772e8	mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE This will allow to query the underlying drivers for the maximum total storage size of all variables declared as <shared> with PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Ilia Mirkin	f2547883cf	mesa: make compute maximums reflect driver-provided values Looks like the various max's were never plumbed through. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Topi Pohjolainen	f709a08457	i965: Add means for limiting color resolves Until now there has been only one type of color buffer that needs to be resolved - namely single sampled fast clear. As even the sampler engine in GPU doesn't understand the associated meta data, the color values need to be always resolved prior to reading them. From SKL onwards there is a new scheme supported called the lossless compression of single sampled color buffers. This is something that is understood by the sampling engine and therefore resolving of these types of buffers is not necessary before sampling. This patch adds means to make the distinction when considering if resolve is needed. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:50:24 +02:00
Topi Pohjolainen	7513c5c782	i965: Refactor resolving of auxiliary mode Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:30:36 +02:00
Topi Pohjolainen	9002bcdb35	i965: Don't try to create aux buffer for non-msrt aux-buffer In addition to simply calling miptree_create() the higher level call intel_miptree_create() also considers if the buffer should be associated with an auxiliary buffer based on the given format. Here we are allocating an auxiliary buffer which in turn has such format that would mislead intel_miptree_create_layout() later on to try to associate the auxiliary buffer with an auxiliary buffer. To prevent this the actual buffer creation logic was split out into its own function. Let's invoke that instead. v2 (Ben): Do not signal msaa layout with explicit argument but using layout_flags instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:28:41 +02:00
Jason Ekstrand	88042b9f10	nir: Get rid of the C++ NIR_SRC/DEST_INIT macros These were originally added to reduce compiler warnings but aren't really needed. Getting rid of them reduces the diff between the Vulkan branch and master, so we might as well.	2016-02-12 21:35:02 -08:00
Ben Widawsky	5743fd9571	i965: Rename optimizer debug 00 filename This allows ls, and scripts to get the file names in the correct order of optimization. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-12 20:52:28 -08:00
Kenneth Graunke	c8b0020f2f	i965: Make brw_clear_cache NULL out stale program pointers. The L3 partitioning code tries to look at all programs - both render programs (VS/TCS/TES/GS/FS) and compute (CS). After calling brw_clear_cache, all prog_data pointers are invalid and point to freed data. The intention was that flagging the dirty bits for all programs would cause the next draw call to re-run the atoms for each program stage, uploading new programs and installing new, valid pointers. However, this doesn't quite work in our new multi-pipeline world. When drawing or dispatching a compute workload, we only consider the programs for the appropriate pipeline: drawing sets up VS/TCS/TES/GS/FS, but not CS, and vice versa. This leaves pointers dangling a bit longer than intended. The L3 configuration code tries to inspect the prog_data for all shader stages, so that we avoid having to reconfigure it when swapping back and forth between render and compute workloads. So we can't have dangling pointers. The fix is simple: have brw_clear_cache NULL out stale prog_data pointers, making it safe to inspect. The next L3 configuration pass will see either the render shaders or compute shader as missing for one go around, but will pick them up when both pipelines have run. In other words, we'll simply reconfigure L3 twice, which is safe, if a tiny bit wasteful - but then again, we just threw every compiled shader we had on the floor and started recompiling them from scratch, which is massively more wasteful, so it's not much of a concern. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jljusten@gmail.com>	2016-02-12 20:35:34 -08:00
Ilia Mirkin	f56b5de877	mesa: avoid segfault in GetProgramPipelineInfoLog when no length If there is no pipe info log, we would unconditionally deref length, which was only optionally there. _mesa_copy_string handles the source being null, as well as the length, so may as well just always call it. Fixes a segfault in dEQP-GLES31.functional.state_query.program_pipeline.info_log Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:50 -05:00
Ilia Mirkin	f82ff6207c	mesa: reset offset/size to 0 when removing atomic binding Similar to commit `dd9d2963d6` (mesa: AtomicBufferBindings should be initialized to zero.), we should reset these to zero when unbinding. This fixes a number of dEQP failures due to cross-test pollution. The tests properly unbound everything, but when querying the values again, the expectation was that they would be 0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	b7e246d89a	mesa: recognize enums GL_COLOR_ATTACHMENT8-31 as valid Similar as for AUX1-3, these enums aren't invalid (i.e. -1) but also not supported by mesa. Returning BUFFER_COUNT causes the proper error to be returned by ReadBuffer and other functions. This resolves some failures in dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.read_buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	a663aa2a37	mesa/clear: update ClearBufferfv error handling for GL 4.5 spec This fixes dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.clear_bufferfv and brings the logic up to spec with GL 4.5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	3a0051bea9	mesa/clear: update ClearBufferuiv error handling for GL 4.5 spec This fixes dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.clear_bufferuiv and brings the logic up to spec with GL 4.5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	758162923b	mesa/clear: simplify ClearBufferiv error handling Might as well handle everything in the same error call. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	86fd9d6b8e	mesa/clear: remove dead code handling ClearBufferiv(GL_DEPTH) There's a hunk above which sets INVALID_ENUM for GL_DEPTH unconditionally. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:48 -05:00
Ilia Mirkin	d33ef19479	mesa: allow DEPTH_STENCIL_TEXTURE_MODE queries in GLES 3.1 contexts This fixes dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.depth_stencil_mode_integer and a few related tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-02-12 18:22:48 -05:00
Kenneth Graunke	2a0fc82864	i915: include teximage.h To get _mesa_num_tex_faces() prototype.	2016-02-12 15:20:29 -08:00
Kristian Høgsberg Kristensen	c136672c59	anv: Disable snooping for allocator pools again The race we were seeing on cherryview was caused by the multi-submit problem with fences. We can now turn snooping off again and rely on clflush as we intended.	2016-02-12 15:11:31 -08:00
Kristian Høgsberg Kristensen	b0c30b77d4	anv: Submit fence bo only after all command buffers We were submitting the fence bo after each command buffer in a multi command buffer submit, causing us to occasionally complete the fence too early.	2016-02-12 15:08:09 -08:00
Brian Paul	320ccf710e	i965: include teximage.h To get _mesa_num_tex_faces() prototype.	2016-02-12 15:42:54 -07:00
Axel Davy	cc0114f30b	st/nine: Implement Managed vertex/index buffers We were implementing those the same way as the default pool, which is sub-optimal. The buffer is supposed to return a pointer to a ram copy when the user locks, and automatically update the vram copy when needed. v2: Rename NineBuffer9_Validate to NineBuffer9_Upload Rename validate_buffers to update_managed_buffers Initialize NineBuffer9 managed fields after the resource is allocated. In case of allocation failure, when the dtor is executed, This->base.pool is then rightfully set. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	77d6c11f8f	st/nine: Align stack for entry points For 32 bits, incoming stack is 4-byte aligned. We need to realign the stack to 16-byte at some point, or there are issues later (crash with SSE, llvm, etc). This patch chooses to align the stack at API entry points. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	d7a5468da9	st/nine: Drop path for ureg_NRM and ureg_CLAMP using MIN/MAX is fine instead of CLAMP. NRM doesn't exist anymore. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6b43f5b1d4	st/nine: Remove usage of SQRT in ff code SQRT is not supported everywhere, so replace it by RSQ + MUL and handle case <= 0. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	17078d92ea	st/nine: Fix stateblocks crashes with lights We had several issues of crashes with it. This should fix it. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6cba347530	st/nine: SCRATCH does support all formats Add new argument to d3d9_to_pipe_format_checked to be able to bypass format support checks. This argument is set to TRUE when the requested Pool is SCRATCH. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	dbcb4f46ad	st/nine: Add format checks to create_zs_or_rt_surface Returns INVALIDCALL when trying to create a surface of unsupported format. In practice, apps are supposed to check for format support before trying to create a render target of that format. However some badly behaving apps could just try to create the surface and deduce, if it failed, that the format wasn't supported. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	3a2e0c7784	st/nine: Support ATI1/ATI2 for CubeTexture Texture and CubeTexture use common code, and thus ATI1/ATI2 is already implemented for CubeTexture. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6c4774bbe4	st/nine: Clean pSharedHandle Texture ctors checks Clarify the behaviour and clean the checks Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	bb65b189f3	st/nine: Move texture creation checks We were having checks at both CreateTexture functions and in ctors. Move all CreateTexture checks to ctors. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	d973a525d3	st/nine: Clean useless code in texture9.c This->base.base.resource is worth NULL for SYSTEMMEM textures. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	36b4bb303c	st/nine: Do not set SHARED flag for shared textures. We do not support shared textures, thus no need to set the shared flag. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	77a5871c1d	st/nine: Do not set resource usage for SYSTEMMEM We do not create a resource for SYSTEMMEM textures, thus we do not need to set resource usage. The only exception is vertexbuffer SYSTEMMEM, since we do use a pipe resource for them. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Brian Paul	9675fb6c68	mesa: move _mesa_num_tex_faces() to teximage.h So it's near the other cube map helper functions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:38 -07:00
Brian Paul	6e09df24b5	mesa: simplify some code with new _mesa_cube_face_target() function Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:38 -07:00
Brian Paul	82db969ac0	mesa: add _mesa_cube_face_target() helper Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:24 -07:00
Brian Paul	d73f5a3133	mesa: make _mesa_tex_target_to_face() an inline function Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:10:37 -07:00
Brian Paul	6a08673c5e	mesa: remove _ARB suffix from cube map enums Just minor clean-up so we're consistent everywhere. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:10:15 -07:00
Brian Paul	ae70d0d68c	docs: Visual Studio 2013 or later is now required	2016-02-12 15:08:35 -07:00
Timothy Arceri	4e59362d1b	glsl: replace _strtoui64() with strtoull() for MSVC Now that MSVC 2013 is required we can remove this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-13 08:57:01 +11:00
Kristian Høgsberg Kristensen	39a120aefe	anv: Implement VkPipelineCache We hash the input SPIR-V, specialization constants, entrypoint and the shader key using SHA1 to determine a unique identifier for the combination. A VkPipelineCache is then a hash table mapping these identifiers to the corresponding prog_data and kernel data.	2016-02-12 11:53:49 -08:00
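The cache described in commit `39a120aefe` can be sketched roughly as follows. This is a minimal illustration in Python, not the anv implementation: the names `pipeline_cache_key` and `PipelineCache` are hypothetical, and the length prefix on each hashed part is an assumption added here to keep the inputs unambiguous.

```python
import hashlib


def pipeline_cache_key(spirv, spec_constants, entrypoint, shader_key):
    """Fold every input that affects compilation into one SHA-1 digest."""
    h = hashlib.sha1()
    for part in (spirv, spec_constants, entrypoint.encode("utf-8"), shader_key):
        # Assumption: length-prefix each part so ("ab", "c") and ("a", "bc")
        # cannot collide by simple concatenation.
        h.update(len(part).to_bytes(8, "little"))
        h.update(part)
    return h.digest()


class PipelineCache:
    """Hash table mapping identifiers to (prog_data, kernel) pairs."""

    def __init__(self):
        self._entries = {}

    def lookup(self, key):
        return self._entries.get(key)

    def insert(self, key, prog_data, kernel):
        self._entries[key] = (prog_data, kernel)


cache = PipelineCache()
key = pipeline_cache_key(b"\x03\x02\x23\x07", b"", "main", b"key-bits")
if cache.lookup(key) is None:
    # A real driver would compile here; the values are placeholders.
    cache.insert(key, {"binding_count": 1}, b"kernel-bytes")
```

A second lookup with the same four inputs returns the stored pair, while changing any one input (for example the entrypoint name) produces a different key and therefore a miss.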
Chad Versace	03bea8fda7	anv/meta_blit: Remove references to clearing Long ago, the blit code used to handle clearing and blitting. - Fix any comments that refer to clearing. - Rename shader var 'attr' to 'tex_pos'. The name 'attr' is an artifact of the time when the shader was used for blitting as well as clearing.	2016-02-12 11:29:29 -08:00
Chad Versace	97b5a07378	anv/meta_blit: Coalesce glsl_vec4_type vars Just a refactor. No behavior change. Several expressions have the same value: they point to glsl_vec4_type(). Coalesce them into a single variable.	2016-02-12 11:29:29 -08:00
Jason Ekstrand	699f21216f	anv/device: clflush simple batches if !LLC	2016-02-12 11:00:42 -08:00
Jason Ekstrand	42155abdd7	anv: Add a clflush_range helper function	2016-02-12 11:00:08 -08:00
Jason Ekstrand	3c8dc1afd1	nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit	2016-02-12 10:40:39 -08:00
Chad Versace	37f4dfb19d	anv/meta: Move blit code to anv_meta_blit.c The clear code lived in anv_meta_clear.c. The resolve code in anv_meta_resolve.c. Only the blit code lived in anv_meta.c, alongside the shared meta code. This is just a copy-paste patch. No change in behavior.	2016-02-12 09:56:24 -08:00
Chad Versace	cf7fd53850	anv/meta: Hardcode smooth texcoord interpolation in blit shaders Trivial cleanup. No change in behavior. Function argument 'attr_flat', in anv_meta.c:build_nir_vertex_shader(), was always false.	2016-02-12 09:15:58 -08:00
Jose Fonseca	950da38164	mesa: Use _aligned_malloc/free for MinGW too. We already use these for gallium in src/gallium/auxiliary/os/os_memory_stdc.h and it's always better to minimize divergences between MinGW and MSVC. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 14:51:28 +00:00
Jose Fonseca	c69ef377c8	mesa: Remove support for MSVC2008. Spotted by Emil Velikov. Trivial.	2016-02-12 10:31:15 +00:00
Jose Fonseca	5bc8d34526	util/u_atomic: Remove MSVC 2008 support. Spotted by Emil Velikov. Trivial.	2016-02-12 10:31:15 +00:00
Topi Pohjolainen	30711d984f	i965: Stop considering if msrt aux buffers need aux buffer Auxiliary buffers are always created with sample number of zero which effectively prevents intel_miptree_create_layout() from trying to associate auxiliary buffers with auxiliary buffers. Now that there is more direct path available let's start using it instead and stop even checking for such (im)possibility. v2 (Ben): Do not signal msaa layout with explicit argument but using layout_flags instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-12 09:17:29 +02:00
Topi Pohjolainen	422b1386d7	i965: Separate miptree creation from auxiliary buffer setup Currently the logic allocating and setting up miptrees is closely combined with decision making when to re-allocate buffers in X-tiled layout and when to associate colors with auxiliary buffers. These auxiliary buffers are in turn also represented as miptrees and are created by the same miptree creation logic calling itself recursively. This means considering in vain if the auxiliary buffers should be represented in X-tiled layout or if they should be associated with auxiliary buffers again. While this is somewhat unnecessary, this doesn't impose any problems currently. Miptrees for auxiliary buffers are created as single-sampled fusing the consideration for multi-sampled compression auxiliary buffers. The format in turn is such that is not applicable for single-sampled fast clears (that would require accompanying auxiliary buffer). But once the driver starts to support lossless compression of color buffers the auxiliary buffer will have a format that would itself be applicable for lossless compression. This would be rather difficult and ugly to detect in the current miptree creation logic, and therefore this patch seeks to separate the association logic from the general allocation and setup steps. v2 (Ben): - Do not reconsider for X-tiling in intel_miptree_create() as it was just forced to Y-tiling in miptree_create(). - Do not drop checks for allocation failures. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	d089f2d932	i965: Isolate aligned dimensions for stencil only This makes the logic a little more explicit and helps to keep subsequent patches easier to read. Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	0dcd9a09d1	i965: Restore vbo after color resolve during brw_try_draw_prims() Part of brw_try_draw_prims() is a check to validate textures (brw_validate_textures()). In case of textures that currently have only level zero but are marked for mipmap generation, i965 driver will decide to replace the underlying buffer with a larger one capable of holding also the additional levels. This results into blit from the original buffer to the newly allocated (see intel_miptree_copy_teximage()). This blit is currently handled with blitter engine and hence it won't affect the ongoing draw operation. However, this blit in turn may trigger color resolve on the source buffer. In principle, this should be possible with fast cleared buffers but I only started hitting it when I enabled lossless compression (that requires similar resolve to fast cleared buffers). Now, the color resolve is a meta operation and uses the same drawing path we are already in middle of. After quite a bit of debugging I realized that the resolve will modify the current vbo setup but it won't restore it afterwards resulting in the original draw call using wrong vertex data. When brw_try_draw_prims() gets called, the vbo logic in the Mesa core (see vbo_draw_arrays()) has just bound the vbo (see vbo_bind_arrays() and recalculate_input_bindings()). Color resolve operation will overwrite the vbo setup by calling vbo_bind_arrays() against the resolve rectangle (see brw_draw_rectlist()). Once the color resolve is done the vbo setup is left to the resolve rectangle state and the original drawing call yields bogus results. This patch aims to restore the original state after the color resolve by calling vbo_bind_arrays() yet again after the vertex array state in the core context have been restored. Now having said all this, I'd also like to state that I'm quite uncomfortable with the nested meta operations. The original draw call in this case is in fact a meta operation itself. It is a blit from level zero to level one when generating the additional mipmap levels (see _mesa_meta_GenerateMipmap()). Imagine the complexity if the blit in the middle from buffer to another would go to meta path also instead of blitter. I would be very tempted to try to move all the resolves to happen before a meta operation is started. Additionally I still feel that work I did earlier in the spring/summer time moving meta operations to use direct state upload bypassing the core context would make sense. v2: Force input recalculation by setting the flag explicitly v3: Do not attempt to restore vbo for opengles1 which doesn't support vertex buffer objects. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	779429d063	i965: Validate textures before altering driver state Validation may kick off copies and subsequently color resolves. Color resolves (and the copies themselves if ending up in meta path) will overwrite the internal driver state but are not prepared to restore it. Instead of adding that capability the validation can be simply performed before the state is updated. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Kenneth Graunke	76f6f59c6e	i965: Make brw_clear_cache flag all the bits on both pipelines. Setting brw->ctx.NewDriverState and brw->ctx.NewGLState affects the dirty bits for the current pipeline. But, we need to flag everything dirty on both pipelines, so that when we switch back, we'll realize our programs are stale and re-upload them. To accomplish this, flag the saved state for both pipelines. Only one of them should matter, but this way we don't have to check which we need to set. It's harmless to set the other. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-11 22:53:19 -08:00
Samuel Iglesias Gonsálvez	61ceb36ead	glsl: Allow invariant qualifier in block members in desktop OpenGL. Feedback from Khronos is that 'invariant' should be allowed on block members for desktop OpenGL. Fix piglit regression added by `fe1e89a0`: invariant-qualifier-in-out-block-01.vert v2: - Allow it for in/out blocks in OpenGL ES too, so when OES_shader_io_blocks is supported we don't need to do any change (Timothy) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89330 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 07:20:47 +01:00
Jason Ekstrand	ea93041ccc	anv/device: Use a normal BO in submit_simple_batch	2016-02-11 21:39:15 -08:00
Jason Ekstrand	3a2b23a447	anv: Add a vk_icdGetInstanceProcAddr entrypoint Apparently there are some issues in symbol resolution if an application packages its own loader and you have a system-installed one. I don't really understand the details, but it's not onerous to add.	2016-02-11 21:20:12 -08:00
Kenneth Graunke	e9644cb1f9	i965: Consider tessellation in get_pipeline_state_l3_weights. I think this was just missed; Curro and I were probably writing code simultaneously and forgot to combine them at the end. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-11 19:15:17 -08:00
Kenneth Graunke	f275c61c30	i965: Split brw_upload_texture_surfaces into compute/render atoms. When uploading state for the compute pipeline, we don't want to look at VS/TCS/TES/GS/FS programs, as they might be stale, and aren't relevant anyway. Likewise, the render pipeline shouldn't look at CS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-11 19:15:08 -08:00
Jason Ekstrand	25b09d1b5d	anv/event: Use a 64-bit value The immediate write from PIPE_CONTROL is 64-bits at least on BDW. This used to work on 64-bit archs because the compiler would align the following anv_state struct up for us. However, in 32-bit builds, they overlap and it causes problems.	2016-02-11 19:00:56 -08:00
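The overlap described in commit `25b09d1b5d` can be reproduced with a small byte-level model. This is only a sketch: the offsets 4 and 8 and the helper name `neighbour_after_write` are illustrative assumptions, not the actual `anv_event` layout.

```python
import struct


def neighbour_after_write(next_field_offset):
    """Model a 32-bit event value at offset 0 followed by another field.

    The hardware-style immediate write stores 8 bytes at offset 0, so a
    neighbour placed at offset 4 is clobbered while one at offset 8 survives.
    """
    buf = bytearray(16)
    # The field that follows the event value in the containing struct.
    struct.pack_into("<I", buf, next_field_offset, 0xDEADBEEF)
    # 64-bit immediate write of the value 1, as PIPE_CONTROL would perform.
    struct.pack_into("<Q", buf, 0, 1)
    return struct.unpack_from("<I", buf, next_field_offset)[0]


aligned = neighbour_after_write(8)  # neighbour padded out to 8 bytes
packed = neighbour_after_write(4)   # neighbour directly after 4 bytes
```

In this model `aligned` keeps its original value and `packed` is overwritten with zero, which is why declaring the event value as a full 64-bit field removes the dependence on compiler padding.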
Marek Olšák	f3943614ff	radeonsi: fix build with LLVM 3.6 Broken by this cleanup: `3dc1cb0cc7` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-12 00:41:36 +01:00
Jason Ekstrand	3086c5a5e1	gen8/pipeline: Properly set bits in PS_EXTRA for W, depth, and sample mask	2016-02-11 15:22:18 -08:00
Jason Ekstrand	4016619931	nir/spirv: Allow the clip distance capability.	2016-02-11 15:14:46 -08:00
Jason Ekstrand	da4a6bbbea	gen8/pipeline: Pull gs_vertex_count from prog_data	2016-02-11 15:13:54 -08:00
Jason Ekstrand	ff8895ba56	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-11 15:09:30 -08:00
Jason Ekstrand	9f8c01b03c	i965/gs: Pass VerticesIn through prog_data Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Jason Ekstrand	56eb9c44ad	i965/fs: Pass usage of depth, W, and sample mask through prog_data We really need to stop pulling information directly out of shaders for state setup. For one thing, if we want any sort of an on-disk shader cache, having all of this metadata in one place is going to be crucial. Also, passing it all through prog_data cleans up the compiler <-> state setup API substantially. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Jason Ekstrand	ae3543950c	i965/fs: Refactor setup_payload_gen6 to assume FS It's extremely FS specific so the fact that we have a stage check in the middle of it is rather bogus. While we're here, we rename setup_payload_gen4 and setup_payload_gen6 to make it obvious that they are both FS specific. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Samuel Pitoiset	d759f0ddf1	nv50,nvc0: remove unused parameter in nvXX_state_validate() This 'words' parameter is there since 2011 but it has never been used. While we are at it, get rid of the extern declaration. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-11 23:14:16 +01:00
Timothy Arceri	b600247035	glsl: don't validate interface blocks twice We already check for opaque types so don't recheck for atomics and images. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-12 09:12:23 +11:00
Timothy Arceri	98d3cc9fbc	glsl: remove duplicate embedded struct validation Commit `c98deb18d5` in 2010 disallowed embedded struct definitions in ES. Then in 2013 `d9bb8b7b56` disallowed it for everything but GLSL 1.10. Commit `c98deb18d5` seemed the cleanest way to do the check so it's been extended to cover GL and the other version has been removed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-12 09:06:49 +11:00
Jose Fonseca	0d4898ae80	include,gallium: Remove pre-MSVC 2013 compatibility. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-11 21:36:00 +00:00
Jose Fonseca	a97a955b92	scons: Eliminate MSVC2008 compatibility. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-11 21:36:00 +00:00
Jose Fonseca	1cadfe08c4	configure: Eliminate MSVC2008 compatibility. We no longer need to build any part of Mesa with Windows SDK 7.0.7600 or MSVC 2008. MSVC 2013 will be the oldest we support. In practice this means people are now free to declare variables in the middle of blocks, on the whole Mesa tree. Care should still be taken with variable length arrays and void pointer arithmetic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Hella-acked-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-11 21:36:00 +00:00
Chris Forbes	a2c8b5ece5	i965: ir: dump floats as %-g rather than %f, so we can see denormals Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-11 12:10:29 -08:00
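The motivation for commit `a2c8b5ece5` is easy to see with printf-style formatting. The sketch below uses Python's `%` operator, which follows the same conversion rules for `%f` and `%-g`; the helper `as_float32` is a hypothetical name used here to round a value to single precision.

```python
import struct


def as_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 1e-40 is below the smallest normal float32, so it is stored as a denormal.
denormal = as_float32(1e-40)

fixed = "%f" % denormal     # six decimal places: the value collapses to zero
general = "%-g" % denormal  # shortest exponent form keeps the magnitude
```

Here `fixed` is the string `0.000000`, whereas `general` is printed in exponent notation, so a denormal immediate remains distinguishable from a true zero in the IR dump.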
Jordan Justen	9f36070c2f	i965/gen7: Require kernel cmd_parser 5 for ARB_compute_shader The indirect dispatch registers were whitelisted in command parser version 5. (Version 5 is available as of Linux 4.4) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 10:49:13 -08:00
Marek Olšák	a8aa73f768	st/mesa: release GLSL IR in LinkShader after it's not needed Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-11 17:31:40 +01:00
Marek Olšák	906ecab450	mesa: call build_program_resource_list inside Driver.LinkShader to allow LinkShader to free the GLSL IR. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-11 16:56:28 +01:00
Marek Olšák	0f235c960c	st/mesa: use correct pipe functions to create tess shaders Broken by one of my cleanups. Spotted by luck. Radeonsi doesn't care, because all shader create callbacks go to the same function. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-11 16:56:28 +01:00
Marek Olšák	100796c15c	gallium/radeon: drop support for LLVM 3.5 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> v2: adjust the comment in the amdgpu winsys	2016-02-11 16:48:30 +01:00
Marek Olšák	3dc1cb0cc7	radeonsi: obtain commonly used LLVM types only once Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-11 16:48:30 +01:00
Marek Olšák	1643dca513	radeonsi: cleanup shader codegen si_shader_ctx -> ctx type * ptr -> type ptr si_shader_context shader -> si_shader_context *ctx Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-11 16:48:30 +01:00
Marek Olšák	1c8a1a8fed	radeonsi: fix a crash when binding a sampler buffer Buffers don't contain r600_texture. Broken by `7aedbbacae`: "radeonsi: put image, fmask, and sampler descriptors into one array" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94091	2016-02-11 16:48:30 +01:00
Kristian Høgsberg Kristensen	2009e304f7	anv/pack: Handle case where a struct field covers multiple dwords We also didn't add start to field.end to get the absolute field end position.	2016-02-10 22:36:38 -08:00
Emil Velikov	0f3cea95ab	docs: add news item and link release notes for 11.1.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-11 01:47:16 +00:00
Emil Velikov	0802afd92d	docs: add sha256 checksums for 11.1.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e49dd21bcb`)	2016-02-11 01:45:27 +00:00
Emil Velikov	323782aa57	docs: add release notes for 11.1.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `7bcd827806`)	2016-02-11 01:45:25 +00:00
Jason Ekstrand	f710f3ca37	Merge remote-tracking branch 'mesa-public/master' into vulkan This also reverts commit `1d65abfa58` because now NIR handles texture offsets in a much more sane way.	2016-02-10 17:12:11 -08:00
Jason Ekstrand	7ef3e47c27	Merge commit '85f5c18fef1ff2f19d698f150e23a02acd6f59b9' into vulkan	2016-02-10 17:09:56 -08:00
Kristian Høgsberg Kristensen	d2623a3247	anv: Handle dwords that are all MBZ correctly A few packets have dwords in them that are all MBZ and we failed to write those. This change makes sure we iterate through all dwords and write them all.	2016-02-10 16:36:47 -08:00
Jason Ekstrand	8750299a42	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>	2016-02-10 16:33:50 -08:00
Jason Ekstrand	70dff4a55e	nir/lower_vec_to_movs: Better report channels handled by insert_mov This fixes two issues. First, we had a use-after-free in the case where the instruction got deleted and we tried to return mov->dest.write_mask. Second, in the case where we are doing a self-mov of a register, we delete those channels that are moved to themselves from the write-mask. This means that those channels aren't reported as being handled even though they are. We now stash off the write-mask before removing unneeded channels so that they still get reported as handled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-10 16:33:14 -08:00
Kristian Høgsberg Kristensen	09bb7ea4b7	anv: Fix out-of-tree build We need to be able to find the generated nir_opcodes.h header.	2016-02-10 15:54:28 -08:00
Kristian Høgsberg Kristensen	9cc939d82f	nir: Fix out-of-tree build for spirv2nir This needs to be able to find the generated nir_opcodes.h header.	2016-02-10 15:54:28 -08:00
Jason Ekstrand	9be5a4bc29	nir/spirv: Fix handling of OpGroupMemberDecorate We were pulling the member index from the wrong dword	2016-02-10 15:36:42 -08:00
Jason Ekstrand	ac04c6de2c	nir/spirv: Assert that struct member ids are in-bounds	2016-02-10 15:36:41 -08:00
Marek Olšák	6ee1c386fe	radeonsi: don't emit unnecessary NULL exports for unbound targets (v3) v2: remove semantic index == 0 checks add the else statement to remove shadowing of args v3: fix fbo-alphatest-nocolor regression Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)	2016-02-10 23:53:17 +01:00
Mark Janes	8179834030	nir/spirv: fix build_mat_subdet stack smasher The sub-determinant implementation pattern fixed by `6a7e2904e0` has a second instance in the same file. With the previous algorithm, when row and j are both 3, the index overruns the array. This only impacts the stack on 32 bit builds. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:43:03 -08:00
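The overrun in commit `8179834030` comes from building a list of "every index except `row`" into an array with one slot fewer than the matrix size. The sketch below is a hypothetical reconstruction of that pattern for illustration, not the Mesa source; both function names are invented, and Python raises `IndexError` where the C code would silently write past the array.

```python
def skip_row_buggy(row, size=4):
    """Loop over all `size` indices and shift those past `row` down by one."""
    swiz = [None] * (size - 1)
    for j in range(size):
        # When j == row == size - 1 nothing is shifted, so this writes to
        # index size - 1, one past the end of the (size - 1)-element array.
        swiz[j - (1 if j > row else 0)] = j
    return swiz


def skip_row_fixed(row, size=4):
    """Loop over the size - 1 output slots and skip `row` on the input side."""
    return [j + (1 if j >= row else 0) for j in range(size - 1)]


try:
    skip_row_buggy(3)
    overran = False
except IndexError:
    overran = True
```

For rows 0 through 2 the two versions agree; only the last row triggers the out-of-bounds write, which matches the "row and j are both 3" case in the commit message.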
Kristian Høgsberg Kristensen	51c01e292c	anv: Generate pack headers from XML definition This huge commit switches us over to using a simple xml format (genxml) for defining our command streamer commands and a python script for generating the pack headers we use in the driver.	2016-02-10 14:31:26 -08:00
Ben Widawsky	088280e022	i965: Make sure we blit a full compressed block This fixes an assertion failure in [at least] one of the Unreal Engine Linux demo/games that uses DXT1 compression. Specifically, the "Vehicle Game". At some point, the game ends up trying to blit a mip level whose size is 2x2, which is smaller than a DXT1 block. As a result, the assertion in the blit path is triggered. It should be safe to simply make sure we align the width and height, which is sadly an example of compression being less efficient. NOTE: The demo seems to work fine without the assert, and therefore release builds of mesa wouldn't stumble over this. Perhaps there is some unnoticeable corruption, but I had trouble spotting it. Thanks to Jason for looking at my backtrace and figuring out what was going on. v2: Use NPOT alignment to make sure ASTC is handled properly (Ilia) Remove comment about how this doesn't fix other bugs, because it does. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93358 Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:08:46 -08:00
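The alignment in commit `088280e022` amounts to rounding the blit extent up to a whole number of compressed blocks. A minimal sketch, assuming hypothetical helper names `align_npot` and `blit_extent`; the division-based rounding is used because ASTC block sizes such as 5x5 are not powers of two, so a bit-mask alignment would be wrong for them.

```python
def align_npot(value, alignment):
    """Round `value` up to a multiple of `alignment` (any positive integer)."""
    return -(-value // alignment) * alignment


def blit_extent(width, height, block_w, block_h):
    """Return the width and height padded to full compressed blocks."""
    return align_npot(width, block_w), align_npot(height, block_h)


dxt1_small_mip = blit_extent(2, 2, 4, 4)  # 2x2 mip level, 4x4 DXT1 blocks
astc_5x5 = blit_extent(7, 7, 5, 5)        # non-power-of-two block size
```

The 2x2 mip level from the commit message becomes a 4x4 blit, and a 7x7 region with 5x5 blocks becomes 10x10.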
Marek Olšák	79d0082c64	radeon/uvd: silence a warning	2016-02-10 20:16:17 +01:00
Marek Olšák	d9c8a8fe61	r300g: silence warnings	2016-02-10 20:16:17 +01:00
Ian Romanick	0ecc9d907e	meta/decompress: Don't pollute the renderbuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Fixes piglit 'object-namespace-pollution glGetTexImage-compressed renderbuffer' test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:55 -08:00
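The problem scenario in commit `0ecc9d907e` can be replayed with a toy model of one object namespace. Everything here is illustrative: `NameTable` and `run_scenario` are hypothetical names, and "Gen returns the lowest unused name" is an assumption that matches the commit's remark that the first Gen will probably return 1.

```python
class NameTable:
    """Toy model of a single GL object namespace, e.g. renderbuffers."""

    def __init__(self):
        self.objects = {}

    def gen(self):
        # Stand-in for a glGen* call: hand out the lowest unused name.
        name = 1
        while name in self.objects:
            name += 1
        self.objects[name] = None
        return name

    def store(self, name, data):
        # Using a name that was never generated is allowed in this model,
        # mirroring OpenGL ES and the compatibility profile.
        self.objects[name] = data


def run_scenario():
    table = NameTable()
    meta_name = table.gen()                 # meta generates a name
    table.store(meta_name, "meta scratch")
    table.store(1, "application data")      # app uses name 1 without Gen
    table.store(meta_name, "meta scratch")  # meta runs again
    return meta_name, table.objects[1]


meta_name, surviving = run_scenario()
```

After the second meta call the object behind name 1 holds the meta data again and the application's contents are gone, which is why the fix keeps meta's renderbuffer out of the shared namespace entirely.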
Ian Romanick	3aeff21fbf	meta: Use internal functions for renderbuffer access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:53 -08:00
Ian Romanick	4087c17832	meta/decompress: Track renderbuffer using gl_renderbuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:50 -08:00
Ian Romanick	47a5aa4bfa	i965/meta: Don't pollute the renderbuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:47 -08:00
Ian Romanick	03506c9ef1	i965/meta: Use internal functions for renderbuffer access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:44 -08:00
Ian Romanick	4c6b0e017c	i965/meta: Return struct gl_renderbuffer* from brw_get_rb_for_slice instead of GL API handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:42 -08:00
Ian Romanick	ab2b631703	meta: Don't save or restore the renderbuffer binding Nothing left in meta does anything with the RBO binding, so we don't need to save or restore it. The FBO binding is still modified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:40 -08:00
Ian Romanick	e273bbd60b	meta: Use _mesa_CreateRenderbuffers instead of _mesa_GenRenderbuffers and _mesa_BindRenderbuffer This has the advantage that it does not pollute the global binding state. It also enables later patches that will stop calling _mesa_GenRenderbuffers / _mesa_CreateRenderbuffers which pollute the renderbuffer namespace. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:36 -08:00
Ian Romanick	1e055e9211	i965/meta: Use _mesa_CreateRenderbuffers instead of _mesa_GenRenderbuffers and _mesa_BindRenderbuffer This has the advantage that it does not pollute the global binding state. It also enables later patches that will stop calling _mesa_GenRenderbuffers / _mesa_CreateRenderbuffers which pollute the renderbuffer namespace. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:33 -08:00
Ian Romanick	eb5bc62e97	mesa: Refactor renderbuffer_storage to make _mesa_renderbuffer_storage Pulls the parts of renderbuffer_storage that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:31 -08:00
Ian Romanick	9ae42ab1ec	mesa: Refactor _mesa_framebuffer_renderbuffer This function previously was only used in fbobject.c and contained a bunch of API validation. Split the function into framebuffer_renderbuffer that is static and contains the validation, and _mesa_framebuffer_renderbuffer that is suitable for calling from elsewhere in Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:28 -08:00
Marek Olšák	7aedbbacae	radeonsi: put image, fmask, and sampler descriptors into one array The texture slot is expanded to 16 dwords containing 2 descriptors. Those can be: - Image and fmask, or - Image and sampler state By carefully choosing the locations, we can put all three into one slot, with the fmask and sampler state being mutually exclusive. This improves shaders in 2 ways: - 2 user SGPRs are unused, shaders can use them as temporary registers now - each pair of descriptors is always on the same cache line v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask at the same time Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-10 19:41:49 +01:00
Marek Olšák	796ee76e2e	winsys/radeon: fix the num_tile_pipes comment to silence warnings	2016-02-10 19:41:49 +01:00
Alexandre Demers	111602e159	winsys/radeon: better explain the num_tile_pipes fixup for TAHITI (v2) v2: Clarify the relation between num_tiles_pipes and GB_TILE_MODE and the fix needed for Tahiti as suggested by Marek. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-10 19:29:41 +01:00
Samuel Pitoiset	5e8db898fd	st/mesa: check ureg_create() retval in create_pbo_upload_vs() This avoids a possible NULL dereference because ureg_create() might return a NULL pointer. Spotted by coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-10 18:26:20 +01:00
Bernhard Rosenkränzer	e86ba7844f	freedreno/ir3: Get rid of nested functions This allows building Freedreno with clang Signed-off-by: Bernhard Rosenkränzer <bero@linaro.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-10 11:26:48 -05:00
Chris Forbes	43d23e879c	i965/blorp: Fix hiz ops on MSAA surfaces Two things were broken here: - The depth/stencil surface dimensions were broken for MSAA. - Sample count was programmed incorrectly. Result was the depth resolve didn't work correctly on MSAA surfaces, and so sampling the surface later produced garbage. Fixes the new piglit test arb_texture_multisample-sample-depth, and various artifacts in 'tesseract' with msaa=4 glineardepth=0. Fixes freedesktop bug #76396. Not observed any piglit regressions on Haswell. v2: Just set brw_hiz_op_params::dst.num_samples rather than adding a helper function (Ken). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> v3: moved the alignment needed for hiz+msaa to brw_blorp.cpp, as suggested by Chad Versace (Alejandro Piñeiro on behalf of Chris Forbes) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-10 09:00:05 +01:00
Topi Pohjolainen	878b2b8964	i965/gen8: Remove dead assertion The assertion is inside a condition mandating num_samples > 1 and therefore the first half of the constraint is always met. The second half in turn would only be applicable for single sampled case and moreover it is trying to falsely check against surface type instead of format. Subsequent patches will introduce proper support for the lossless compression and dropping this here makes the patches a little simpler. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-10 09:11:34 +02:00
Topi Pohjolainen	3c432d48bf	i965: Use constant pointer when checking for compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-10 09:10:45 +02:00
Brian Paul	85fab1f09a	mesa: fix trivial comment typo in dlist.c	2016-02-09 20:09:30 -07:00
Kenneth Graunke	85f5c18fef	i965/vec4: Drop support for ATTR as an instruction destination. This is no longer necessary...and it doesn't make much sense to have inputs as destinations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Kenneth Graunke	67c5d00273	i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. gl_PointSize is delivered in the .w component of the VUE header, while the language expects it to be a float (and thus in the .x component). Previously, we emitted MOVs to copy it over to the .x component. But this is silly - we can just use a .wwww swizzle and access it without copying anything or clobbering the value stored at .x (which admittedly is useless). Removes the last use of ATTR destinations. v2: Use BRW_SWIZZLE_WWWW, not SWIZZLE_WWWW (caught by GCC). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Kenneth Graunke	d56ae2d160	i965: Apply VS attribute workarounds in NIR. This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Jason Ekstrand	09b3e30dc6	anv: Fix up spirv for new texture/sampler split stuff	2016-02-09 16:48:36 -08:00
Brian Paul	cac54d7987	st/mesa: clarify some texture target code in st_cb_drawpix.c Use st->internal_target instead of PIPE_TEXTURE_2D when choosing the texture format. Probably no real difference, but let's be consistent. Simplify a test when determining whether we need normalized texcoords. Add a new assertion. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:26 -07:00
Brian Paul	5e4de781fa	st/mesa: fix bitmap texture target code and simplify tex sampler state Bitmaps may be drawn with a PIPE_TEXTURE_2D or PIPE_TEXTURE_RECT resource as determined at context creation by checking if PIPE_CAP_NPOT_TEXTURES is supported. But many places in the bitmap code were hard-coded to use PIPE_TEXTURE_2D. Use st->internal_target instead. I think an older NV chip is the only case where a gallium driver does not support NPOT textures. Bitmap drawing was probably broken for that GPU. Also, we only need one sampler state with texcoord normalization set up according to st->internal_target. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:25 -07:00
Brian Paul	9e2a9d5743	st/mesa: use MAX3() macro, as we do for sampler view code below Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:25 -07:00
Brian Paul	a5b8ede253	st/mesa: move some st_cb_drawpixels.c code, add comments	2016-02-09 17:47:42 -07:00
Jason Ekstrand	b14f4c1fd3	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the separate texture/sampler stuff from upstream	2016-02-09 16:47:37 -08:00
Jason Ekstrand	e01dd59b73	vtn: Use const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	e15f7551d1	anv/apply_pipeline_layout: Use the new const_index helpers	2016-02-09 16:32:38 -08:00
Jason Ekstrand	768bd7f272	Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan This pulls in Rob Clark's const_index changes for NIR	2016-02-09 15:30:39 -08:00
Nanley Chery	c624241ef4	mesa/readpix: Dedent former _mesa_readpixels() if block Formatting patch split out for easy reviewing. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	b89a8a15c2	mesa/readpix: Don't clip in _mesa_readpixels() The clipping is performed higher up in the call-chain. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	605832736a	mesa/readpix: Clip ReadPixels() area to the ReadBuffer's The fast path for Intel's ReadPixels() unintentionally omits clipping the specified area to a valid one. Rather than clip in various corner-cases, perform this operation in the API validation stage. The bug in intel_readpixels_tiled_memcpy() showed itself when the winsys ReadBuffer's height was smaller than the one specified by ReadPixels(). yoffset became negative, which was an invalid input for tiled_to_linear(). v2: Move clipping to validation stage (Jason) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193 Reported-by: Marta Löfstedt <marta.lofstedt@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	55d56d34e0	mesa/image: Make _mesa_clip_readpixels() work with renderbuffers v2: Use gl_renderbuffer::{Width,Height} (Jason) Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Jason Ekstrand	d03e5d5255	i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	f88027f7bd	i965/vec4: Separate the sampler from the surface in generate_tex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	b8ab9c8c86	i965/fs: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	c0c14de130	i965/fs: Separate the sampler from the surface in generate_tex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	a37b8110c1	i965/fs: Add an enum for keeping track of texture instruction sources These logical texture instructions can have a lot of sources. It's much safer if we have symbolic names for them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	5ec456375e	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and only look at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	ee85014b90	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	3f42184994	nir: Add some braces around loops and ifs	2016-02-09 15:00:17 -08:00
Kenneth Graunke	830b075e86	i965: Explicitly write the "TR DS Cache Disable" bit at TCS EOT. Bit 0 of the Patch Header is "TR DS Cache Disable". Setting that bit disables the DS Cache for tessellator-output topologies resulting in stitch-transition regions (but leaves it enabled for other cases). We probably shouldn't leave this to chance - the URB could contain garbage - which could result in the cache randomly being turned on or off. This patch makes the final EOT write 0 to the first DWord (which only contains this one bit). This ensures the cache is always on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-09 14:54:26 -08:00
Rob Clark	8b0fb1c152	freedreno/ir3: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-09 17:30:33 -05:00
Rob Clark	ced8d3e773	nir: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	6921762de6	ptn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	ead05e8670	ttn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	b1770235ed	ttn: small logic cleanup The only case where dim!=NULL is where op==load_ubo. But using op==load_ubo is less confusing. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-09 17:30:33 -05:00
Rob Clark	b6cf98bc82	gtn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	1df3ecc1b8	nir: const_index helpers Direct access to intr->const_index[n], where different slots have different meanings, is somewhat confusing. Instead, let's put some extra info in nir_intrinsic_infos[] about which slots map to what, and add some get/set helpers. The helpers validate that the field being accessed (base/writemask/etc) is applicable for the intrinsic opc, for some extra safety. And nir_print can use this to dump out decoded const_index fields. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Chad Versace	4c5dcccfba	anv/image: Fix usage for depthstencil images The tests assertion-failed in vkCmdClearDepthStencilImage because the isl surface lacked ISL_SURF_USAGE_DEPTH_BIT. Fixes: https://gitlab.khronos.org/vulkan/mesa/issues/26 Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.host_stage_with_clear_depth_stencil_image_method Fixes: dEQP-VK.pipeline.timestamp.transfer_tests.transfer_stage_with_clear_depth_stencil_image_method	2016-02-09 12:54:30 -08:00
Chad Versace	c5e521f391	anv/image: Refactor choose_isl_surf_usage() - Rename local var isl_flags -> isl_usage. - Fix comment.	2016-02-09 12:54:30 -08:00
Chad Versace	2f4bb00c2b	anv/image: Fix choose_isl_surf_usage() Don't translate VkImageCreateInfo::usage into an isl_surf_usage bitmask. Instead, translate anv_image::usage, which is a superset of VkImageCreateInfo::usage. For-Issue: https://gitlab.khronos.org/vulkan/mesa/issues/26	2016-02-09 12:54:04 -08:00
Kenneth Graunke	8b0f6de73d	glsl: Disallow transform feedback varyings with compute shaders. If the only stage is MESA_SHADER_COMPUTE, we should complain that there's nothing coming out of the geometry shader stage just as we would if the first stage were MESA_SHADER_FRAGMENT. Also, it's valid for tessellation shaders to be the stage producing transform feedback varyings, so mention those in the compiler error. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-09 12:34:11 -08:00
Marek Olšák	329181ae33	radeonsi: enable denorms for 64-bit and 16-bit floats This fixes FP16 conversion instructions for VI, which has 16-bit floats, but not SI & CI, which can't disable denorms for those instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	17fe3fa312	gallium: pass the robust buffer access context flag to drivers radeonsi will not do bounds checking for loads if this is not set. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	d611fce23d	gallium/radeon: add a function for adding llvm function attributes This will be used for setting the new InitialPSInputAddr attribute. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	de2e28366a	radeonsi: compile geometry shaders immediately they have only 1 variant Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	f7a8b6fff5	radeonsi: split out code for deleting si_shader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	e21142087c	radeonsi: move code writing tess factors into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	dc5fc3c2f6	radeonsi: make LLVM IR dumping less messy Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c1041366db	radeonsi: move a few r600_can_dump_shader calls to where they're needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b6d5666fbf	radeonsi: remove useless code that handles dx10_clamp_mode "enable-no-nans-fp-math" is a wrong string and there was a disagreement about fixing it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	57271d5364	radeonsi: dump SPI_PS_INPUT values along with shader stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	5a53628f45	radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	9483fcc7f2	radeonsi: don't force gl_SampleMaskIn to 1 for smoothing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c379c2540b	radeonsi: split PS input interpolation code into its own function This will be used by the fragment shader prolog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b9126dcda8	radeonsi: implement forcing per-sample_interpolation using the shader key only It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog. This also removes the spi_ps_input states and moves the registers to the PS state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4596f3c1b8	radeonsi: remove si_shader::ps_input_interpolate tgsi_shader_info has this too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	6dda2455c8	radeonsi: move BCOLOR PS input locations after all other inputs BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs were offset by 1 if color_two_side was enabled, and not offset if it was not enabled, which is a variation that's problematic if we want to have 1 variant per shader and the variant doesn't care about color_two_side (that should be handled by other bytecode attached at the beginning). Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location "num_inputs" if it's present. BCOLOR1 is next. This also allows removing si_shader::nparam and si_shader::ps_input_param_offset, which are useless now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	606e4185f3	radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	90cbbe1c12	radeonsi: generate a color_two_side variant only if the shader reads colors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4bbbaaf191	radeonsi: move si_shader_context initialization into a separate function This will be re-used later. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	a3e9a5f9f8	st/mesa: remove st_is_program_native The default scenario sets GL_TRUE too. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	7046c588eb	st/mesa: unify destroy_program_variants cases for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	75be3ee9f9	st/mesa: unify get_variant functions for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	b8d31fdedf	st/mesa: unify variants and delete functions for TCS, TES, GS no difference between those Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Chad Versace	bdab29a312	isl: Add more assertions to isl_surf_get_depth_format() R16_UNORM and R32_FLOAT are illegal formats for interleaved depthstencil surfaces.	2016-02-09 11:40:08 -08:00
Jason Ekstrand	1d65abfa58	nir/spirv: Better handle constant offsets in texture lookups	2016-02-09 10:29:05 -08:00
Jason Ekstrand	209820739b	nir/spirv: Set the vtn_mode and interface type for sampler parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	de6c9c5f2e	nir/inline_functions: Don't shadow variables when it isn't needed Previously, in order to get things working, we just always shadowed variables. Now, we rewrite derefs whenever it's safe to do so and only shadow if we have an in or out variable that we write to or read from, respectively.	2016-02-09 10:29:05 -08:00
Jason Ekstrand	b6c00bfb03	nir: Rework function parameters	2016-02-09 10:29:05 -08:00
Jason Ekstrand	a485567d3a	anv/WSI/X11: Use the right allocator for freeing swapchains	2016-02-09 10:29:05 -08:00
Brian Paul	fe14110f35	mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT Ilia Mirkin found/fixed the mistake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813 Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 11:27:48 -07:00
Brian Paul	0193e20df5	mesa: rewrite save_CallLists() code When glCallLists() is compiled into a display list, preserve the call as a single glCallLists rather than 'n' glCallList calls. This will matter for an upcoming display list optimization project. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	711d5347cf	mesa: add missing error check in _mesa_CallLists() Generate GL_INVALID_VALUE if n < 0. Return early if n==0 or lists==NULL. v2: fix formatting, also check for lists==NULL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	b1ddc03633	mesa: whitespace clean-ups in dlist.h And remove 'extern' qualifiers.	2016-02-09 11:27:48 -07:00
Brian Paul	7d18faf8e7	st/mesa: don't allocate bitmap drawing state until needed Most apps don't use glBitmap so don't allocate the bitmap cache or gallium state objects/shaders/etc until the first call to st_Bitmap(). v2: simplify a conditional, per Gustaw Smolarczyk. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	a5799de3dc	st/mesa: move the setup_bitmap_vertex_data() code into draw_bitmap_quad() Now all the code to setup the vertex data and draw it is in one place. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	130d34ce65	st/mesa: refactor some bitmap drawing code Move setup/restoration of rendering state into helper functions. This makes the draw_bitmap_quad() function much more concise. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:47 -07:00
Chad Versace	e6d3432c81	anv: Replace anv_format::depth_format with ::has_depth isl now understands depth formats. We no longer need depth formats in the anv_format table.	2016-02-09 10:02:50 -08:00
Chad Versace	0a93067993	isl: Add func isl_surf_get_depth_format() For depth surfaces, it gets the value for 3DSTATE_DEPTH_BUFFER.SurfaceFormat.	2016-02-09 10:02:50 -08:00
Chad Versace	4d037b551e	anv: Rename anv_format::surface_format -> isl_format Because that's what it is, an isl format.	2016-02-09 10:02:50 -08:00
Ilia Mirkin	922be4eab9	mesa: remove hack to fix up GL_ANY_SAMPLES_PASSED results Both st/mesa and i965 should return a true/false result now, and the only other driver implementing queries (radeon) doesn't support ARB_occlusion_query2 which added that pname. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	7aca4bb9b1	st/mesa: make use of the occlusion predicate query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	50235ab3ab	nv50: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0cb1dda36e	nv30: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0d04ec2fd2	ilo: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2016-02-09 11:59:27 -05:00
Nicolai Hähnle	c260175677	draw: use util_pstipple_* function for stipple pattern textures and samplers This reduces code duplication. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:57 -05:00
Nicolai Hähnle	452e51bf1e	draw: use util_pstipple_create_fragment_shader This reduces code duplication. It also adds support for drivers where the fragment position is a system value. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:32 -05:00
Marek Olšák	83b4d701c0	winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94019 Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-09 15:26:40 +01:00
Timothy Arceri	1aae5e8ced	nir: remove unused nir_variable fields These are used in GLSL IR to remove unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:49:06 +11:00
Timothy Arceri	6235b69134	glsl: remove unrequired forward declaration This was added in `2548092ad8` although I don't see why as it was already in the linker.h header. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:48:55 +11:00
Timothy Arceri	9dd6a4ea79	glsl: clean up and fix bug in varying linking rules The existing code was very hard to follow and has been the source of at least 3 bugs in the past year. The existing code also has a bug for SSO where if we have a multi-stage SSO for example a tes -> gs program, if we try to use transform feedback with gs the existing code would look for the transform feedback varyings in the tes stage and fail as it can't find them. V2: Add more code comments, always try to remove unused inputs to the first stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:22 +11:00
Timothy Arceri	fd0b89ad8d	glsl: simplify ES Vertex/Fragment shader requirements We really just needed to skip the existing ES < 3.1 check if we have a compute shader, all other scenarios are already covered. * No shaders is a link error. * Geom or Tess without Vertex is a link error which means we always require a Vertex shader and hence a Fragment shader. * Finally a Compute shader linked with any other stage is a link error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:15 +11:00
Timothy Arceri	55fa3c44bc	glsl: simplify required stages for linking rules Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:11 +11:00
Timothy Arceri	20823992b4	glsl: small tidy up now that link_shaders() exits early with 0 shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:07 +11:00
Timothy Arceri	76cfb47207	glsl: don't attempt to link empty program Previously an empty program would go through the entire link_shaders() function and we would have to be careful not to cause a segfault. In core profile also now set link_status to false by generating an error; it was previously set to true. From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec: "Linking can fail for a variety of reasons as specified in the OpenGL Shading Language Specification, as well as any of the following reasons: - No shader objects are attached to program." V2: Only generate an error in core profile and add spec quote (Ian) V3: generate error in ES too, remove previous check which was only applying the rule to GL 4.5/ES 3.1 and above. My understanding is that this spec change is clarifying previously undefined behaviour and therefore should be applied retrospectively. The ES CTS tests for this are in ES 2; I suspect it was passing because it would have generated an error for not having both a vertex and fragment shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:02 +11:00
Matt Turner	371c4b3c48	nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a lot of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple for instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 21:20:58 -08:00
Matt Turner	2d0d9755da	nir: Handle large unsigned values in opt_algebraic. The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>	2016-02-08 20:38:17 -08:00
Matt Turner	7be8d07732	nir: Do opt_algebraic in reverse order. Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	a8f0960816	nir: Recognize product of open-coded pow()s. Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	9f02e3ab03	nir: Add opt_algebraic rules for xor with zero. instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Timothy Arceri	3fd4280759	glsl: validate arrays of arrays on empty type declarations Fixes: dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 13:52:52 +11:00
Kenneth Graunke	74f956c416	i965: Use nir_lower_load_const_to_scalar(). I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-08 18:10:34 -08:00
Timothy Arceri	184afd8fd9	mesa: remove now unused sampler index handling code Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 12:03:02 +11:00
Timothy Arceri	edc108765e	mesa: compute sampler index in ir_to_mesa rather than using UniformHash The aim of this is to work towards removing UniformHash from the program struct so that we don't need to hold onto it in memory and pass it around outside the linker. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 12:02:58 +11:00
Kenneth Graunke	d0e1d6b7e2	i965: Don't add barrier deps for FB write messages. There are never render target reads, so there are no scheduling hazards. Giving the extra flexibility to the scheduler makes it possible to do FB writes as soon as their sources are available, reducing register pressure. It also makes it possible to do the payload setup for more than one FB write message at a time, which could better hide latency. shader-db results on Skylake: total instructions in shared programs: 9110254 -> 9110211 (-0.00%) instructions in affected programs: 2898 -> 2855 (-1.48%) helped: 3 HURT: 0 LOST: 0 GAINED: 1 A reduction in instruction counts is surprising, but legitimate: the three shaders helped were spilling, and reducing register pressure allowed us to issue fewer spills/fills. total cycles in shared programs: 69035108 -> 68928820 (-0.15%) cycles in affected programs: 4412402 -> 4306114 (-2.41%) helped: 4457 HURT: 213 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-08 16:59:35 -08:00
Dave Airlie	6502b3f60e	st/mesa: enable AoA for gallium drivers reporting GLSL 1.30 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:09 +10:00
Dave Airlie	b74e8c89a6	st/mesa: add atomic AoA support reuse the sampler deref handling code to do the same thing for atomics. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:09 +10:00
Dave Airlie	90bbe3d781	mesa: drop unused nonconst sampler functions. Since we fixed the glsl->tgsi conversion we no longer need this function. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Dave Airlie	bb8bbe34e3	st/mesa: handle indirect samplers in arrays/structs properly (v4.1) The state tracker never handled this properly, and it finally annoyed me for the second time so I decided to fix it properly. This is inspired by the NIR sampler lowering code and I only realised NIR seems to do its deref ordering different to GLSL at the last minute; once I got that, things got much easier. It fixes a bunch of tests in tests/spec/arb_gpu_shader5/execution/sampler_array_indexing/ v2: fix AoA tests when forced on. I was right I didn't need all that code, fixing the AoA code meant cleaning up a chunk of code I didn't like in the array handling. v3: start generalising the code a bit more for atomics. v3.1: use UniformRemapTable v4: handle uniforms differently using the param_index, and go back to UniformStorage fix issues identified by Timothy with deref handling. v4.1: squash const fix and move handling 1D const out of recursive function. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Dave Airlie	52801766a0	glsl/ir: add param index to variable. We have a requirement to store the index into the mesa parameterlist for uniforms. Up until now we've overwritten var->data.location with this info. However this then stops us accessing UniformStorage, which is needed to do proper dereferencing. Add a new variable to ir_variable to store this value in, and change the two uses to use it correctly. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Francisco Jerez	53739fddc6	i965: Rename define for the PIPE_CONTROL DC flush bit. Its previous name was somewhat misleading, this really behaves like a RW cache flush rather than an invalidation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:48:00 -08:00
Francisco Jerez	10d84ba9f0	i965: Invalidate state cache before L3 partitioning set-up. The state cache is also L3-backed so it seems sensible to make sure it's clean as we do for other RO caches before repartitioning the L3. This wasn't part of my original L3 partitioning code because I was able to reproduce hangs on Gen7 hardware when the state cache invalidation happened asynchronously with previous 3D rendering, which should no longer be possible after the previous change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:47:21 -08:00
Francisco Jerez	0aa4f99f56	i965: Fix cache pollution race during L3 partitioning set-up. We need to split the stalling flush from the RO cache invalidation into a different PIPE_CONTROL command to make sure that the top of the pipe invalidation happens after any previous rendering is complete. Otherwise it's possible for previous rendering to pollute the L3 cache in the short window of time between RO invalidation and the completion of the stalling flush. Fixes rendering artifacts on Unigine Heaven, Metro Last Light Redux and Metro 2033 Redux. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93540 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93599 Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:45:44 -08:00
Francisco Jerez	1817e3c07a	i965/fs: Don't emit unnecessary SEL instruction from emit_image_atomic(). The SEL instruction with predication mode NONE emitted when the atomic operation doesn't need to be predicated is a no-op and might rely on undocumented hardware behaviour. Noticed by chance while looking at the assembly output. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-08 15:43:05 -08:00
Matt Turner	c300559fbf	i965/vec4: Update vec4 unit tests for commit `01dacc83ff`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94050	2016-02-08 15:32:12 -08:00
Francisco Jerez	cec6fe2ad8	vtn: Clean up acos implementation. Parameterize build_asin() on the fit coefficients so the implementation can be shared while still using different polynomials for asin and acos. Also switch back to implementing acos in terms of asin -- The improvement obtained from cancelling out the pi/2 terms was negligible compared to the approximation error.	2016-02-08 15:23:43 -08:00
Francisco Jerez	f50a651726	nir/spirv: Create integer types of correct signedness. vtn_handle_type() creates a signed type regardless of the value of the signedness flag, which usually doesn't make much of a difference except when the type is used as base sampled type of an image type, which will cause the base type of the NIR image variable to be inconsistent with its format and cause an assertion failure in the back-end (most likely only reproducible on Gen7), and may change the semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).	2016-02-08 15:23:35 -08:00
Brian Paul	01dacc83ff	dri/common: include debug_output.h to silence warning	2016-02-08 10:52:02 -07:00
Brian Paul	59251610ed	tgsi: minor whitespace fixes in tgsi_scan.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	42246ab1f5	tgsi: s/true/TRUE/ in tgsi_scan.c Just to be consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	da6e879a6c	tgsi: use switches instead of big if/else ifs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	37eb3f0400	tgsi: break gigantic tgsi_scan_shader() function into pieces New functions for examining instructions, declarations, etc. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	3c3ef69696	st/mesa: minor formatting fixes in st_cb_bitmap.c	2016-02-08 09:29:38 -07:00
Brian Paul	5fdbfb8d6f	mesa: move GL_ARB_debug_output code into new debug_output.c file The errors.c file had grown quite large so split off this extension code into its own file. This involved making a handful of functions non-static. Acked-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-08 09:29:38 -07:00
Brian Paul	6691ba1fe8	gallium/util: whitespace, formatting fixes in u_debug_stack.c	2016-02-08 09:29:38 -07:00
Brian Paul	5d2539cb49	gallium/util: whitespace, formatting fixes in u_staging.[ch] files Still some nonsensical comments.	2016-02-08 09:29:38 -07:00
Brian Paul	c84a8911fc	gallium/util: switch over to new u_debug_image.[ch] code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Brian Paul	3917c8f3f9	gallium/util: put image dumping functions into separate file To try to reduce the clutter in u_debug.[ch] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Brian Paul	6c7d4a7173	gallium/util: whitespace, formatting fixes in u_debug.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Samuel Pitoiset	efe5829578	trace: add missing pipe_context::clear_texture() This fixes a crash with bin/arb_clear_texture-base-formats and probably some other tests which use clear_texture(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-08 00:06:32 +01:00
Samuel Pitoiset	1dacbb7b46	trace: remove useless MALLOC() in trace_context_draw_vbo() There is no need to allocate memory when unwrapping the indirect buf. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-08 00:06:22 +01:00
Vinson Lee	ccaf734275	mesa/extensions: Fix NVX_gpu_memory_info lexicographical order. Fixes MesaExtensionsTest.AlphabeticallySorted. Fixes: `1d79b99580` ("mesa: implement GL_NVX_gpu_memory_info (v2)") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94016 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-07 14:42:00 -08:00
Ilia Mirkin	88519c6087	glsl: return cloned signature, not the builtin one The builtin data can get released with a glReleaseShaderCompiler call. We're careful everywhere to clone everything that comes out of builtins except here, where we accidentally return the signature belonging to the builtin version, rather than the locally-cloned one. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Rob Herring <robh@kernel.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-07 17:23:58 -05:00
Ilia Mirkin	ac57577e29	glsl: make sure builtins are initialized before getting the shader The builtin function shader is part of the builtin state, released when glReleaseShaderCompiler is called. We must ensure that the builtins have been (re)initialized before attempting to link with the builtin shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Rob Herring <robh@kernel.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-07 17:23:57 -05:00
Samuel Pitoiset	04c2ca5038	tgsi: use TGSI_WRITEMASK_XYZW instead of hardcoding the mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Serge Martin <edb+mesa@sigluy.net>	2016-02-06 20:24:41 +01:00
Kristian Høgsberg Kristensen	6c4c04690f	anv: Deduplicate dispatch calls This can all be shared between gen8+ and pre-gen8.	2016-02-05 22:36:53 -08:00
Timothy Arceri	ea7f64f74d	glsl: don't generate transform feedback candidate when not required If we are not even looking for one don't bother generating a candidate list. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-06 14:34:43 +11:00
Timothy Arceri	c1bbaff1e8	glsl: replace unreachable code with an assert() All interface blocks will have been lowered by this point so just use an assert. Returning false would have caused all sorts of problems if they were not lowered yet and there is an assert to catch this later anyway. We also update the tests to reflect this change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-06 14:34:35 +11:00
Jan Vesely	e377037bef	r600, compute: Do not overwrite pipe_resource.screen found by inspection. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-05 21:17:15 -05:00
Kristian Høgsberg Kristensen	bdefaae2b9	anv: Deduplicate anv_CmdDraw calls These were all duplicated between gen7_cmd_buffer.c and gen8_cmd_buffer.c. This commit consolidates both copies in genX_cmd_buffer.c.	2016-02-05 16:41:56 -08:00
Kristian Høgsberg Kristensen	6cdada0360	anv: Move invariant state to small initial batch We use the simple batch helper to submit a batch at driver startup time which holds all the state that never changes. We don't have a whole lot and once we enable tessellation there'll be even less. Even so, it's a simple mechanism and reduces our steady state batch sizes a bit.	2016-02-05 16:13:53 -08:00
Kristian Høgsberg Kristensen	c9c3344c4f	anv: Split out batch submit helper from anv_DeviceWaitIdle We'll reuse this mechanism in the next commit.	2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen	381d85545a	anv: Share scratch_space helper between gen7 and gen8+ The gen7 pipeline has a useful helper function for this, let's use it in gen8_pipeline.c too. The gen7 function has an off-by-one bug though: we have to compute log2(size / 1024) - 1, but we divide by 2048 instead so as to avoid the case where size is less than 1024 and we'd return -1.	2016-02-05 16:13:52 -08:00
Kristian Høgsberg Kristensen	d1617dbec3	anv: Share URB setup between gen7 and gen8+	2016-02-05 16:13:52 -08:00
Jason Ekstrand	9401516113	Merge remote-tracking branch 'mesa-public/master' into vulkan	2016-02-05 15:21:11 -08:00
Jason Ekstrand	741744f691	Merge commit mesa-public/master into vulkan This pulls in the patches that move all of the compiler stuff around	2016-02-05 15:03:44 -08:00
Jason Ekstrand	9645b8eb1f	Merge branch mesa-public/master into vulkan	2016-02-05 14:21:13 -08:00
Jan Vesely	5b51b2e000	r600g: Ignore format for PIPE_BUFFER targets Fixes compute since `7dd31b81fe` gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 20:23:56 +01:00
Marek Olšák	d8e4908b63	mesa/get: fix a breakage after rebase trivial.	2016-02-05 19:39:13 +01:00
Matt Turner	9f2e22bf34	i965/vec4: don't copy ATTR into 3src instructions with complex swizzles The vec4 backend, at the end, does this: if (inst->is_3src()) { for (int i = 0; i < 3; i++) { if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0) assert(brw_is_single_value_swizzle(inst->src[i].swizzle)); So make sure that we use the same conditions when trying to copy-propagate. UNIFORMs will be converted to vstride 0 in convert_to_hw_regs, but so will ATTRs when interleaved (as will happen in a GS with multiple attributes). Since the vstride is not set at copy-prop time, infer it by inspecting dispatch_mode and reject ATTRs if they have non-scalar swizzles and are interleaved. Fixes assertion errors in dolphin-generated geometry shaders (or misrendering on opt builds) on Sandybridge or on IVB/HSW with INTEL_DEBUG=nodualobj. Co-authored-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93418 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-05 09:33:19 -08:00
Marek Olšák	1106e79ed9	docs/relnotes: document memory info extensions	2016-02-05 17:47:59 +01:00
Marek Olšák	635555af6a	gallium/radeon: implement query_memory_info (v2) v2: don't use DIV_ROUND_UP (not so useful) also return eviction stats Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:58 +01:00
Marek Olšák	5f51a24a77	st/mesa: implement and enable memory info extensions (v2) v2: assert and return if query_memory_info is not set rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:53 +01:00
Marek Olšák	837f74aa51	mesa: implement GL_ATI_meminfo (v2) v2: rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:20 +01:00
Marek Olšák	1d79b99580	mesa: implement GL_NVX_gpu_memory_info (v2) v2: implement eviction queries properly add gl_memory_info structure Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:30:07 +01:00
Marek Olšák	d2e4c9e737	gallium: add interface for querying memory usage and sizes (v2) If you're worried about the duplication of some CAPs, we can remove them later. v2: add fields for memory eviction stats Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:29:38 +01:00
Marek Olšák	c577f2843a	gallium/radeon: remove radeon_info::r600_tiling_config Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:29:19 +01:00
Marek Olšák	4f96846d9d	gallium/radeon: get pipe_interleave_bytes AKA group_bytes from the winsys Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:59 +01:00
Marek Olšák	276621da45	gallium/radeon: set num_banks in the winsys amdgpu doesn't have to set this, because radeonsi gets it from tile mode arrays by default. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:40 +01:00
Marek Olšák	294ec530c9	gallium/radeon: just get num_tile_pipes from the winsys Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:24 +01:00
Marek Olšák	0f3556d308	winsys/amdgpu: add an assertion to cik_get_num_tile_pipes (v2) v2: print an error to stderr Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:18 +01:00
Marek Olšák	a2291f7b57	winsys/amdgpu: remove an r600-only setting Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:12 +01:00
Marek Olšák	1e864d7379	gallium/radeon: rename & reorder members of radeon_info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:00 +01:00
Steinar H. Gunderson	feb53912f8	mesa: Fix locking of GLsync objects. GLsync objects had a race condition when used from multiple threads (which is the main point of the extension, really); it could be validated as a sync object at the beginning of the function, and then deleted by another thread before use, causing crashes. Fix this by changing all casts from GLsync to struct gl_sync_object to a new function _mesa_get_and_ref_sync() that validates and increases the refcount. In a similar vein, validation itself uses _mesa_set_search(), which requires synchronization -- it was called without a mutex held, causing spurious error returns and other issues. Since _mesa_get_and_ref_sync() now takes the shared context mutex, this problem is also resolved. Fixes bug #92757, found while developing Nageru, my live video mixer (due for release at FOSDEM 2016). v2: Marek: silence warnings, fix declaration after code Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 17:18:17 +01:00
Nicolai Hähnle	156e81f305	radeonsi: add placeholder MC and SRBM performance counter groups Yet another change motivated by AMD GPUPerfStudio compatibility. These groups are not directly accessible from userspace, and AMD GPUPerfStudio does not actually query them - it just requires them to be there. Hence, adding a placeholder for now. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:33 -05:00
Nicolai Hähnle	988f4b31f3	radeonsi: re-order the SQ_xx performance counter blocks This is yet another change motivated by appeasing AMD GPUPerfStudio's hardcoding of performance counter group numbers. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:30 -05:00
Nicolai Hähnle	75affd73b0	radeonsi: re-order the perfcounter hardware blocks As documented in the comment, AMD GPUPerfStudio unfortunately hardcodes the order of performance counter groups. Let's do the pragmatic thing and present the same order as Catalyst/Crimson. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:27 -05:00
Nicolai Hähnle	b0e32548c8	gallium/radeon: add GPIN driver query group This group was used by older versions of AMD GPUPerfStudio (via AMD_performance_monitor) to identify the GPU family, and GPUPerfStudio still complains when it isn't available. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:24:59 -05:00
Nicolai Hähnle	4b672b8310	radeonsi: Allow dumping LLVM IR before optimization passes Set R600_DEBUG=preoptir to dump the LLVM IR before optimization passes, to allow diagnosing problems caused by optimization passes. Note that in order to compile the resulting IR with llc, you will first have to run at least the mem2reg pass, e.g. `opt -mem2reg -S < shader.ll | llc -march=amdgcn -mcpu=bonaire` Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> (original patch) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (w/ debug flag) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:22:04 -05:00
Nicolai Hähnle	5aafc169ca	gallium/radeon: emit LLVM `ret void` before radeon_llvm_finalize_module This allows dumping a consumable LLVM module before the initial optimization passes are run. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:54 -05:00
Nicolai Hähnle	7e9670c8bc	st/mesa: bail out of try_pbo_upload_common when constant upload fails Also fixes a resource leak when an upload_mgr is used for constants. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:51 -05:00
Nicolai Hähnle	a01e44adcc	st/mesa: bail out of try_pbo_upload_common when vertex upload fails At the same time, fix a memory leak noticed by Ilia Mirkin. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:48 -05:00
Nicolai Hähnle	b27c79bd81	st/mesa: reduce the scope of sampler_view in try_pbo_upload_common We can get rid of our reference immediately, since the driver will hold onto it for us. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:44 -05:00
Nicolai Hähnle	13e21e3ec5	st/mesa: do uploads earlier in try_pbo_upload_common While rather unlikely, uploads _can_ fail. Doing them earlier means we'll have to restore less state when they do fail, and it's slightly easier to check the restore code. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:27 -05:00
Neil Roberts	eb9cf3cfc9	main: Use a derived value for the default sample count Previously the framebuffer default sample count was taken directly from the value given by the application. On the i965 driver on HSW if the value wasn't one that is supported by the hardware it would hit an assert when it tried to program the state for it. This patch fixes it by adding a derived sample count to the state for the default framebuffer. The driver can then quantize this to one of the valid values in its UpdateState handler when the _NEW_BUFFERS state changes. _mesa_geometric_samples is changed to use the new derived value. Fixes the piglit test arb_framebuffer_no_attachments-query Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93957 Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:10 +00:00
Neil Roberts	5fd848f6c9	program: Use _mesa_geometric_samples to calculate gl_NumSamples Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:06 +00:00
Neil Roberts	4995d9c9a0	main: Use _mesa_geometric_samples to calculate GL_SAMPLE_BUFFERS Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:01 +00:00
Neil Roberts	d8d4661ddb	main: Use _mesa_geometric_samples to calculate the value of GL_SAMPLES Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:04:44 +00:00
Ilia Mirkin	2065e380b2	nvc0: avoid negatives in PUSH_SPACE argument Fixup to commit `03b3eb90d` - the number of buffers could be larger than the number of elements, in which case we'd pass a negative argument to PUSH_SPACE, which would be bad. While we're at it, merge it with the other PUSH_SPACE at the top of the function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:49:51 -05:00
Ilia Mirkin	03b3eb90d7	nvc0: add some missing PUSH_SPACE's nvc0_vbo has explicit push space checking enabled, so we must run PUSH_SPACE by hand. A few spots missed that. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:41:43 -05:00
Ilia Mirkin	1a0fde1f52	nvc0/ir: fix converting between predicate and gpr The spill logic will insert convert ops when moving between files. It seems like the emission logic wasn't quite ready for these converts. Tested on fermi, and visually looked at nvdisasm output for maxwell. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:41:33 -05:00
Ilia Mirkin	2fed18b8a5	nvc0: add support for ARB_query_buffer_object Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	9cd5bb9f9f	st/mesa: add query buffer support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	f9e6f46335	gallium: add PIPE_CAP_QUERY_BUFFER_OBJECT Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	40d7f02c67	gallium: add a way to store query result into buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	386a9ec77b	mesa: add core implementation of ARB_query_buffer_object Forwards query result writes to drivers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	7c3f4b2fd8	mesa: add driver interface for writing query results to buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	3efcd4df01	mesa: Handle QUERY_BUFFER_BINDING in GetIntegerv Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: move to GL/GL_CORE section] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	2d0ec0c272	mesa: Add QueryBuffer to context Add QueryBuffer and initialise it to NullBufferObj on start Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: also release QueryBuffer on free] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	c5bab061da	mesa: Add ARB_query_buffer_object extension flag Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: add string to extensions.c] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	4913d381a0	glapi: Add xml infrastructure for ARB_query_buffer_object Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: move definition to gl_API.xml as it is very short] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Timothy Arceri	23e24e27ac	glsl: simplify setting of image access qualifiers Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-05 10:05:40 +11:00
Timothy Arceri	815929bd15	mesa: remove dead program parameter functions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-05 09:11:00 +11:00
Axel Davy	94d91c6707	st/nine: Use align_free when needed Use align_free to free memory allocated with align_malloc. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	6b12fe77ea	st/nine: Disallow non-argb8888 cursors Only argb8888 cursors are allowed. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	24ddadbba9	st/nine: Enforce centroid for color input when multisampling is on The color inputs must automatically use centroid whether multisampling is used or not. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	d5389bb92d	st/nine: Fix centroid flag sem.reg.mod & NINED3DSPDM_CENTROID is worth 4 when centroid is requested, whereas TGSI_INTERPOLATE_LOC_CENTROID is worth 1. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	ee31f0fed4	st/nine: Use fast clears more often for MRTs This enables the use of fast clears in the following case: pixel shader renders to 1 RT; 4 RT bound; clear; new pixel shader bound that renders to 4 RTs. Previously the fast clear path wouldn't be hit, because when trying the fast clear path, the framebuffer state would be configured for 1 RT, instead of 4. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	e85ef7d8e5	st/nine: Use linear filtering for shadow mapping Some docs say linear filtering is always used when app does shadow mapping. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	0b35da59de	st/nine: Respect block alignment on surface lock Respect block alignment for ATI1/ATI2 format when trying to lock a surface using LockRect(). Fixes failing WINE tests device.c test_surface_blocks() tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	56b4222b29	st/nine: Add Render state validation layer Testing Win behaviour seems to show wrong states are accepted, but then depending on the states some specific 'good' behaviours happen. This adds some validation to catch invalid states and have these 'good' behaviours when it happens. Also reorders SetRenderState to match the expected optimisation: (Value == previous Value) => return immediately, which affects D3D9 hacks too. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	7132617436	DRI_CONFIG: Add option to override vendor id Add config option override_vendorid to report a fake card in d3dadapter9 drm. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	1a893ac886	st/nine: Implement NineDevice9_GetAvailableTextureMem Implement a device private memory counter similar to Win 7. Only textures and surfaces increment vidmem and may return ERR_OUTOFVIDEOMEMORY. Vertex buffer and index buffer creation always succeeds, even when out of video memory. Fixes "Vampire: The Masquerade - Bloodlines" allocating resources until crash. Fixes "Age of Conan" allocating resources until crash. Fixes failing WINE test device.c test_vidmem_accounting(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	a961ec335d	st/nine: Handle Window Occlusion Apps can know if the window is occluded by checking for specific error messages. The behaviour is different for Device9 and Device9Ex. This allow games to release the mouse and stop rendering until the focus is restored. In case of multiple swapchain we do care only of the device one. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	e59908e57f	st/nine: Store minor version num To keep compatible with older ID3DPresent interfaces (used to talk with Wine), store the minor version num accessible to all statetracker functions (in the NineDevice9 structure). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	0ac01a9fd7	st/nine: Call flush_resource before flush flush_resource needs to be called before flush (for fast clear resolve, etc). Removes useless computation of resource (it is already set correctly). Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	f481b9b952	st/nine: Fix remaining swapchain tests Return D3DERR_INVALIDCALL instead of E_POINTER. On error set ppBackBuffer to NULL. Multiple swapchains can only be created in windowed mode as windowed swapchain. Set backbuffer to NULL in NineDevice9_GetBackBuffer, but not in NineSwapChain9_GetBackBuffer. This fixes all WINE's device.c test_swapchain() tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	cbbd3c65cc	st/nine: Fix crash NineDevice9_CreateAdditionalSwapChain When no window is specified, we should revert to the focus window. This deserves more tests however (what if the device swapchain is already using the focus window?). Fixes crash for FFXIV. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	996f76bd8a	st/nine: Fix possible crash on error In case swapchain creation fails, This->swapchains[i] might be NULL and cause a crash. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	40a0b97ebd	st/nine: Test more presentation params Return errors in case of invalid presentation parameters. Fixes failing WINE tests device.c test_swapchain_parameters(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	827fee059e	st/nine: Fix resource9 private data Store a copy of GUID in the header that is under our control and use it as key for the hashtable instead of using the application provided pointer. The application might change the memory after leaving the function. Fixes a crash for issue https://github.com/iXit/Mesa-3D/issues/130 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	5c79bd666b	st/nine: Print GUID instead of pointer To ease debugging print the GUID instead of the pointer to it. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	2a4d1509c8	st/nine: Fix use of uninitialized memory The values of box.z and box.depth weren't set, which led to a crash. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	924038c08f	st/nine: Fix clear for multisample mismatch depth-stencil Tests show in case of multisample mismatch between the depth-stencil buffer and the render target, then it is not cleared. Fixes failing WINE test visual.c test_multisample_mismatch(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	7f58ba45a8	st/nine: Fix Volumetexture9_LockBox Check for valid locked box dimensions. Fixes failing wine tests device.c test_lockbox_invalid. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	35047681ff	st/nine: Fix ATI2 pitch for non-square Fixes crash for non-square textures. We were using the height instead of the width for some calculations. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	eeeab8d6b4	st/nine: Support D3DFMT_R8G8B8 Add support for D3DFMT_R8G8B8. It allows format conversion for surfaces of pool scratch. Usually gallium formats equivalents for d3d9 formats have their names reversed. The gallium format PIPE_FORMAT_R8G8B8_UNORM is the right equivalent here, and its name is likely wrong (reversed). Fixes a crash in TmNationsForever. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	a3e7525ada	st/nine: Use cso for viewport Use CSO to catch redundant viewport changes. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	495727af6b	st/nine: Fix shade mode flat Shade mode flat is only working if pixelshaders have interpolate set to TGSI_INTERPOLATE_COLOR on color inputs. Fixes failing WINE tests visual.c test_shademode(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	fa887ba65b	st/nine: Clear rendertarget on creation Clear every rendertarget on creation. Fixes https://github.com/iXit/Mesa-3D/issues/139 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	b142f61621	st/nine: Allow ColorFill on D3DFMT_NULL surfaces Report success instead of failing as there's no resource for those surfaces. Fixes a crash in Crysis: Warhead. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	04e22a04a6	st/nine: Introduce STREAMFREQ state Previous vertex elements code update was protected by 'if ((group & (NINE_STATE_VDECL | NINE_STATE_VS)) || state->changed.stream_freq & ~1)' itself protected by 'if (group & (NINE_STATE_COMMON | NINE_STATE_VS))' If no state is changed except the stream frequency, no update would happen. This patch solves the problem by adding a new NINE_STATE_STREAMFREQ state. Another way would be to add state->changed.stream_freq & ~1 check to the main test. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	15ce2778fb	st/nine: Catch redundant SetStreamSourceFreq calls Some apps do redundant SetStreamSourceFreq calls. Catch them to improve performance. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	ea3f504f7c	st/nine: Squash indexbuffer9 and vertexbuffer9 The indexbuffer9 codebase was lagging behind the one of vertexbuffer9. Add buffer9 as common code base for indexbuffer9 and vertexbuffer9. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b6bb8d561a	st/nine: Unset vtxbuf on reset We forgot to reset vtxbuf. This fixes some crashes. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	b63c144d1e	st/nine: Use pipe_resource_reference for vtxbuf This seems cleaner to actually reference the resources for vtxbuf, rather than relying on the fact the bound d3d streams do. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b5876e4762	st/nine: Use ff vertex shader when position_t is used When an application sets a vertex shader, we are supposed to use it, and when no vertex shader is set, we are supposed to revert to fixed function vertex shader. It seems there is an exception: when the vertex declaration has a position_t index, we should revert to fixed function vertex shader. Up to now we were checking if device->state.vs is set to know whether to use programmable shader or not. With this commit we determine whether we use programmable shader or not when vertex shader/declaration are set, but stateblocks do complicate things a bit. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	531acbc56b	st/nine: Don't increment refcount on VertexDeclaration creation failure NineUnknown_ctor increments the refcount even in case of an error. Restructure the code to prevent refcount increments. Fixes a couple of wine tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b39fd5b1da	st/nine: Change StretchRect check order Textures in SYSTEMMEM don't have resources attached. Instead of returning an error for them, StretchRect was crashing. This changes the check order to fix that case. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	a82e67812a	st/nine: Initialize lights in stateblocks This fixes a crash. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9c1d93f8e7	st/nine: Fix fixed-function blendweights The last weighted element is one minus the sum of all previous weights. Fixes WINE test visual.c test_vertex_blending. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	cc830dc214	st/nine: Always normalize hitDir Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	ed7e1046b6	st/nine: Replace r[0] with tmp Replace r[0] with tmp to ease code reading. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9856203f5a	st/nine: Fix ff calculation of midVec In case of non local viewer the value has to be subtracted. Fixes failing WINE tests in test_specular_lighting() (visual.c) Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	921f3eac58	st/nine: Implement D3DRS_SPECULARENABLE Implement fixed function D3DRS_SPECULARENABLE. Fixes failing WINE tests in test_specular_lighting() (visual.c) Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9c26fa1b13	st/nine: Fix D3DRS_LOCALVIEWER being ignored Set key->localviewer to D3DRS_LOCALVIEWER. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	aa4454ae85	st/nine: Fix rounding issue with vs1.1 a0 reg vs1.1 rounds a0 to lowest integer, while other versions do round to closest. To use the same path as the other versions (with ARR), we were subtracting 0.5 for vs1.1 to get round to lowest. This gives wrong result if a0 is set to 0: round(0 - 0.5) = -1. Instead just use ARL for vs1.1. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	dbb03f6b5b	st/nine: Fix D3DPMISCCAPS_FOGANDSPECULARALPHA support The documentation of the flag doesn't make sense. To sum up the doc, if not set, specular alpha contains fog, and if set specular alpha contains 0 (except for ff). However in practice when the flag is there, apps do use specular alpha as if it could be used normally, which makes much more sense than the doc. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9298a0b81b	st/nine: Fix AlphaCmpCaps AlphaCmpCaps should advertise D3DPCMPCAPS_NEVER as well. Fixes https://github.com/iXit/Mesa-3D/issues/142 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Chad Versace	3eebf3686b	anv: Drop anv_image::needs_*_surface_state anv_image::needs_sampler_surface_state was a redundant member, identical to (usage & VK_IMAGE_USAGE_SAMPLED_BIT). Likewise for the other needs_* members.	2016-02-04 12:20:51 -08:00
Chad Versace	42b9320fbf	anv/image: Rename nonrt_surface_state Let's call it what it is, not what it is not. Rename it to 'sampler_surface_state'.	2016-02-04 12:20:51 -08:00
Marek Olšák	bff640b3e0	radeonsi: implement PK2H and UP2H opcodes Based on a gallivm patch by Ilia Mirkin. +8 piglit regressions due to precision issues (I blame the tests) The benefit is that we'll get v_cvt_f32_f16 and v_cvt_f16_f32 instead of emulation with integer instructions. They are GLSL 4.00 intrinsics. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-04 19:52:28 +01:00
Matt Turner	973ba3f4d4	glsl: Ensure glsl/ exists before making the lexer/parser. Reported-by: Jan Ziak <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93989	2016-02-04 09:31:17 -08:00
Matt Turner	8c7a42b3e8	i965/fs: Allocate single register at a time for constants. No instruction counts changed, but: total cycles in shared programs: 64834502 -> 64781530 (-0.08%) cycles in affected programs: 16331544 -> 16278572 (-0.32%) helped: 4757 HURT: 4288 GAINED: 66 LOST: 20 I remember trying this when I first wrote the pass, but it wasn't helpful at the time. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-04 09:30:58 -08:00
Marek Olšák	8ec24678ac	radeonsi: fix Hyper-Z on Stoney Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-04 16:47:41 +01:00
Patrick Baggett	9c78cfd547	mesa: Use SSE prefetch instructions rather than 3DNow instructions 64-bit Pentium 4 CPUs don't have the 3DNow prefetch instructions which results in an Illegal instruction crash. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Timothy Arceri <t_arceri@yahoo.com.au> https://bugs.freedesktop.org/show_bug.cgi?id=27512	2016-02-04 22:02:31 +11:00
Jason Ekstrand	1f5d56304f	anv/descriptor_set: Fix descriptor copies We weren't pulling the actual binding location information out of the set layout. The new code mirrors the descriptor write code.	2016-02-03 22:44:33 -08:00
Ilia Mirkin	edd494ddf0	nv50/ir: make sure to fetch all sources before creating instruction We must fetch all sources into the instruction stream before generating the instruction that uses them. Otherwise we'll define values after using them, which won't work so well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:38 -05:00
Ilia Mirkin	a9d5c64c34	nv50: avoid freeing the symbols if they're about to be stored Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:26 -05:00
Ilia Mirkin	9284fd9c0d	st/mesa: fix potential null deref if no shader is passed in Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:13 -05:00
Ilia Mirkin	5ac7f0433b	glx: update to updated version of EXT_create_context_es2_profile The EXT spec has been updated to: - logically combine the es2_profile and es_profile exts - allow any legal version to be requested dEQP tests request a specific ES version when using GLX, so this allows dEQP upstream to run against GLX with the appropriate X server patch (which had similar disabling logic). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Adam Jackson <ajax@redhat.com> (v3) v1 -> v2: - distinguish between DRI_API_GLES{,2,3} - add GLX_EXT_create_context_es_profile client-side support v2 -> v3: - fix error in computing mask	2016-02-03 15:44:51 -05:00
Ilia Mirkin	ad0e48e518	dir-locals.el: set case-label offset to 0 While this is the default, private .emacs files might have it set to something else. No harm in forcing it to 0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-02-03 15:44:51 -05:00
Jose Fonseca	1c0f95f602	appveyor: Bump shallow clone depth. To prevent build failures when a large patch series is committed, like happened in https://ci.appveyor.com/project/jrfonseca-fdo/mesa/build/322 due to 10 commits between `dac2964f3e` and `6f428328d3` that were submitted before the build slave started the git clone. 100 commits should be bigger than any patch series seen in practice, and it takes practically the same time to download as 5 commits. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-03 19:37:19 +00:00
Rob Clark	029c89a0cc	Revert "compiler: removed unused Makefile.sources" Whoops, didn't mean to push this one. This reverts commit `78f4c555b9`.	2016-02-03 14:35:10 -05:00
Rob Clark	1be9184ff3	compiler: fix .gitignore for glsl_compiler Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-03 13:32:46 -05:00
Rob Clark	78f4c555b9	compiler: removed unused Makefile.sources We seem to end up w/ duplication between compiler/Makefile.sources and compiler/glsl/Makefile.sources. The latter appears unused. Delete it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-03 13:19:45 -05:00
Nicolai Hähnle	43a401a792	gallium: fix the documentation of PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE This parameter is equivalent to the corresponding OpenGL implementation limit which is in texels, not bytes. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:12:37 +01:00
Nicolai Hähnle	7dd31b81fe	gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This is already used internally in si_resource_copy_region for compressed textures, so the only real change here is the adjusted surface size computation. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:37 +01:00
Nicolai Hähnle	4b02f16537	st/mesa: implement PBO upload for glCompressedTex(Sub)Image v2: - use st->pbo_upload.enabled flag Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:37 +01:00
Nicolai Hähnle	f38bb36f57	st/mesa: redirect CompressedTexSubImage to our own implementation This is where PBO upload will go. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Nicolai Hähnle	16c2ea1fcc	st/mesa: inline the implementation of _mesa_store_compressed_teximage We will write our own version of texsubimage for PBO uploads, and we will want to call that here as well. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Nicolai Hähnle	c99f2fe70e	st/mesa: implement PBO upload for multiple layers Use instancing to generate two triangles for each destination layer and use a geometry shader to route the layer index. v2: - directly write layer in VS if supported by the driver (Marek Olšák) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Fredrik Höglund	757071ca7c	st/mesa: Accelerate PBO uploads Create a PIPE_BUFFER sampler view on the pixel-unpack buffer, and draw the image on the texture with a fragment shader that maps fragment coordinates to buffer coordinates. Modifications by Nicolai Hähnle: - various cleanups and fixes (e.g. error handling, corner cases) - split try_pbo_upload into two functions, which will allow code to be shared with compressed texture uploads - modify the source format selection to only test for support against the PIPE_BUFFER target v2: - update handling of TGSI_SEMANTIC_POSITION for recent changes in master - MaxTextureBufferSize is number of texels, not bytes (Ilia Mirkin) - only enable when integers are supported (Marek Olšák) - try harder to hit the TextureBufferOffsetAlignment - remove unnecessary MOV from the fragment shader Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:35 +01:00
Nicolai Hähnle	4a448a63ad	st/mesa: use the correct address generation functions in st_TexSubImage blit We need to tell the address generation functions about the dimensionality of the texture to correctly implement the part of Section 3.8.1 (Texture Image Specification) of the OpenGL 2.1 specification which says: "For the purposes of decoding the texture image, TexImage2D is equivalent to calling TexImage3D with corresponding arguments and depth of 1, except that ... * UNPACK SKIP IMAGES is ignored." Fixes a low impact bug that was found by chance while browsing the spec and extending piglit tests. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:35 +01:00
Nicolai Hähnle	6af6d7b08a	gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This cap indicates whether pipe->create_surface can reinterpret a texture as a surface with a format of different block width/height (but equal block size). v2: fix whitespace Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:34 +01:00
Nicolai Hähnle	3abb548ef6	gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY This cap indicates that the driver only supports R, RG, RGB and RGBA formats for PIPE_BUFFER sampler views. v2: move into "unsupported features" section for nouveau (Ilia Mirkin) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:34 +01:00
Nicolai Hähnle	bc8a6842a9	mesa: add MESA_NO_MINMAX_CACHE environment variable When set to a truish value, this globally disables the minmax cache for all buffer objects. No #ifdef DEBUG guards because this option can be interesting for benchmarking. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:04:11 +01:00
Nicolai Hähnle	761c7d59c4	vbo: disable the minmax cache when the hit rate is low When applications stream their index buffers, the caches for those BOs become useless and add overhead, so we want to disable them. The tricky part is coming up with the right heuristic for when to disable them. The first question is which hit rate to aim for. Since I'm not aware of any interesting borderline applications that do something like "draw two or three times for each upload", I just kept it simple. The second question is how soon we should give up on the caching. Applications might have a warm-up phase where they fill a buffer gradually but then keep reusing it. For this reason, I count the number of indices that hit and miss (instead of the number of calls that hit or miss), since comparing that to the size of the buffer makes sense. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:04:06 +01:00
Nicolai Hähnle	115c643b16	mesa: add USAGE_DISABLE_MINMAX_CACHE flag to buffer UsageHistory Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:59 +01:00
Nicolai Hähnle	6b057f8ecc	vbo: cache/memoize the result of vbo_get_minmax_indices (v3) Some game developers are unaware that an index buffer in a VBO still needs to be read by the CPU if some varying data comes from a user pointer (unless glDrawRangeElements and friends are used). This is particularly bad when they tell us that the index buffer should live in VRAM. This cache helps, e.g. lifting This War Of Mine (a particularly bad offender) from under 10fps to slightly over 20fps on a Carrizo. Note that there is nothing prohibiting a user from rendering from multiple threads simultaneously with the same index buffer, hence the locking. (The internal buffer map taken for the buffer still leads to a race, but at least the locks are a move in the right direction.) v2: disable the cache on USAGE_TEXTURE_BUFFER as well (Chris Forbes) v3: - use bool instead of GLboolean for MinMaxCacheDirty (Ian Romanick) - replace the sticky USAGE_PERSISTENT_WRITE_MAP bit by a direct AccessFlags check Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:49 +01:00
Nicolai Hähnle	1a570d96a6	vbo: move vbo_get_minmax_indices into its own source file We will add more code for caching/memoization. Moving the existing code into its own file helps keep things modular. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:48 +01:00
Nicolai Hähnle	46b7a526f5	mesa/main: bail earlier for size == 0 in _mesa_clear_buffer_sub_data Note that the conversion of the clear data (when data != NULL) can fail due to an out of memory condition, but it does not check any error conditions mandated by the spec. Therefore, it is safe to skip when size == 0. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:46 +01:00
Nicolai Hähnle	fd7229b437	mesa/main: add USAGE_PIXEL_PACK_BUFFER flag to buffer UsageHistory We will want to disable minmax index caching for buffers that are used in this way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:45 +01:00
Nicolai Hähnle	54c4a9803b	mesa/main: add USAGE_TRANSFORM_FEEDBACK_BUFFER flag to buffer UsageHistory We will want to disable minmax index caching for buffers that are used in this way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:41 +01:00
Nicolai Hähnle	55fb921d69	util/hash_table: add _mesa_hash_table_num_entries Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:35 +01:00
Nicolai Hähnle	8b11d8cfbf	util/hash_table: add _mesa_hash_table_clear (v4) v4: coding style change (Matt Turner) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3)	2016-02-03 14:03:25 +01:00
Leo Liu	6ad2e55a14	st/omx/dec/h264: fix corruption when scaling matrix present flag set The scaling list should be filled out with zig zag scan v2: integrate zig zag scan for list 4x4 to vl(Christian) v3: move list determination out from the loop(Ilia) Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-02-02 20:29:47 -05:00
Leo Liu	4f598f2173	vl: add zig zag scan for list 4x4 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-02-02 20:29:43 -05:00
Roland Scheidegger	848a023c05	llvmpipe: use scissor_planes_needed helper function So it doesn't get out of sync in multiple places.	2016-02-03 01:25:45 +01:00
Jordan Justen	141ef75569	i965/gen8: Initialize aux_mode to GEN8_SURFACE_AUX_MODE_NONE GEN8_SURFACE_AUX_MODE_NONE is 0, so this is a no-op. Yet, this also makes it clear that we can compare aux_mode to the other GEN8_SURFACE_AUX_MODE_ values. We will want to compare to GEN8_SURFACE_AUX_MODE_HIZ. v2: Some very minor cherry-pick conflicts due to moving it around in the series. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-02 15:44:18 -08:00
Mark Janes	6a7e2904e0	nir/spirv: fix build_mat4_det stack smasher When generating a sub-determinant matrix, a 3-element swizzle array was indexed with clever inline boolean logic. Unfortunately, when i and j are both 3, the index overruns the array, smashing the next variable on the stack. For 64 bit builds, the alignment of the 3-element unsigned array leaves 32 bits of spacing before the next local variable, hiding this bug. On i386, a subcolumn pointer was smashed then dereferenced.	2016-02-02 15:30:54 -08:00
Mark Janes	ea8c2d118a	anv: Fix anv_descriptor_set reference error on deletion anv_descriptor_set_destroy uses the descriptor set's set_layout member to iterate the set's buffer views. However, the set_layout reference may have previously been freed. On 64 bit builds, this bug generated valgrind errors but did not affect CTS test results. On 32 bit builds, it reliably produces assertions and memory corruption.	2016-02-02 15:28:01 -08:00
Kristian Høgsberg Kristensen	5a06bac4a0	anv: Use @LIB_DIR@ in anv_icd.json Otherwise we may get a lib vs lib64 mismatch.	2016-02-02 14:36:22 -08:00
Ilia Mirkin	18f688d62a	mesa: use default geometry's samples when there are no attachments Whether multisampling is turned on depends, in part, on whether attachments are themselves multisample surfaces. However when there are no attachments, we should rely on the default geometry for this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	095da3b550	mesa: invalidate framebuffer when changing parameters This fixes dEQP-GLES31.functional.fbo.completeness.no_attachments When the width or height are 0, the framebuffer is incomplete. We may also not have been passing the new state down to the driver when the widths/heights/etc changed. Make sure to dirty the state so that the framebuffer state is revalidated at draw time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	beac7b1b8b	mesa: use geometric helper for computing min samples In case we have a draw buffer without attachments, we should be looking at the default number of samples. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	2d4976fa19	mesa: the _mesa_geometric_* functions require full types from mtypes.h Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 17:08:46 -05:00
Jason Ekstrand	fd99f3d658	anv/device: Improve version error reporting	2016-02-02 13:16:13 -08:00
Jason Ekstrand	c7f26bbed9	vulkan: Bump the header to 1.0.3	2016-02-02 13:08:47 -08:00
Jason Ekstrand	0d2145b50f	anv/fence: Default to not ready This is kind-of silly. We really need to do a better job of making sure all objects have all their default values set. We probably also want to, eventually, put everything into the BO (to save memory) and, more specifically, make the GPU write the "ready" flag. That way GetFenceStatus won't ever have to call into the kernel.	2016-02-02 12:22:03 -08:00
Niels Ole Salscheider	fb44cfadce	winsys/radeon: Do not deinit the pb cache if it was not initialized This fixes a crash in pb_cache_release_all_buffers. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-02 21:11:15 +01:00
Marek Olšák	84a6d2d7d6	tgsi/scan: add tgsi_shader_info::reads_samplemask	2016-02-02 21:04:52 +01:00
Marek Olšák	0d68b91220	radeonsi: rework RB+ for Stoney This fixes it. States which also need to be taken into account: - SPI color formats - each down-conversion format supports only a limited set of SPI formats - whether MSAA resolving and logic op are enabled These need special handling: - blending - disabled channels Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:19 +01:00
Marek Olšák	066d76c2f4	radeonsi: rename cb_target_mask state to cb_render_state and rename a variable in the function. SX_PS_DOWNCONVERT will be emitted here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:19 +01:00
Marek Olšák	5f0f9a5619	radeonsi: treat intensity render targets exactly like red The motivation is to simplify the Stoney RB+ code. Intensity is already treated as red except here. No piglit regressions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:18 +01:00
Marek Olšák	f96f94966d	tgsi: set correct src type for UP2H Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-02 21:02:26 +01:00
Connor Abbott	19db71807f	util/hash_table: don't compare deleted entries The equivalent of the last patch for the hash table. I'm not aware of any issues this fixes. v2: - use entry_is_deleted (Timothy) Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-02 14:42:40 -05:00
Connor Abbott	8fc2f652a2	util/set: don't compare against deleted entries When we delete entries in the hash set, we mark them "deleted" by setting their key to the deleted_key, which points to a dummy deleted_key_value. When searching for an entry, we normally skip over those, but set_add() had some code for searching for duplicate entries which forgot to skip over deleted entries. This led to a segfault inside the NIR vectorization pass, since its key comparison function interpreted the memory where deleted_key_value resides as a pointer and tried to dereference it. v2: - add better commit message (Timothy) - use entry_is_deleted (Timothy) Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-02 14:42:32 -05:00
Jordan Justen	bd97b62525	glsl: Disable tree grafting optimization for shared variables Fixes: * dEQP-GLES31.functional.compute.basic.shared_atomic_op_multiple_groups * dEQP-GLES31.functional.compute.basic.shared_atomic_op_multiple_invocation * dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_group * dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_invocation From https://android.googlesource.com/platform/external/deqp Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-02 10:50:40 -08:00
Jordan Justen	afef1422cb	glsl: Enable debug prints for do_common_optimization Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-02 10:50:40 -08:00
Roland Scheidegger	5e090079e1	Revert "i965: Provide sse2 version for rgba8 <-> bgra8 swizzle" This reverts commit `ab30426e33`. Apparently the memory isn't quite as aligned when this gets called as it should be, causing crashes. (Albeit this looks independent from this code, should crash just as well if ssse3 is enabled when compiling without this patch.) https://bugs.freedesktop.org/show_bug.cgi?id=93962	2016-02-02 15:45:59 +01:00
Dave Airlie	e7a27f70b9	virgl: mark function as static This is fallout from the previous changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93961 Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 17:55:40 +10:00
Roland Scheidegger	7221b8aec6	gallivm: add PK2H/UP2H support Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with). Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:20 +01:00
Roland Scheidegger	5171ec9ca9	gallivm: add PK2H/UP2H support Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with).	2016-02-02 05:58:19 +01:00
Roland Scheidegger	dc16086e3b	tgsi: add PK2H/UP2H support The util functions handle the half-float conversion. Note that piglit won't like it much due to: a) The util functions use magic float mul conversion but when run inside softpipe/llvmpipe, denorms are flushed to zero, therefore when the conversion is from/to f16 denorm the result will be zero. This is a bug which should be fixed in these functions (should not rely on denorms being available), but will happen elsewhere just the same (e.g. conversion to f16 render targets). b) The util functions use trunc round mode rather than round-to-nearest. This is NOT a bug (as it is a d3d10 requirement). This will result in rounding not representable finite values to MAX_F16 rather than INFINITY. My belief is the piglit tests are wrong here but it's difficult to tell (generally glsl rounding mode is undefined, however I'm not sure if rounding mode might need to be consistent for different operations). Nevertheless, for gl it would be better to use round-to-nearest, but using different rounding for GL and d3d10 is an unsolved problem (as it affects things like conversion to f16 render targets, clear colors, this shader opcode). Hence for now don't enable the cap bit (so the code is unused). (Code is from imirkin, comment from sroland) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmvware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	99bd96abbb	llvmpipe: drop scissor planes early if the tri is fully inside them If the tri is fully inside a scissor edge (or rather, we just use the bounding box of the tri for the comparison), then we can drop these additional scissor "planes" early. We do not even need to allocate space for them in the tri. The math actually appears to be slightly iffy due to bounding boxes being rounded, but it doesn't matter in the end. Those scissor rects are costly - the 4 planes from the scissor are already more expensive to calculate than the 3 planes from the tri itself, and it also prevents us from using the specialized raster code for small tris. This helps openarena performance by about 8% or so. Of course, it helps there that while openarena often enables scissoring (and even moves the scissor rect around) I have not seen a single tri actually hit the scissor rect, ever. v2: drop individual scissor edges, and do it earlier, not even allocating space for them. v3: help the compiler a bit with simpler code, suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	9d2a34e105	llvmpipe: minor cleanup of sse2 for calc_fixed_position Just slightly simpler assembly. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	8aa168eb8f	llvmpipe: use vector loads for (optimized) tri raster funcs When we switched to 64bit rasterization, we could no longer use straight aligned loads for loading the plane data. However, what the code actually does for loading 3 planes, is 12 scalar loads + 9 unpacks, and then there's another 8 unpacks for the transpose we need (!). It would be possible to do the (scalar) loads of course already transposed (at least saving the additional unpacks), however instead just use (un)aligned vector loads, and recalculate the eo values, which is much less instructions (note in case of the triangle_32_3_4 case, the eo values are not even used, making the scalar loads + unpacks for them all the more pointless). This drops execution time of the triangle_32_3_4 function considerably, albeit it doesn't really make a measurable difference (for small tris we're essentially limited by vertex throughput in any case), for triangle_32_3_16 it's essentially noise (the loop is more costly than the initial code there). (I'm thinking about just ditching storing the eo values in the plane data, so could switch back to using aligned planes, however right now they are still used in the other raster functions dealing with planes with scalar code. Also not touching the ppc code, might not be that bad there in any case.) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	ab30426e33	i965: Provide sse2 version for rgba8 <-> bgra8 swizzle The existing code used ssse3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas sse2 is always present at least with 64bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	116e4dc995	mesa: fix typo in python scripts Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-02 05:58:19 +01:00
Rob Herring	f0f4259324	virgl: also build vtest for Android Enabling swrast on Android causes a link error because vtest is missing. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:51 +10:00
Rob Herring	2d3301e4d5	virgl: fix reference counting of prime handles The virgl reference counting of buffers is broken for prime fd buffers. Each prime fd passed into virgl_drm_winsys_resource_create_handle creates a new resource. The solution requires creating a separate hash table to track flink names separately from prime handles. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:29 +10:00
Rob Herring	f87330dbce	virgl: reuse screen when fd is already open It is necessary to share the screen between mesa and gralloc to properly ref count resources. This implements a hash lookup on the file description to re-use an already created screen. This is a similar implementation as freedreno and radeon. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:29 +10:00
Mark Janes	ac0589b213	i965: fix unsigned long overflows for i386 bit-shifts on 32 bit unsigned longs overflow in several places. The intention was for 64 bit integers to be used.	2016-02-01 14:52:22 -08:00
Mauro Rossi	6711592c2f	nouveau/video: wrap assertion within #ifndef NDEBUG The change is necessary to avoid the following building error in android: external/mesa/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c: In function 'nouveau_vp3_bsp_next': external/mesa/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c:269:14: error: 'bsp_bo' undeclared (first use in this function) assert(bsp_bo->size >= str_bsp->w0[0] + num_bytes[i]); ^ This matches the declaration of the variables in question. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-01 17:45:19 -05:00
Ilia Mirkin	047b917718	st/mesa: treat a write as a read for range purposes We use this logic to detect live ranges and then do plain renaming across the whole codebase. As such, to prevent WaW hazards, we have to treat a write as if it were also a read. For example, the following sequence was observed before this patch: 13: UIF TEMP[6].xxxx :0 14: ADD TEMP[6].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[7].x, TEMP[3].xxxx 16: MUL TEMP[3].x, TEMP[6].xxxx, TEMP[7].xxxx 17: ADD TEMP[6].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[7].x, TEMP[3].xxxx 19: MUL TEMP[4].x, TEMP[6].xxxx, TEMP[7].xxxx While after this patch it becomes: 13: UIF TEMP[7].xxxx :0 14: ADD TEMP[7].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[8].x, TEMP[3].xxxx 16: MUL TEMP[4].x, TEMP[7].xxxx, TEMP[8].xxxx 17: ADD TEMP[7].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[8].x, TEMP[3].xxxx 19: MUL TEMP[5].x, TEMP[7].xxxx, TEMP[8].xxxx Most importantly note that in the first example, the second RCP is done on the result of the MUL while in the second, the second RCP should have the same value as the first. Looking at the GLSL source, it is apparent that both of the RCP's should have had the same source. Looking at what's going on, the GLSL looks something like float tmin_8; float tmin_10; tmin_10 = tmin_8; ... lots of code ... tmin_8 = tmpvar_17; ... more code that never looks at tmin_8 ... And so we end up with a last_read somewhere at the beginning, and a first_write somewhere at the bottom. For some reason DCE doesn't remove it, but even if that were fixed, DCE doesn't handle 100% of cases, esp including loops. With the last_read somewhere high up, we overwrite the previously correct (and large) last_read with a low one, and then proceed to decide to merge all kinds of junk onto this temp. Even if that weren't the case, and there were just some writes after the last read, then we might still overwrite a merged value with one of those. As a result, we should treat a write as a last_read for the purpose of determining the live range. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-02-01 17:40:18 -05:00
Jason Ekstrand	8776d3cb8e	nir/spirv: Fix UBO loads of a single element of a row-major matrix	2016-02-01 14:03:05 -08:00
Jason Ekstrand	499f7c2f0b	nir/spirv: Handle the LOD parameter of OpImageQuerySizeLod	2016-02-01 14:03:05 -08:00
Jason Ekstrand	b1a1623293	nir/spirv: Add support for SpvOpImage	2016-02-01 14:03:05 -08:00
Jason Ekstrand	593f88c0db	nir/spirv: Fix the UBO loading case of a single row-major matrix column	2016-02-01 14:03:05 -08:00
Jason Ekstrand	abc0e5c1b8	nir/spirv: Fix the UBO loading case of a single row-major matrix column	2016-02-01 13:26:59 -08:00
Jason Ekstrand	2d2c6fc6bb	anv/wsi/wayland: Advertise sRGB	2016-02-01 13:06:35 -08:00
Jason Ekstrand	443c578bca	anv/wsi/x11: Expose SRGB all the time After a long discussion with Eric Anholt and Owen Taylor, I learned that X11 is basically always sRGB as that's what the scanout hardware does and X doesn't modify anything. Therefore, we should just always expose sRGB formats.	2016-02-01 13:06:35 -08:00
Chad Versace	afb327a985	anv: Structify a one-member union anv_descriptor contained a union with one member.	2016-02-01 12:18:10 -08:00
Kristian Høgsberg Kristensen	dc5fdcd6b7	anv: Advertise robustBufferAccess The GPU does most of this for us as long as we set up tight bounds for the buffers, which we do. Additionally, we range check dynamic buffers in the shader. With that it's safe to turn on robustBufferAccess.	2016-02-01 12:00:05 -08:00
Chad Versace	ffbc32f8d9	anv/meta: Strip trailing whitespace	2016-02-01 10:51:01 -08:00
Chad Versace	aa5e257860	anv: Update MSAA status in README	2016-02-01 10:46:24 -08:00
Matt Turner	75c9def8ee	i965/gen7+: Use NIR for lowering of pack/unpack opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	f4952421cd	i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	955d052058	nir: Add lowering support for unpacking opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9b8786eba9	nir: Add lowering support for packing opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	1dc312e295	i965/fs: Implement support for extract_word. The vec4 backend will lower it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	68f8c5730b	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	8709dc0713	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	1a53a4fc7a	i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 scalarizing. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9ce901058f	nir: Add lowering of nir_op_unpack_half_2x16. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	e4278a847e	i965: Make separate nir_options for scalar/vector stages. We'll want to have different lowering options set for scalar/vector stages. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	252d497d4c	i965: Move brw_compiler_create() to new brw_compiler.c. A future patch will want to use designated initializers, which aren't available in C++, but this is C. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	140a886c41	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Jason Ekstrand	a88b1eeb13	Update the README	2016-02-01 06:10:51 -08:00
Jason Ekstrand	ea63663a72	wsi/x11: Remove B8G8R8_UNORM We don't actually support that format yet because ISL doesn't have an enum for it. We need to beef up the formats table to allow for tiled-only formats.	2016-02-01 06:00:50 -08:00
Marta Lofstedt	77a60ab5dc	mesa: enable enums for OES_geometry_shader Enable GL_OES_geometry_shader enums for OpenGL ES 3.1. V4: EXTRA tokens updated according to comments from Ilia Mirkin. V5: Account for the fact that check_extra does not evaluate "or" lazily. Fix issues with EXTRA_EXT_FB_NO_ATTACH_CS. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-01 09:30:50 +01:00
François Tigeot	a48afb92ff	gallium: Add DragonFly support Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-31 11:56:09 +00:00
Jordan Justen	f96a6c65a3	anv/gen7: Rename gen7_batch_lr* to emit_lr* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 15:06:03 -08:00
Jordan Justen	b207a6b5aa	anv/gen7: Set BypassGatewayControl in MEDIA_VFE_STATE Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 15:06:03 -08:00
Ilia Mirkin	7f19e29305	nv50/ir: get rid of memory stores with nop values This happens especially with exports and varying packing, where the last bits aren't always filled in. We end up trying to do quad-wide stores, which ends up being a lot of register moves that carefully preserve the nop value. Instead don't do the stores. total instructions in shared programs : 6131375 -> 6125267 (-0.10%) total gprs used in shared programs : 910139 -> 895501 (-1.61%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst helped 0 7442 4693 hurt 0 90 2687 Most of the helped/hurt instruction changes are by one or two ops because can no longer do quad-wide stores in all cases. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-30 17:18:41 -05:00
Ilia Mirkin	3ca941d60e	nv50/ir: fix false global CSE on instructions with multiple defs If an instruction has multiple defs, we have to do a lot more checks to make sure that we can move it forward. Among other things, various code likes to do a, b = tex() if () c = a else c = b which means that a single phi node will have results pointing at the same instruction. We obviously can't propagate the tex in this case, but properly accounting for this situation is tricky. Just don't try for instructions with multiple defs. This fixes about 20 shaders in shader-db, including the dolphin efb2ram shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-30 17:18:41 -05:00
Ilia Mirkin	3ca2001b53	nv50,nvc0: fix buffer clearing to respect engine alignment requirements It appears that the nvidia render engine is quite picky when it comes to linear surfaces. It doesn't like non-256-byte aligned offsets, and apparently doesn't even do non-256-byte strides. This makes arb_clear_buffer_object-unaligned pass on both nv50 and nvc0. As a side-effect this also allows RGB32 clears to work via GPU data upload instead of synchronizing the buffer to the CPU (nvc0 only). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> # tested on GF108, GT215 Tested-by: Nick Sarnie <commendsarnex@gmail.com> # GK208 Cc: mesa-stable@lists.freedesktop.org	2016-01-30 16:01:41 -05:00
Jordan Justen	2d8726a4b7	anv/genX_pipeline: Remove unnecessary #include files Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:30:54 -08:00
Rob Clark	f15447e7c9	freedreno/ir3: ignore clip-vertex varying Since we emulate clip-planes, the clip-vertex is used within the VS itself (thanks to nir_lower_clip). So just ignore it as a VS output. Fixes a boatload of piglit tests that were asserting on unknown varying slot. (Also unrelated spelling/typo fix.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:29:21 -05:00
Rob Clark	f20cf22b54	freedreno/ir3: don't ignore local vars With glsl_to_nir we end up with local variables, instead of global, for arrays. Note that we'll eventually have to do something more clever, I think, when we support multiple functions, but that will probably take some work in a few places. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:27:57 -05:00
Rob Clark	8039a2a6b3	freedreno/ir3: handle tex instrs w/ const offset Something we start to see with glsl_to_nir. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:27:27 -05:00
Rob Clark	f212d7dc50	freedreno/ir3: support load_front_face intrinsic With tgsi_to_nir we get this as a normal input with VARYING_SLOT_FACE. But glsl_to_nir plus nir_lower_system_values this becomes an intrinsic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:11:54 -05:00
Jordan Justen	8e48ff3ad6	anv/gen7: Set SLM size in interface descriptor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:10:54 -08:00
Rob Clark	9e05e8cb75	freedreno: limit string marker to max packet size Experimentally derived max size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:10:13 -05:00
Jordan Justen	ab0d8608d2	anv: Support MEDIA_VFE_STATE for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:08:34 -08:00
Jordan Justen	dd2effb0e7	anv/gen7: Subtract 1 from num_elements when setting up buffer surface state `e8f51fe4` for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	4bb1e7937a	anv/gen7: Disable fs dispatch for depth/stencil only pipelines `292031a` for gen7 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	f5b3a2fe32	anv/gen7: Add support for gl_NumWorkGroups Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	7e46cc8603	anv/gen7/compute: Setup push constants and local ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 09:00:00 -08:00
Jordan Justen	b1158ced45	anv/genX: Add genX_pipeline.c for compute_pipeline_create Adds initial compute_pipeline_create implementation for gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-30 08:58:11 -08:00
Jason Ekstrand	1a442a7923	Merge branch 'vulkan' into 'vulkan' Vulkan WSI Wayland fixes Two small fixes to make mailbox mode actually work again. See merge request !4	2016-01-30 10:28:12 -05:00
Jason Ekstrand	c668dc9f75	anv/pass: Initialize has_resolve	2016-01-30 07:16:33 -08:00
Jason Ekstrand	ad813b072a	anv/wsi: Set the platform field of VkIcdSurfaceBase	2016-01-30 07:05:53 -08:00
Jason Ekstrand	5acc4e2ebf	anv/wsi/x11: Actually pull information from the window's visual	2016-01-30 03:51:47 -08:00
Jason Ekstrand	66e8b5cf2b	anv/wsi/x11: Actually check for DRI3	2016-01-30 03:50:31 -08:00
Jason Ekstrand	44ec860cd6	anv/WSI: Support more usage bits They're just images and we have no intention of stomping alpha channels (at least not yet), so there's no reason why you can't sample.	2016-01-29 20:52:44 -08:00
Jason Ekstrand	337c1e0871	anv/formats: Add more compressed formats This adds support for the DX compression formats. Given that ETC and EAC are working fine, these should be ok too.	2016-01-29 20:46:31 -08:00
Jason Ekstrand	c688e4db11	anv/wsi: Rework to be compatible with the loader	2016-01-29 20:39:21 -08:00
Jason Ekstrand	d4953fb340	vulkan: Import vk_icd.h	2016-01-29 20:37:45 -08:00
Jason Ekstrand	a19ceee46c	anv/device: Fix version check The bottom-end check was wrong so it was only working on <= 1.0.0. Oops.	2016-01-29 20:36:58 -08:00
Ilia Mirkin	438d421f8b	nvc0: avoid crashing when there are holes in vertex array bindings When using the "shared" vertex array configuration strategy, we bind each of the buffers as a separate array. However there can be holes in such vertex buffer lists, so just emit a disable for those. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-29 22:10:42 -05:00
Ilia Mirkin	899b1b98a4	nvc0: enable atomic counters and ssbo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 22:10:42 -05:00
Ilia Mirkin	48cf392c0e	nv50/ir: handle new TGSI MEMBAR opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	df043f0764	nvc0/ir: fix atomic compare-and-swap arguments Teach the emitter that the two registers are sequential, and drop the second arg entirely, in favor of a double-wide first argument. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	7b9a77b905	nv50/ir: add support for indirect buffer loading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	2c4eeb0b5c	nv50/ir: add SUQ op by reading the info from driver constbuf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:47 -05:00
Ilia Mirkin	c3083c7082	nv50/ir: add support for BUFFER accesses This largely leaves the existing image logic alone. When image support is added this will have to be harmonized somehow. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:47 -05:00
Ilia Mirkin	abe427ebd2	nvc0: handle shader buffer memory barrier Issue a MEM_BARRIER. No idea if this is sufficient. As there are no tests for this, it'll have to do for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:38 -05:00
Ilia Mirkin	fe01be4ad5	nvc0: add state management for shader buffers (address, length) pairs are uploaded to the driver constbuf as well to make these values available to the shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:06:07 -05:00
Ilia Mirkin	b4688c4615	nvc0: double per-shader stage driver constants area We need to store a lot more info now with per-buffer address/size. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:06:06 -05:00
Ilia Mirkin	ae725d5746	trace: add support for set_shader_buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add arg_begin/arg_end around buffer array Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	fea25db925	st/mesa: enable ARB_shader_storage_buffer_object when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	6fb8fac853	st/mesa: add shader buffer barrier bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	792bab24ac	st/mesa: add support for memory barrier intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) v1 -> v2: use TGSI_MEMBAR defines	2016-01-29 21:05:47 -05:00
Ilia Mirkin	c0e1c54a4f	st/mesa: use RESQ to find buffer size Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	6880036694	st/mesa: add support for SSBO binding and GLSL intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> v1 -> v2: some 80 char reformatting	2016-01-29 21:05:46 -05:00
Ilia Mirkin	9d6f9ccf6b	st/mesa: add atomic counter support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:46 -05:00
Ilia Mirkin	0fddb677e6	mesa: add PROGRAM_IMMEDIATE, PROGRAM_BUFFER This makes PROGRAM_IMMEDIATE a first-class gl_register_file type, and adds PROGRAM_BUFFER to the list. These are used purely inside glsl_to_tgsi conversion. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:35 -05:00
Ilia Mirkin	35f8488668	glsl: keep track of ssbo variable being accessed, add access params Currently any access params (coherent/volatile/restrict) are being lost when lowering to the ssbo load/store intrinsics. Keep track of the variable being used, and bake its access params in as the last arg of the load/store intrinsics. If the variable is accessed via an instance block, then 'variable' points to the instance block variable and not the field inside the instance block that we are accessing. In order to check access parameters for the field itself we need to detect this case and keep track of the corresponding field struct so we can extract the specific field access information from there instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add tracking of struct field v2 -> v3: minor adjustments based on Iago's feedback Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-29 21:05:08 -05:00
Ilia Mirkin	2b089c7ffe	glsl: always initialize image_* fields, copy them on interface init Interfaces can have image properties set in case they are buffer interfaces. Make sure not to lose this information. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:04:56 -05:00
Ilia Mirkin	2ccc42fd2c	tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add defines for the various bits Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-29 21:04:36 -05:00
Kristian Høgsberg Kristensen	f28645f71c	anv: Don't disable snooping for mempools There's an intermittent flushing problem with VkEvent that we need to root cause. For now, using the snooping feature keeps the memory pools up to date with GPU writes and fixes the problem.	2016-01-29 17:19:51 -08:00
Kristian Høgsberg Kristensen	0c4ef36360	anv: clflush is only ordered against mfence We can't use the more fine-grained load and store fence commands (lfence and sfence), since clflush is only guaranteed to be ordered with respect to mfence.	2016-01-29 14:56:41 -08:00
Kristian Høgsberg Kristensen	31d3486bd2	anv: Limit flushing to the range of mapped memory	2016-01-29 14:56:41 -08:00
Ben Widawsky	89ec36f221	anv/cmd_buffer: Emit gen9 style SF state for CHV The state for line width changes on Cherryview to use the GEN9 bits (for extra precision).	2016-01-29 14:12:32 -08:00
Ben Widawsky	31508bd0ce	anv/gen8: Extract SF state For upcoming patch to address difference in Cherryview.	2016-01-29 14:11:53 -08:00
Michel Dänzer	30fcf241e1	winsys/amdgpu: Process RADEON_FLAG_* independently from RADEON_DOMAIN_* In particular, AMDGPU_GEM_CREATE_CPU_GTT_USWC can affect even BOs created in VRAM if they get evicted to GTT. In general there's no need to restrict any of the flags to any particular domains. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-29 16:06:06 +09:00
Michel Dänzer	62f837e2ea	winsys/amdgpu: Handle RADEON_FLAG_NO_CPU_ACCESS Failing to do this was resulting in the kernel driver unnecessarily leaving open the possibility of CPU access to tiled BOs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93862 (This change shouldn't be backported to stable branches, because released versions of xf86-video-amdgpu unnecessarily try to map the front buffer) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-29 16:06:06 +09:00
Karol Herbst	29d09f8747	nv50/ir: optimize mad/fma with third argument 0 to mul Very modest effect, but it's clearly the right thing to do. total instructions in shared programs : 6131491 -> 6131398 (-0.00%) total gprs used in shared programs : 910157 -> 910131 (-0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst bytes helped 0 55 85 85 hurt 0 26 20 20 Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:59:41 -05:00
Karol Herbst	3aa681449e	nv50/ir: run DCE backwards Reduces calls up to 50% Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:34:29 -05:00
Karol Herbst	978ae28ca2	nv50/ir: optimize shl(shr(a, c), c) to and(a, ~((1 << c) - 1)) Following shader-db results on GK110: total instructions in shared programs : 6141510 -> 6131491 (-0.16%) total gprs used in shared programs : 910187 -> 910157 (-0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst bytes helped 0 18 821 821 hurt 0 0 0 0 Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:34:22 -05:00
Chad Versace	f8a4abcd15	anv: Do resolves at end of subpass	2016-01-28 10:49:50 -08:00
Chad Versace	bef8456ede	anv/meta: Remove unneeded resolve pipeline Vulkan does not allow resolving a single-sample image. So remove that pipeline from anv_meta_state::resolve::pipelines.	2016-01-28 10:45:11 -08:00
Chad Versace	ac5594fa71	anv/meta_resolve: Remove redundant initialization params	2016-01-28 10:14:39 -08:00
Chad Versace	142da00486	anv: Drop const on anv_framebuffer::attachments The attachments should be const, but the driver's function signatures are generally not const-friendly. Drop the const because it conflicts with upcoming anv_cmd_buffer_resolve_subpass().	2016-01-28 10:03:00 -08:00
Chad Versace	22258e279d	anv: Add anv_subpass::has_resolve Indicates that the subpass has at least one resolve attachment.	2016-01-28 10:03:00 -08:00
Chad Versace	3d863e8dad	anv/meta_resolve: Save/Restore viewport and scissor	2016-01-28 10:03:00 -08:00
Chad Versace	8487569fa7	anv/meta_resolve: Begin pass outside emit_resolve() This refactor is preparation for handling subpass resolve attachments.	2016-01-28 10:03:00 -08:00
Chad Versace	2bab3cd681	anv/image: Update usage flags for multisample images Meta resolves multisample images by binding them as textures. Therefore we must add VK_IMAGE_USAGE_SAMPLED_BIT.	2016-01-28 10:03:00 -08:00
Ilia Mirkin	089f605439	glsl: disallow implicit conversions in ESSL shaders Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-28 11:31:19 -05:00
Axel Davy	dda7a84986	radeonsi: Add option for SI scheduler Add a debug option to select the LLVM SI Machine Scheduler. R600_DEBUG=sisched Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-28 17:22:44 +01:00
Jason Ekstrand	608b411e9f	anv/device: Add a better version check. We now check that the requested version is precisely within the range of versions that we support.	2016-01-28 08:19:40 -08:00
Samuel Iglesias Gonsálvez	f9c43dd22f	glsl: double-precision values don't support interpolation ARB_gpu_shader_fp64 spec says: "This extension does not support interpolation of double-precision values; doubles used as fragment shader inputs must be qualified as "flat"." Fixes the regressions added by commit `781d278`: arb_gpu_shader_fp64-double-gettransformfeedbackvarying arb_gpu_shader_fp64-tf-interleaved arb_gpu_shader_fp64-tf-interleaved-aligned arb_gpu_shader_fp64-tf-separate Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93878 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-28 11:35:03 +01:00
Jason Ekstrand	6286a74f6b	anv/device: Advertise 1.0.2	2016-01-27 22:02:51 -08:00
Jason Ekstrand	ec80d6388a	anv/formats: Properly set FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT This was added last minute and the API bumped to 1.0.2.	2016-01-27 22:02:06 -08:00
Jason Ekstrand	ac75746448	vulkan.h: Update to 1.0.2	2016-01-27 21:59:00 -08:00
Jason Ekstrand	c64bc5463d	anv/device: Improve the api version check to allow 1.0.X	2016-01-27 21:56:46 -08:00
Eric Anholt	3fba517bdd	vc4: Throttle outstanding rendering after submission. Just make sure that after we've submitted, we get to at least 5 (global) submits ago before we go on to do more. Prevents up to seconds of lag with window movement in X with xcompmgr -c. There may be useful tuning to do in the future, but for now this gets us usability. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-01-27 20:05:37 -08:00
Eric Anholt	2a449ce7c9	vc4: Don't record the seqno of a failed job submit. On an error return, the returned seqno will probably be unset, so we'd lose track of what we've submitted so far for waiting on in the future. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-01-27 20:05:37 -08:00
Francisco Jerez	4604b2871a	vtn: Improve accuracy of acos approximation. The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood.	2016-01-27 19:55:21 -08:00
Jason Ekstrand	7fb35a8228	An alternate arccosine implementation	2016-01-27 19:55:21 -08:00
Jason Ekstrand	983db2b804	anv/meta_resolve: Fix a bug in the meta pipeline destroy path	2016-01-27 19:48:43 -08:00
Chad Versace	9b240a1e3d	anv/skl: Fix crash in 16x multisampling We built meta clear and resolve pipelines for only up to 8x samples. There were no 16x pipelines.	2016-01-27 18:38:15 -08:00
Chad Versace	61d3d49820	anv: Fix comment for anv_meta_state arrays Array element i is for 2^i samples, not log2(i) samples.	2016-01-27 18:32:05 -08:00
Ben Widawsky	0e06f76a84	i965/skl: Utilize new 5th bit for gateway messages Modify comment as spotted by Matt, and Chris Forbes Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-27 17:12:56 -08:00
Ben Widawsky	2af3281fee	anv/push constants: Use constant buffer #2 SKL has a workaround which requires either some weird programming of buffer 3, OR, just never using buffer 0. Since we don't actually use multiple constant buffers, it's easier to just not use 0. Only SKL requires this workaround, but there is no harm in applying it to all platforms. The big change here is that buffer #0 is relative to dynamic state base normally (depending upon ISTPM), where buffer 1-3 is a GPU virtual address.	2016-01-27 17:09:36 -08:00
Chad Versace	5d4f3298ae	anv/meta: Implement multisample clears	2016-01-27 17:01:59 -08:00
Chad Versace	eb6fb65fd1	anv/meta: Simplify failure handling during clear init Remove all the fine-grained cleanup in anv_device_init_meta_clear_state(). Instead, if anything fails during initialization, simply call anv_device_finish_meta_clear_state() and let it clean up the partially initialized anv_meta_state::clear.	2016-01-27 17:01:56 -08:00
Chad Versace	4085f1f230	anv/meta: Implement vkCmdResolveImage This handles multisample color images that have a floating-point or normalized format and have a single array layer. This does not yet handle integer formats nor multisample array images.	2016-01-27 16:55:59 -08:00
Chad Versace	8cc1e59d61	anv/meta: Add func anv_meta_get_iview_layer() This function is just meta_blit_get_dest_view_base_array_slice(), but moved to the shared header anv_meta.h. Will be needed by anv_meta_resolve.c.	2016-01-27 16:52:30 -08:00
Chad Versace	8cc6f058ce	anv/gen8: Begin enabling pipeline multisample state As far as I can tell, this patch sets all pipeline multisample state except: - alpha to coverage - alpha to one - the dispatch count for per-sample dispatch	2016-01-27 16:52:27 -08:00
Chad Versace	57e4a5ea99	anv/gen8: Set multisample surface state	2016-01-27 16:48:20 -08:00
Chad Versace	9b3d660878	anv/meta: Merge anv_meta_clear.h into anv_meta.h The header was too small.	2016-01-27 16:48:20 -08:00
Kenneth Graunke	32e4c5ae30	vtn: Make tanh implementation even stupider The dEQP "precision" test tries to verify that the reference functions float sinh(float a) { return ((exp(a) - exp(-a)) / 2); } float cosh(float a) { return ((exp(a) + exp(-a)) / 2); } float tanh(float a) { return (sinh(a) / cosh(a)); } produce the same values as the built-ins. We simplified away the multiplication by 0.5 in the numerator and denominator, and apparently this causes them not to match for exactly 1 out of 13,632 values. So, put it back in, fixing the test, but making our code generation (and precision?) worse.	2016-01-27 15:34:50 -08:00
Jason Ekstrand	8f0ef9bbeb	nir/opt_algebraic: Use a more elementary mechanism for lowering ldexp	2016-01-27 15:21:28 -08:00
Jason Ekstrand	f7d6b8ccfe	gen8/state: Fix QPitch for compressed textures on Broadwell	2016-01-27 15:12:42 -08:00
Jason Ekstrand	162c662585	anv/image: Use the entire image height for compressed meta blits	2016-01-27 15:12:42 -08:00
Nanley Chery	235abfb7e6	anv/image: Enlarge the image level 0 extent The extent previously was supposed to match the mip at a given level under the assumption that the base address would be that of the mip as well. Now however, the base address only matches the offset of the containing tile. Therefore, enlarge the extent to match that of phys_slice0, so that we don't draw/fetch in out of bounds territory. This solution isn't perfect because the base address isn't always at the first tile, therefore the assumed valid memory region by the HW contains some number of invalid tiles on two edges.	2016-01-27 15:12:42 -08:00
Jason Ekstrand	96cf5cfee1	anv/image: Minify before dividing by block dimensions	2016-01-27 15:12:42 -08:00
Jason Ekstrand	1bea1eff38	anv/meta: Don't double-call choose_buffer_format This fixes all the renderpass tests	2016-01-27 15:12:42 -08:00
Nanley Chery	dd22b5c914	anv/meta: Modify make_image_for_buffer()'s image Always use a valid buffer format and convert the extent to units of elements with respect to original image format.	2016-01-27 15:12:42 -08:00
Nanley Chery	d3c1fd53e2	anv/image: Use custom VkBufferImageCopy for iview initialization Use a custom VkBufferImageCopy with the user-provided struct as the base. A few fields are modified when the iview is uncompressed and the underlying image is compressed.	2016-01-27 15:12:42 -08:00
Nanley Chery	6a579ded87	anv: Add offset parameter to anv_image_view_init() This is the offset of the tile that contains the mip specified by isl_surf_get_image_intratile_offset_el(). Used to draw to/from the specified mip.	2016-01-27 15:12:42 -08:00
Nanley Chery	4a0075feeb	anv/meta: Calculate mip offset for compressed texture This value will be used in a later commit.	2016-01-27 15:12:42 -08:00
Nanley Chery	1c87cb51be	anv/meta: Disambiguate slice variable value This will simplify the usage of isl_surf_get_image_intratile_offset_el().	2016-01-27 15:12:42 -08:00
Nanley Chery	8c0c25abde	gen8_state: use iview extent to program RENDER_SURFACE_STATE When creating an uncompressed ImageView on a compressed Image, the SurfaceFormat is updated to match the ImageView's. The surface dimensions must also change so that the HW sees the same size image instead of a 4x larger one. Fixes the following error which results from running many VulkanCTS compressed tests in one shot: ResourceError (vk.queueSubmit(queue, 1, &submitInfo, *m_fence): VK_ERROR_OUT_OF_DEVICE_MEMORY at vktPipelineImageSamplingInstance.cpp:921) Makes all compressed format tests with a height > 1 pass.	2016-01-27 15:12:42 -08:00
Nanley Chery	3f01bbe7f3	anv/image: Scale iview extent by backing image Aligns with formulas presented in Vulkan spec concerning CopyBufferToImage. 18.4 Copying Data Between Buffers and Images This won't conflict with valid API usage, because: 1) Users are not allowed to create an uncompressed ImageView with a compressed Image. see: VkSpec - 11.5 Image Views - VkImageViewCreateInfo's Valid Usage box 2) If users create a differently formatted compressed ImageView with a compressed Image, the block dimensions will still match. see: VkSpec - 28.3.1.5 Format Compatibility Classes - Table 28.5	2016-01-27 15:12:42 -08:00
Nanley Chery	010ab34839	anv/meta: Set depth to 0 for buffer image in CopyBufferToImage() The buffer image is a flat 2D surface. Each surface represents an array/depth layer, therefore, the Z-offset is 0 when blitting.	2016-01-27 15:12:42 -08:00
Nanley Chery	2fb8b859f6	anv/meta: Use the uncompressed rectangle when blitting For an uncompressed ImageView of a compressed Image, the dimensions and offsets are all divided by the appropriate block dimensions. We are not yet using an uncompressed ImageView for a compressed Image, but will do so in a future commit.	2016-01-27 15:12:42 -08:00
Nanley Chery	c3546685ed	i965: Update the surface_format table for ETC formats Enable ETC support for BDW+. In Vulkan, an array lookup on surface_format[] is used to determine HW support for certain formats. In contrast, Mesa dynamically populates an array which reports this information.	2016-01-27 15:12:42 -08:00
Nanley Chery	308ec0279b	anv/image: Update usages of isl_surf_get_image_offset_sa	2016-01-27 15:12:42 -08:00
Nanley Chery	02629a16d1	isl: Add logical z offset to GEN4_2D surfaces 3D surfaces in Skylake are stored with ISL_DIM_LAYOUT_GEN4_2D. Any delta in the logical z offset causes an equivalent delta in the surface's array layer.	2016-01-27 15:12:42 -08:00
Chad Versace	a6ecfe1dd3	isl/tests: Add some tests for intratile offsets Test isl_surf_get_image_intratile_offset_el() in the tests: test_bdw_2d_r8g8b8a8_unorm_512x512_array01_samples01_noaux_tiley0 test_bdw_2d_r8g8b8a8_unorm_1024x1024_array06_samples01_noaux_tiley0	2016-01-27 15:12:42 -08:00
Chad Versace	7ab0d2e2c0	isl: Add func isl_get_intratile_image_offset_el()	2016-01-27 15:12:42 -08:00
Chad Versace	18a83eaa8c	isl/tests: Rename t_assert_offset() Rename it to t_assert_offset_el(), clarifying that the offset in units of surface elements, not samples.	2016-01-27 15:12:42 -08:00
Chad Versace	fa08f95ff5	isl: Add func isl_surf_get_image_offset_el() This replaces function isl_surf_get_image_offset_sa()	2016-01-27 15:12:42 -08:00
Chad Versace	ea44d31528	isl: Fix row pitch for compressed formats When calculating row pitch, the row's width in samples must be divided by the format's block width. The commit below accidentally removed the division. commit `eea2d4d059` Author: Chad Versace <chad.versace@intel.com> Date: Tue Jan 5 14:28:28 2016 -0800 Subject: isl: Don't align phys_slice0_sa.width twice	2016-01-27 15:12:42 -08:00
Chad Versace	45ecfcd637	isl: Add func isl_surf_get_tile_info()	2016-01-27 15:12:42 -08:00
Kenneth Graunke	9f954310e8	vtn: Fix atan2 for non-scalars. The if/then/else block was bogus, as it can only take a scalar condition, and we need to select component-wise. The GLSL IR implementation of atan2 handles this by looping over components, but I decided to try and do it vector-wise, and messed up. For now, just bcsel. It means that we do the atan1 math even if all components hit the quick case, but it works, and presumably at least one component will hit the expensive path anyway.	2016-01-27 15:07:42 -08:00
Kenneth Graunke	f92a35d831	vtn: Fix Modf. We were botching this for negative numbers - floor of a negative rounds the wrong way. Additionally, both results are supposed to retain the sign of the original. To fix this, just take the abs of both values, then put the sign back. There's probably a better way to do this, but this works for now.	2016-01-27 14:21:08 -08:00
Kenneth Graunke	4acfc9effb	i965: Fix SIN/COS precision problems. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-27 13:56:54 -08:00
Ilia Mirkin	34c2c7c61e	glsl: only expose double mod when doubles are available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-27 15:15:10 -05:00
Kristian Høgsberg Kristensen	b833e7a63c	anv: Put back code to grow shader scratch space This was lost in commit `a71e614d33`.	2016-01-27 11:36:56 -08:00
Kenneth Graunke	38a3a535eb	anv: Update the device limits. Fixes dEQP-VK.api.info.device.properties. I haven't tested any others.	2016-01-26 23:09:45 -08:00
Jason Ekstrand	d3607351fe	gen7/cmd_buffer: SCISSOR_RECT structs are tightly packed The pointer has to be 32-byte aligned, but the structs themselves are 2 dwords each, tightly packed.	2016-01-26 22:10:14 -08:00
Jason Ekstrand	f2f03c5b65	anv/pipeline: Set MaximumVPIndex in 3DSTATE_CLIP	2016-01-26 21:52:59 -08:00
Jason Ekstrand	dc3de6f8df	anv/pipeline: Only lower input indirects if EmitNoIndirectInput is set	2016-01-26 21:45:21 -08:00
Jason Ekstrand	9ac624751e	anv/formats: Use is_power_of_two instead of is_rgb to determine renderability	2016-01-26 20:29:16 -08:00
Jason Ekstrand	2af3acd061	HACK/i965/surface_formats: Mark A4B4G4R4 as being supported The table has this marked as unsupported on all gens, but I don't really believe that given how early it is in the table. I've tested and it seems to work on Broadwell. The Bspec says that it should be renderable on SKL+ but alpha blending is questionable. Side note: We really need to audit the format table again.	2016-01-26 20:29:16 -08:00
Jordan Justen	c20f78dc5d	anv: Support swizzled formats. Some formats require a swizzle in order to map them to actual hardware formats. This allows us to turn on two new Vulkan formats.	2016-01-26 20:29:16 -08:00
Jason Ekstrand	9bc72a9213	anv/image: Do swizzle remapping in anv_image.c TODO: At some point, we really need to make an image_view_init_info that's a flyweight and stop stuffing everything into image_view.	2016-01-26 20:23:59 -08:00
Jason Ekstrand	7d84fe9b1f	HACK: Expose support for stencil blits If someone actually tries to use them, they won't work, but at least we don't fail to return format properties now.	2016-01-26 17:29:49 -08:00
Kenneth Graunke	32dcfc953e	vtn: Delete references to IMix opcode. This is being removed in SPIR-V. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15452	2016-01-26 17:02:35 -08:00
Ben Widawsky	c5dc6cdf26	i965/skl: Utilize new 5th bit for gateway messages Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-01-26 15:44:48 -08:00
Jason Ekstrand	a1ea45b857	genX/pipeline: Don't make vertex bindings with holes	2016-01-26 15:44:18 -08:00
Jason Ekstrand	7ef0d39cb2	anv/cmd_buffer: Put base_instance in the second component	2016-01-26 15:44:02 -08:00
Francisco Jerez	6840cc1513	anv/image: clflush surface state map in anv_fill_buffer_surface_state(). Some of its users had the required clflush on non-LLC platforms, some didn't. Put the clflush in anv_fill_buffer_surface_state() so we don't forget.	2016-01-26 15:14:50 -08:00
Francisco Jerez	fc7a7b31c5	anv/image: clflush the right state map in anv_fill_image_surface_state(). It was clflushing the nonrt_surface_state structure regardless of which state structure was actually being initialized.	2016-01-26 15:14:50 -08:00
Francisco Jerez	a50dc70e21	anv/image: Upload raw buffer surface state for untyped storage image and texel buffer access.	2016-01-26 15:14:50 -08:00
Francisco Jerez	d2ec510dda	anv/image: Fix image parameter initialization.	2016-01-26 15:14:50 -08:00
Francisco Jerez	d9e0b9a06a	isl/gen9: Fix slice offset calculation for 1D array images. The X component of the offset is set to the layer index times layer height which is obviously bogus, return the vertical offset of the slice as Y component instead. Fixes a few image load/store tests that use 1D arrays on SKL when forcing it to fall back to untyped reads and writes.	2016-01-26 15:14:50 -08:00
Jason Ekstrand	cc065e0ad7	i965/fs_surface_builder: Mask signed integers after conversion	2016-01-26 15:14:50 -08:00
Jason Ekstrand	ba393c9d81	anv/image: Actually fill out brw_image_param structs	2016-01-26 15:14:50 -08:00
Jason Ekstrand	aa9987a395	anv/image_view: Add base mip and base layer fields These will be needed by image_load_store	2016-01-26 15:14:50 -08:00
Jason Ekstrand	42cd994177	gen7: Add support for base vertex/instance	2016-01-26 14:56:37 -08:00
Jason Ekstrand	4bf3cadb66	gen8: Add support for base vertex/instance	2016-01-26 14:56:37 -08:00
Jason Ekstrand	6ba67795db	nir/spirv: Add proper support for InstanceIndex	2016-01-26 14:56:37 -08:00
Jason Ekstrand	1c3b7fe1ee	nir/lower_io: Lower INSTANCE_INDEX	2016-01-26 14:56:37 -08:00
Jason Ekstrand	b2b7c93318	glsl/enums: Add an enum for Vulkan instance index	2016-01-26 14:56:37 -08:00
Jason Ekstrand	da75492879	genX/pipeline: Break emit_vertex_input out into common code It's mostly the same and contains some non-trivial logic, so it really should be shared. Also, we're about to make some modifications here that we would really like to share.	2016-01-26 14:56:37 -08:00
Karol Herbst	19ae5de981	nv50/ir: fix memory corruption when spilling and redoing RA When RA fails, and we spill, we have to clean everything up before doing RA again. We were forgetting to reset the hi/lo linked lists - at least the hi list is guaranteed to still have pointers to now-deleted RIG nodes. Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-26 17:55:06 -05:00
Kristian Høgsberg Kristensen	fe6ccb6031	anv: Remove long unused anv_aub.h	2016-01-26 14:53:00 -08:00
Kristian Høgsberg Kristensen	074a7c7d7c	anv: Dirty fragment shader descriptors in meta restore We need to reemit render targets, so dirtying VK_SHADER_STAGE_VERTEX_BIT doesn't help us much.	2016-01-26 14:44:02 -08:00
Kristian Høgsberg Kristensen	725d969753	anv: Reemit STATE_BASE_ADDRESS after second level cmd buffers Otherwise the primary batch will continue using the state base addresses set by the secondary. Fixes remaining renderpass tests.	2016-01-26 14:44:02 -08:00
Timothy Arceri	d580a979a4	glsl: remove old FINISHME This should have been removed long ago. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-27 09:15:21 +11:00
Chad Versace	df5f6d824b	anv/meta: Fix sample mask in clear pipelines Once we begin emitting the correct sample mask, genX_3DSTATE_SAMPLE_MASK_pack will hit an assertion if the mask contains too many bits.	2016-01-26 11:04:44 -08:00
Marek Olšák	98cebc913c	configure.ac: don't require EGL/DRM and GBM if OpenGL is disabled This allows building VDPAU/OMX/VA drivers without OpenGL and its dependencies. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-01-26 19:07:03 +01:00
Jan Vesely	efc4142acd	r600,compute: Plug few memory leaks v2: drop inline keyword drop radeon_llvm_dispose_kernel_module wrapper v3: move definitions to .c file use in radeonsi Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 19:04:38 +01:00
Jan Vesely	e1dcd333e4	r600: Typos and whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-26 19:01:22 +01:00
Marek Olšák	2924ca131f	radeonsi: fix clover crash caused by `ce1e7784d0` Trivial.	2016-01-26 18:53:41 +01:00
Marek Olšák	af57507e4f	radeonsi: fix shader precompilation for shader-db The addition of spi_shader_col_format killed all color outputs in precompiled shaders. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) v2: also set the alpha func (trivial)	2016-01-26 18:49:50 +01:00
Ilia Mirkin	38c63abf09	glsl: add GL_OES_geometry_point_size and conditionalize gl_PointSize For now this will be enabled in tandem with GL_OES_geometry_shader. Should a driver come along that wants to separate them out, another enable can be added. Also adds the missed GL_OES_geometry_shader define in glcpp. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-26 12:36:15 -05:00
Emil Velikov	eb63640c1d	glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:33 +00:00
Emil Velikov	a39a8fbbaa	nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:30 +00:00
Emil Velikov	f694da80c7	compiler: move the glsl_types C wrapper alongside their C++ brethren At a later stage we might want to split out the NIR specific [XXX: which one was it], as to make things more obvious and rename the files appropriately. This patch aims to split it out of nir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:27 +00:00
Emil Velikov	24f984f64a	nir: move glsl_types.{cpp,h} to compiler Allows us to remove the SCons workaround :-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:24 +00:00
Emil Velikov	1a882fd2ee	nir: move shader_enums.[ch] to compiler This way one can reuse it in glsl, nir or other infrastructure without pulling nir as dependency. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:20 +00:00
Emil Velikov	2f86383091	compiler: introduce a libcompiler static library Currently it's an empty library, although it'll be used to store common code between GLSL and NIR that is compiler specific (rather than generic as the one in src/util). XXX: strictly speaking we could add a python/mako parser to generate the relevant files instead including builtin_type_macros.h in such a manner. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:07:27 +00:00
Nicolai Hähnle	41875ac4ed	gallium/ddebug: add 'verbose' option This currently just writes out the name of dump files, which can be useful to easily correlate those files with other log outputs (driver debug output, apitrace calls, etc.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:58:55 -05:00
Nicolai Hähnle	f4c8fa4e49	gallium/ddebug: make 'noflush' also affect 'always' mode This changes the default behavior of 'always' mode to be consistent with hang detection mode. I have used this to more easily compare dumped command streams using diff. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:58:49 -05:00
Nicolai Hähnle	8894b5f008	radeonsi: use llvm.amdgcn.s.barrier instead of llvm.AMDGPU.barrier.local The new name for the intrinsic was introduced in LLVM r258558. v2: use ternary operator instead of preprocessor Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:57:06 -05:00
Jason Ekstrand	725fb3623f	i965/compiler: Set nir_options.vertex_id_zero_based	2016-01-25 16:10:28 -08:00
Jason Ekstrand	6b6a8a99f8	HACK/i965: Default to scalar GS on BDW+	2016-01-25 15:52:53 -08:00
Ben Widawsky	a443b5b732	i965/bxt: Fix conservative wm thread counts. When setting the conservative thread counts, I halved everything. That isn't correct for the wm, which has nothing to do with actual thread counts. I suck. BXT only has 1 slice, and there is some ambiguity about subslices, so just reserve the max possible for now. It looks like this might fix: piglit.spec.glsl-1_50.execution.variable-indexing.gs-output-array-vec4-index-wr.bxtm64. I kind of question why that is, but it is what Jenkins says. Mark is currently running some of the other blacklisted tests on this patch. (it affects anything requiring scratch space). Cc: mesa-stable <mesa-stable@lists.freedesktop.org> Cc: Neil Roberts <neil@linux.intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-01-25 15:51:17 -08:00
Jason Ekstrand	e462d4d815	Merge remote-tracking branch 'mattst88/nir-lower-pack-unpack' into vulkan	2016-01-25 15:50:31 -08:00
Jason Ekstrand	6bbf3814dc	gen7/state: Apply min/mag filters individually for samplers This fixes tests which apply different min and mag filters, and depend on the min filter to be correct.	2016-01-25 15:33:08 -08:00
Ben Widawsky	9c69f4632d	gen8/state: Apply min/mag filters individually for samplers This fixes tests which apply different min and mag filters, and depend on the min filter to be correct.	2016-01-25 15:29:18 -08:00
Jason Ekstrand	2434ceabf4	i965/fs: Feel free to spill partial reads/writes Now that we properly handle write-masking, this should be safe.	2016-01-25 15:23:10 -08:00
Jason Ekstrand	9c0109a1f6	i965/fs: Properly write-mask spills For unspills (scratch reads), we can just set WE_all all the time because we always unspill into a new GRF. For spills, we have two options: If the instruction has a 32-bit-per-channel destination and "normal" regioning, then we just do a regular write and it will interleave channels from different control-flow paths properly. If, on the other hand, the regioning is non-normal, then we have to unspill, run the instruction, and spill afterwards. In this second case, we need to do the spill with WE_all.	2016-01-25 15:23:10 -08:00
Kristian Høgsberg Kristensen	8e07f7942e	anv: Remove a few finished finishme	2016-01-25 15:16:13 -08:00
Kristian Høgsberg Kristensen	76c096f0e7	anv: Remove stale assert This goes back to when we didn't have the subpass number in the command buffer begin info.	2016-01-25 15:15:59 -08:00
Matt Turner	874ede4983	i965/gen7+: Use NIR for lowering of pack/unpack opcodes.	2016-01-25 14:48:34 -08:00
Matt Turner	5deba3f00a	i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need.	2016-01-25 14:24:07 -08:00
Matt Turner	8bb22dc351	nir: Add lowering support for unpacking opcodes.	2016-01-25 14:24:07 -08:00
Matt Turner	d7781038f5	nir: Add lowering support for packing opcodes.	2016-01-25 14:24:07 -08:00
Matt Turner	6c1b3bc950	i965/fs: Implement support for extract_word. The vec4 backend will lower it.	2016-01-25 14:24:07 -08:00
Matt Turner	26f0444ead	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend.	2016-01-25 14:24:07 -08:00
Nanley Chery	2c94f659e8	anv/meta: Fix CopyBuffer when size matches HW limit Perform a copy when the copy_size matches the HW limit (max_copy_size). Otherwise the current behavior is that we fail the following assertion: assert(height < max_surface_dim); because the values are equal.	2016-01-25 12:26:39 -08:00
Kristian Høgsberg Kristensen	c21de2bf04	anv: Don't use uninitialized barycentric_interp_modes If we don't have a fragment shader, wm_prog_data is undefined.	2016-01-25 11:34:32 -08:00
Kristian Høgsberg Kristensen	292031a1a5	anv: Disable fs dispatch for depth/stencil only pipelines Fixes most renderpass bugs.	2016-01-25 11:26:19 -08:00
Matt Turner	26b2cc6f3a	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR.	2016-01-25 11:12:36 -08:00
Matt Turner	24d385f85c	i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 lowering.	2016-01-25 11:11:56 -08:00
Matt Turner	5eb1145434	nir: Add lowering of nir_op_unpack_half_2x16.	2016-01-25 11:11:56 -08:00
Matt Turner	84166aed92	i965: Make separate nir_options for scalar/vector stages. We'll want to have different lowering options set for scalar/vector stages.	2016-01-25 11:11:26 -08:00
Matt Turner	b6bb3b9bcd	i965: Move brw_compiler_create() to new brw_compiler.c. A future patch will want to use designated initializers, which aren't available in C++, but this is C.	2016-01-25 11:11:25 -08:00
Matt Turner	b126039784	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed.	2016-01-25 11:11:08 -08:00
Ian Romanick	2542871387	meta: Use internal functions to set texture parameters _mesa_texture_parameteriv is used because (the more obvious) _mesa_texture_parameteri just stuffs the parameter in an array and calls _mesa_texture_parameteriv. This just cuts out the middleman. As a side bonus we no longer need to check that ARB_stencil_texturing is supported. The test doesn't allow non-supporting implementations to avoid any work, and it's redundant with the value-changed test. Fix bug #93717 because the state restore commands at the bottom of _mesa_meta_GenerateMipmap no longer depend on the bound state. Fixes piglit arb_direct_state_access-generatetexturemipmap with the changes recently sent to the piglit mailing list. See the bugzilla entry for more info. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93717 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	18b0ba340b	meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE Commit `c246828c` added the code to save and restore the stencil texturing mode. The restore, however, was erroneously inside the 'target != GL_TEXTURE_RECTANGLE' block. Fixes piglit test 'arb_stencil_texturing-blit_corrupts_state GL_TEXTURE_RECTANGLE'. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	f7800fadff	meta/copy_image: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	bae8a4f05b	mesa: Don't include meta.h Commit `055093e` removed the call to _mesa_meta_in_progress, and meta.h has not been necessary in src/mesa/main/enable.c since. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Nicolai Hähnle	1067e6eb55	radeonsi: add DCC buffer for sampler views on new CS This fixes a VM fault and possible lockup in high memory pressure situations. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-25 10:16:12 -05:00
Nicolai Hähnle	0bacbf5b7e	radeonsi: emit rw_buffers for tes_shader only if tes_shader present Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:08 -05:00
Nicolai Hähnle	2385b253c6	radeonsi: do not set the shader->key for gs copy shaders The key for a geometry shader would be interpreted as the key for a vertex shader further down the line, which really doesn't make sense. This does not affect the contents of shader->key because geometry shaders don't have any key entries anyway. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:05 -05:00
Nicolai Hähnle	46c0ba60c6	radeonsi: si_llvm_emit_vs_epilogue is never used with gs copy shaders Hence remove the misleading branch on is_gs_copy_shader. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:02 -05:00
Nicolai Hähnle	c55b9499d5	radeonsi: move is_gs_copy_shader to si_shader_context It is only used during shader creation now, so no need to keep it around afterwards. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:00 -05:00
Nicolai Hähnle	a7754ffd31	radeonsi: replace use of is_gs_copy_shader in si_shader_vs We now have an explicit parameter that contains the same information, and this will allow us to get rid of is_gs_copy_shader in the si_shader struct. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:55 -05:00
Nicolai Hähnle	004fcd4230	radeonsi: ensure that VGT_GS_MODE is sent when necessary Specifically, when the API switches from using a GS to not using a GS and then back to using the same GS again, we do not have to re-send all the GS state, but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part of the VS state. This fixes a rendering bug in Dolphin, but surely other applications are affected as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:31 -05:00
Nicolai Hähnle	9f89bd69df	radeonsi: extract the VGT_GS_MODE calculation into its own function Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:08 -05:00
Samuel Pitoiset	429371f22a	trace: fix a segfault when tracing indirect draw calls Like other resources, the indirect draw buffer must be unwrapped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-24 19:53:53 +01:00
Marek Olšák	24ea81a491	Revert "mesa: enable enums for OES_geometry_shader" This reverts commit `67e3098703`. It breaks a bunch of geometry shader tests, such as "spec@!opengl 3.2@minmax" and others depending on the glGet queries.	2016-01-24 15:47:39 +01:00
Marek Olšák	e707b9d8ba	winsys/amdgpu: optionally use buffer lists with all allocated buffers Set RADEON_ALL_BOS=1 to use it. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-23 17:01:54 +01:00
Jason Ekstrand	a804d82ef6	anv/cmd_buffer: Zero out binding tables and samplers in state_reset This fixes a use of an undefined value if the client uses push constants in a stage without ever setting any descriptors on GEN8-9.	2016-01-22 22:57:05 -08:00
Jason Ekstrand	9e0bc29f80	nir/opcodes: Properly flush denormals in fquantize2f16	2016-01-22 22:18:31 -08:00
Jason Ekstrand	89672d81f3	i965/nir: Properly flush denormals in nir_op_fquantize2f16	2016-01-22 22:18:31 -08:00
Kenneth Graunke	ae9f73ea40	glsl: Conditionalize atan2 math. In the old hand-written implementation of atan2, the calculation of atan(y/x) was performed conditionally in the "then" block of the outermost if statement. I believe I accidentally lifted this out into unconditional code when converting to IR builder. For reference, the original hand-written IR is visible in commit `722eff674b`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Erik Faye-Lund <kusmabite@gmail.com>	2016-01-22 21:03:00 -08:00
Jason Ekstrand	2bfb9f29b8	anv/format: Add a helpful comment about format names	2016-01-22 19:14:41 -08:00
Jason Ekstrand	259e1bdf79	anv/formats: Add support for 3 more formats	2016-01-22 19:03:27 -08:00
Jason Ekstrand	0b6c1275d0	anv/pipeline: Add a default L3$ setup	2016-01-22 19:02:55 -08:00
Rob Herring	7ee8954753	virgl: enable building on Android This is just a copy-n-paste and rename of vc4 Android makefiles. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-23 12:35:29 +10:00
Rob Herring	657dc4f533	virtio_gpu: Add PCI ID to driver map Add the virtio-gpu PCI ID so the driver probing works. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-23 12:35:24 +10:00
Chad Versace	99a4885328	anv/formats: Rename ambiguous func parameter vkGetPhysicalDeviceImageFormatProperties has multiple 'flags' parameters.	2016-01-22 17:51:24 -08:00
Chad Versace	149d5ba64d	anv/formats: Advertise multisample formats Teach vkGetPhysicalDeviceImageFormatProperties() to advertise multisampled formats.	2016-01-22 17:50:15 -08:00
Chad Versace	d96d78c3b6	anv/image: Drop assertion that samples == 1	2016-01-22 17:19:57 -08:00
Chad Versace	fda074b23f	isl: Fix gen8_choose_msaa_layout() Gen8 requires any Y tiling, not any standard Y tiling.	2016-01-22 17:19:57 -08:00
Chad Versace	2fa1f745ea	isl: Add func isl_tiling_is_any_y()	2016-01-22 17:19:57 -08:00
Chad Versace	fa5f45e8aa	anv/meta: Assert correct sample counts for blit funcs Add assertions to: anv_CmdBlitImage anv_CmdCopyImage anv_CmdCopyImageToBuffer anv_CmdCopyBufferToImage	2016-01-22 17:19:57 -08:00
Chad Versace	dfcb4ee6df	anv: Add anv_image::samples It's set but not yet used.	2016-01-22 17:19:57 -08:00
Chad Versace	1c5d7b38e2	anv: Use isl_device_get_sample_counts() Use it in vkGetPhysicalDeviceProperties.	2016-01-22 17:19:57 -08:00
Chad Versace	14b753f666	isl: Add func isl_device_get_sample_counts()	2016-01-22 17:19:57 -08:00
Nanley Chery	d4de918ad0	gen8/state: Remove SKL special-casing for MinimumArrayElement MinimumArrayElement carries the same meaning for BDW and SKL. Suggested by Jason. No regressions in dEQP-VK.pipeline.image.view_type.cube_array.* Fixes a number of cube tests, including cube_array_base_slice and cube_base_slice tests.	2016-01-22 17:10:14 -08:00
Chad Versace	6a03c69adb	anv/state: Dedupe code for lowering surface format Add helper anv_surface_format().	2016-01-22 16:49:17 -08:00
Francisco Jerez	11d5c1905c	anv/meta: Set sampler type and instruction arrayness consistently in blit shader.	2016-01-22 16:43:18 -08:00
Francisco Jerez	bf151b8892	anv/meta: Fix meta blit fragment shader for 1D arrays.	2016-01-22 16:43:15 -08:00
Jason Ekstrand	53b83899e0	genX/state: Set CubeSurfaceControlMode to OVERRIDE This makes it act like the address mode is set to TEXCOORDMODE_CUBE whenever this sampler is combined with a cube surface. This should be what we need for Vulkan. Interestingly, the PRM contains a programming note for this field that says simply, "This field must be set to CUBECTRLMODE_PROGRAMMED". However, empirical evidence suggests that it does what the PRM says it does and OVERRIDE is just fine.	2016-01-22 16:34:13 -08:00
Jason Ekstrand	35879fe829	gen8/state: Divide depth by 6 for cube maps for GEN8 For Broadwell cube maps, MinimumArrayElement is in terms of 2d slices (a multiple of 6) but Depth is in terms of whole cubes.	2016-01-22 16:14:54 -08:00
Nanley Chery	3cd8c0bb04	gen8_state: Enable all cube faces These fields are ignored for non-cube surfaces. For cube surfaces these fields should be enabled when using TEXCOORDMODE_CLAMP and TEXCOORDMODE_CUBE. TODO: Determine if these are the only two modes used in Vulkan.	2016-01-22 16:12:52 -08:00
Kenneth Graunke	b3340cd32a	i965: Implement a drirc workaround for broken dual color blending. OpenGL's dual color blending feature was specified so that an implementation could support both multiple render targets (MRT) and dual source blending. Fragment shader outputs specify both "location" (the render target number) and "index" (either color 0 or 1). I believe DirectX only has the notion of "location" - if using dual color blending, location 0 or 1 will specify the operands. If not, then location means the render target index. The two features can't be used together. As such, some applications mistakenly try to use <loc = 0, index = 0> and <loc = 1, index = 0> in a shader used for dual color blending with a single render target, rather than the correct <loc = 0, index = 0> and <loc = 0, index = 1>. In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug. Unigine is aware of the problem, and quickly developed a fix, but has not bothered to change the download link on their website to a working copy in over a year. People were still using the broken version and complaining. We tried working around this by disabling dual color blending, but that apparently hurts performance, and people were once again unhappy. On i965, dual source blending is achieved by using different framebuffer write messages than normal rendering. So, we have to compile different code for the two cases. We're not being pedantic: we actually have to know in order to function. Normally, dual source blending is detectable in the shader: if a shader has an output with index = 1, then it's meant for blending, not MRT. With the broken inputs, they're indistinguishable, so we can only tell by looking at the current GL state. This patch implements a new drirc workaround: export dual_color_blend_by_location=true which makes the i965 driver detect when OpenGL state is configured for dual source blending, and recompile the fragment shader to use the right messages. In that case, we allow either location = 1 or index = 1 to specify the second source for the blending equations. It also re-enables GL_ARB_blend_func_extended for Unigine. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 14:14:26 -08:00
Marek Olšák	cd9c07e7cd	radeonsi: add ETC1 support for Stoney It's a subset of ETC2. Tested. For more information, see page 42 and onward: http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	b3bac55621	radeonsi: change LLVM intrinsics for BREV, CLAMP, EX2 Requested by Matt Arsenault. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	ce1e7784d0	radeonsi: add max waves / SIMD to shader stats (v2) v2: account for LDS usage in PS the limit is per SIMD, not per CU Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	5944f3d2fc	radeonsi: enable late VS allocation (v3) v2: take the number of CUs into account v3: change in LS allocation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	97648229e4	radeonsi: allow using all CUs for tessellation and on-chip GS (v2) v2: After more discussion with hw teams, the kernel already contains the optimal settings allowing us to use all CUs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Jeremy Huddleston Sequoia	7c99557f53	Revert "mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB" This reverts commit `739ac3d39d`. This will be done a different way. See http://lists.freedesktop.org/archives/mesa-dev/2016-January/105642.html	2016-01-22 13:02:01 -08:00
Jason Ekstrand	107a109d1c	isl/format_layout: R11G11B10_FLOAT is unsigned	2016-01-22 11:57:49 -08:00
Jason Ekstrand	e5558ffa64	anv/image: Move common code to anv_image.c	2016-01-22 11:57:01 -08:00
Jason Ekstrand	84612f4014	anv/state: Refactor surface state setup into a "fill" function	2016-01-22 11:40:56 -08:00
Francisco Jerez	448285ebf2	anv/state: Add missing clflushes for storage image surface state.	2016-01-22 11:12:09 -08:00
Francisco Jerez	d533c3796d	anv/state: Factor out surface state calculation from genX_image_view_init. Some fields of the surface state template were dependent on the surface type, which is dependent on the usage of the image view, which wasn't known until the bottom of the function after the template had been constructed. This caused failures in all image load/store CTS tests using cubemaps. Refactor the surface state calculation into a function that is called once for each required usage.	2016-01-22 11:12:09 -08:00
Jason Ekstrand	16780632c2	i965/nir: Temporarily disable mul+add fusion We don't want to do this in the long-run but it's needed for passing the NoContraction tests at the moment. Eventually, we want to plumb this through NIR properly.	2016-01-22 11:10:54 -08:00
Ben Widawsky	315cda6715	i965/fs: Remove unused count from vs urb setup This was originally removed here: commit `031d350132` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Tue Aug 25 16:59:12 2015 -0700 i965/vs: Unify URB entry size/read length calculations between backends. Then added back: commit `bd198b9f0a` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Fri Aug 14 16:01:33 2015 -0700 i965/vs: Simplify fs_visitor's ATTR file. Note that the authorship dates are out of order, but the above reflects the order of the commit dates. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-22 10:38:41 -08:00
Chad Versace	d9abbbe0d8	isl: Fix indentation of isl_format_layout comment	2016-01-22 09:48:11 -08:00
Chad Versace	65f3c420c3	isl/tests: Give tests less cryptic names	2016-01-22 09:46:48 -08:00
Chad Versace	f9d4d09549	isl: Fix isl_surf_get_image_offset_sa for gen4_3d layout Bug found by unit test test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0.	2016-01-22 09:45:22 -08:00
Chad Versace	891ed5ca8c	isl/tests: Add test for bdw 3d surface test_bdw_3d_r8g8b8a8_unorm_256x256x256_levels09_tiley0 Currently fails.	2016-01-22 09:45:21 -08:00
Nicolai Hähnle	d76bd85c35	Revert "radeonsi: fix discard-only fragment shaders (v2)" This reverts commit `843855bbf0`. It became redundant due to Marek's earlier pushed `8667a1ae` which achieves the same thing.	2016-01-22 12:40:26 -05:00
Nicolai Hähnle	843855bbf0	radeonsi: fix discard-only fragment shaders (v2) When a fragment shader is used that has no outputs but does conditional discard (KILL_IF), all fragments are killed without this patch. By comparing various register settings, my conclusion is that the exec mask is either not properly forwarded to the DB by NULL exports or ends up being unused, at least when there is _only_ a NULL export (the ISA documentation claims that NULL exports can be used to override a previously exported exec mask). Of the various approaches I have tried to work around the problem, this one seems to be the least invasive one. v2: take discard by alpha test into account as well Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-22 11:59:50 -05:00
Marta Lofstedt	3e640c256a	mesa: Update _mesa_has_geometry_shaders Updates the _mesa_has_geometry_shaders function to also look for OpenGL ES 3.1 contexts that have OES_geometry_shader enabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	ae4e4ba06d	glsl: add support for GL_OES_geometry_shader This adds glsl support of GL_OES_geometry_shader for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	67e3098703	mesa: enable enums for OES_geometry_shader Enable GL_OES_geometry_shader enums for OpenGL ES 3.1. V4: EXTRA tokens updated according to comments from Ilia Mirkin. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	af5a14d1e0	glapi: add GL_OES_geometry_shader extension Add xml definitions for the GL_OES_geometry_shader extension and expose the extension for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-22 17:13:55 +01:00
Emil Velikov	bb58b59998	docs: correct 11.1.1 release year Seems like I wasn't ready to let 2015 go :-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:50:48 +00:00
Emil Velikov	45c5000ffc	docs: add news item and link release notes for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:49:47 +00:00
Emil Velikov	87b0a52de8	docs: add sha256 checksums for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:47:12 +00:00
Emil Velikov	51e8152186	docs: add release notes for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:47:11 +00:00
Chad Versace	fbc87ce4be	isl/tests: Remove copy-paste assertion	2016-01-22 07:18:04 -08:00
Chad Versace	63d999b762	isl/tests: Fix build isl_device_init() acquired a new param for bit6 swizzling.	2016-01-22 07:17:57 -08:00
Marek Olšák	a9d5842ec0	radeonsi: add ETC2 support for Stoney Tested and working.	2016-01-22 15:36:14 +01:00
Marek Olšák	6f428328d3	radeonsi: implement SAMPLEPOS system value without a constant buffer load We always get per-sample input position. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	2b66bc87d4	winsys/amdgpu: compute num_good_compute_units correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	0d8e4f958f	gallium/radeon: rename max_compute_units -> num_good_compute_units radeon sets this correctly, but not amdgpu Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	99dfeb01bd	radeonsi: disable SPI color outputs the shader doesn't write Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	f6360de8c0	radeonsi: use all SPI color formats because not using SPI_SHADER_32_ABGR doubles fill rate. We should also get optimal performance if alpha isn't needed or blending isn't enabled. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	933e3c4145	radeonsi: use 32_AR for alpha-to-coverage without a color buffer This avoids the fp16 packing instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	f1f0158837	radeonsi: add shader conversion code for all SPI color formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	e28b8530b9	radeonsi: set CB_SHADER_MASK according to SPI color formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	8667a1aea2	radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc This does change the behavior slightly: If a shader writes COLOR[i] and that color buffer isn't bound, the shader will export MRT_NULL instead and discard the IR tree that calculates the output. The only exception is alpha-to-coverage, which requires an alpha export. v2: - update a comment about 16BPC - account for MRTZ when fixing alpha-test/kill Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	0446ea9d08	radeonsi: don't enable blending if colormask == 0 most likely useless, but doesn't hurt Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Ilia Mirkin	dac2964f3e	glsl: always compute proper varying type, irrespective of varying packing Normally there's a producer and consumer, and the producer var gets picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed version. In the SSO case, however, there is no producer. So we picked the arrayed GS variable, and as a result, used more slots than we should. More critically, these slots would also no longer line up with the producer's calculation. To fix this, we need to fix up the type of the variable based on stage no matter what. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-22 08:48:27 -05:00
Emil Velikov	54702c2fa1	egl/dri2: expose srgb configs when KHR_gl_colorspace is available Otherwise the user has no way of using it, and we'll try to access the linear one. v2: - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek) Cc: Chih-Wei Huang <cwhuang@android-x86.org> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com>	2016-01-22 11:55:54 +00:00
Emil Velikov	f29a772a7e	targets/dri: android: use WHOLE static libraries By using whole static libraries the android buildsystem provides whole-archive (alike) solution. This means that we don't need to worry about the order of the static libraries and any reverse, recursive or circular dependencies that they have between one another. Without this the linker will discard any unused hunks of one library and we'll end up with unresolved symbols as those are required by another static library. This issue has become more prominent with the introduction of pipe-loader. Whole static libraries has been used in i915/i965 for a very long time, so we might do the same. v2: - Better commit message (Ilia) - Keep external dependencies as [normal] static libs (Mauro) Cc: mesa-stable@lists.freedesktop.org Cc: Mauro Rossi <issor.oruam@gmail.com> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-22 11:55:34 +00:00
Emil Velikov	72fda2b710	i915: correctly parse/set the context flags With an earlier commit we've split the flags parsing to a separate function, but forgot to update all the dri modules to use it. Noticed when we've enabled KHR_debug for every dri module - fdo#93048 Fixes: `38366c0c6e` "dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context" Cc: Mark Janes <mark.a.janes@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-01-22 11:54:01 +00:00
Iago Toral Quiroga	ab0c7c0829	glsl/lower_instructions: fix regression in dldexp_to_arith The commit `b4e198f47f` changed the offset and bits parameters of the bitfield insert operation from scalars to vectors. However, the lowering of ldexp on doubles operates on each vector component and emits scalar code (since it has to deal with the lower and upper 32-bit chunks of each double component), so it needs its bits and offset parameters to be scalars. Fixes fp64 regression (crash) in: spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 08:14:11 +01:00
Francisco Jerez	2e54381622	anv/batch_chain: Fix patching up of block pool relocations on Gen8+. Relocations are 64 bits on Gen8+. Most CTS tests that send non-trivial work to the GPU would fail when run from a single deqp-vk invocation because they were effectively relying on reloc presumed offsets to be wrong so the kernel would come and apply relocations correctly.	2016-01-21 16:30:44 -08:00
Jason Ekstrand	13aaf90048	nir/spirv: Ignore cull distance	2016-01-21 16:20:39 -08:00
Jason Ekstrand	13858a1c1a	nir/lower_system_values: Use the correct invocation id for CS	2016-01-21 16:20:39 -08:00
Jason Ekstrand	d8c0e0805b	nir/spirv: Properly assign locations to split structures	2016-01-21 16:20:39 -08:00
Jason Ekstrand	514507825c	nir/spirv: Improve handling of variable loads and copies Before we were assuming that a deref would either be something in a block or something that we could pass off to NIR directly. However, it is possible that someone would choose to load/store/copy a split structure all in one go. We need to be able to handle that.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	7e5e64c8a9	nir/spirv: Make vectors a proper array type with an array_element This makes dealing with single-component derefs easier	2016-01-21 16:20:39 -08:00
Jason Ekstrand	a8af0f536c	nir/spirv: Rework access chains a bit to allow for literals This makes them much easier to construct because you can also just specify a literal number and it doesn't have to be a valid SPIR-V id.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	5d9a6fd526	vtn/variables: Compact local loads/stores into one function This is similar to what we did for block loads/stores.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	b298743d7b	nir/spirv: Add an actual variable struct to spirv_to_nir This allows us, among other things, to do structure splitting on-the-fly to more correctly handle input/output structs.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	2892693d56	nir/spirv: Split variable handling out into its own file It's 1300 lines all by itself and it will only grow.	2016-01-21 16:20:39 -08:00
Jason Ekstrand	1112bf633f	nir/spirv: Rework access chains Previously, we were creating nir_deref's immediately. Now, instead, we have an intermediate vtn_access_chain structure. While a little more awkward initially, this will allow us to more easily do structure splitting on-the-fly.	2016-01-21 16:18:37 -08:00
Eduardo Lima Mitev	263f829d2e	i965/vec4/tcs: Return NULL instead of false in brw_compile_tcs() brw_compile_tcs() is expected to return 'const unsigned *', so the compiler complains. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-21 16:16:26 -08:00
Kenneth Graunke	824f776355	nir/spirv: Implement ModfStruct opcode.	2016-01-21 14:57:47 -08:00
Kenneth Graunke	f89d5cb807	nir/spirv: Delete stray fmod remnants. Jason left these stray code fragments in `22804de110`.	2016-01-21 14:54:20 -08:00
cstout	13b87e02b9	freedreno/a4xx: Add support for adreno 430 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Christian Gmeiner	66672e791c	freedreno: make opc array static const Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Rob Clark	bc1a37378c	freedreno: implement emit_string_marker Writes string to cmdstream in payload of a no-op packet. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Rob Clark	d6408372eb	gallium: add GREMEDY_string_marker Since the GREMEDY extensions are normally only exposed by the gremedy debugger (and could possibly trigger debug paths in the app), we don't expose the extension by default, but instead only with ST_DEBUG=gremedy. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-21 17:19:56 -05:00
Rob Clark	a6a99fbf05	mesa: wire up EmitStringMarker for KHR_debug The extension spec[1] describes DEBUG_TYPE_MARKER as "Annotation of the command stream". So for DEBUG_TYPE_MARKER, also pass the buf to the driver's EmitStringMarker() to be inserted in the command stream. [1] https://www.opengl.org/registry/specs/KHR/debug.txt Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 17:19:05 -05:00
Rob Clark	1f7a96e005	mesa: add GREMEDY_string_marker Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 17:19:05 -05:00
Kristian Høgsberg Kristensen	ac60e98a58	vk: Do render cache flush for GEN8+ This is needed for SKL as well.	2016-01-21 14:18:52 -08:00
Kristian Høgsberg Kristensen	9eab8fc683	vk: Emit surface state base address before renderpass If we're continuing a render pass, make sure we don't emit the depth and stencil buffer addresses before we set the state base addresses. Fixes crucible func.cmd-buffer.small-secondaries	2016-01-21 14:18:52 -08:00
Neil Roberts	cbf0e64ee1	texobj: Remove redundant checks that the texture cube faces match size The texture mipmap completeness checking code was checking whether all of the faces have the same size. However this is pointless because the code just above it checks whether the face has the expected size calculated for the mipmap level anyway so the error condition could never be reached. This patch just removes it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 21:45:53 +00:00
Neil Roberts	666d96d169	texobj: Fix the completeness checks for cube textures According to the GL 1.4 spec section 3.8.10, a cubemap texture is only complete if: • The level base arrays of each of the six texture images making up the cube map have identical, positive, and square dimensions. • The level base arrays were each specified with the same internal format. • The level base arrays each have the same border width. Previously the texture completeness code was only checking the first point. This patch makes it additionally check the other two. This fixes the following two dEQP tests: deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z And also this Piglit test: spec/!opengl 2.0/incomplete-cubemap-format Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792 Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 21:45:18 +00:00
Grazvydas Ignotas	0153ff8379	r600g: don't leak driver const buffers The buffers are referenced from r600_update_driver_const_buffers() -> r600_set_constant_buffer() -> u_upload_data(), but nothing ever releases the reference. Similar case with driver_consts. Found using valgrind. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 15:36:24 -05:00
Kristian Høgsberg Kristensen	c5490d0277	vk: Fix indirect push constants This currently sets the base and size of all push constants to the entire push constant block. The idea is that we'll use the base and size to eventually optimize the amount we actually push, but for now we don't do that.	2016-01-21 11:10:11 -08:00
Kristian Høgsberg Kristensen	83c86e09a8	Merge remote-tracking branch 'jekstrand/wip/i965-uniforms' into vulkan	2016-01-21 11:09:58 -08:00
Jeremy Huddleston Sequoia	739ac3d39d	mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>	2016-01-21 09:18:06 -08:00
Jeremy Huddleston Sequoia	b20d6bf96d	mesa: Fix format warnings main/shaderapi.c:1318:51: warning: format specifies type 'unsigned int' but the argument has type 'GLhandleARB' (aka 'unsigned long') [-Wformat] _mesa_debug(ctx, "glDeleteObjectARB(%u)\n", obj); ~~ ^~~ %lu Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 09:18:06 -08:00
Jeremy Huddleston Sequoia	a087a09fa8	mesa: Fix some function prototype mismatching main/api_exec.c:543:36: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, const GLcharARB )' (aka 'void (unsigned long, unsigned int, const char )') to parameter of type 'void ()(GLuint, GLuint, const GLchar )' (aka 'void ()(unsigned int, unsigned int, const char )') [-Wincompatible-pointer-types] SET_BindAttribLocation(exec, _mesa_BindAttribLocation); ^~~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7590:88: note: passing argument to parameter 'fn' here static inline void SET_BindAttribLocation(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLuint, const GLchar )) { ^ main/api_exec.c:547:31: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_CompileShader(exec, _mesa_CompileShader); ^~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7612:83: note: passing argument to parameter 'fn' here static inline void SET_CompileShader(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:568:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, GLsizei, GLsizei , GLint , GLenum , GLcharARB )' (aka 'void (unsigned long, unsigned int, int, int , int , unsigned int , char )') to parameter of type 'void ()(GLuint, GLuint, GLsizei, GLsizei , GLint , GLenum , GLchar )' (aka 'void ()(unsigned int, unsigned int, int, int , int , unsigned int , char )') [-Wincompatible-pointer-types] SET_GetActiveAttrib(exec, _mesa_GetActiveAttrib); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7711:85: note: passing argument to parameter 'fn' here static inline void SET_GetActiveAttrib(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLuint, GLsizei , GLsizei , GLint , GLenum , GLchar )) { ^ main/api_exec.c:571:35: warning: incompatible pointer types passing 'GLint (GLhandleARB, const GLcharARB )' (aka 'int (unsigned long, const char )') to parameter of type 'GLint ()(GLuint, const GLchar )' (aka 'int ()(unsigned int, const char )') [-Wincompatible-pointer-types] SET_GetAttribLocation(exec, _mesa_GetAttribLocation); ^~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7744:88: note: passing argument to parameter 'fn' here static inline void SET_GetAttribLocation(struct _glapi_table disp, GLint (GLAPIENTRYP fn)(GLuint, const GLchar )) { ^ main/api_exec.c:585:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, GLsizei , GLcharARB )' (aka 'void (unsigned long, int, int , char )') to parameter of type 'void ()(GLuint, GLsizei, GLsizei , GLchar )' (aka 'void ()(unsigned int, int, int , char )') [-Wincompatible-pointer-types] SET_GetShaderSource(exec, _mesa_GetShaderSource); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7788:85: note: passing argument to parameter 'fn' here static inline void SET_GetShaderSource(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, GLsizei , GLchar )) { ^ main/api_exec.c:597:29: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_LinkProgram(exec, _mesa_LinkProgram); ^~~~~~~~~~~~~~~~~ ./main/dispatch.h:7909:81: note: passing argument to parameter 'fn' here static inline void SET_LinkProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:628:30: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, const GLcharARB const , const GLint )' (aka 'void (unsigned long, int, const char const , const int )') to parameter of type 'void ()(GLuint, GLsizei, const GLchar const , const GLint )' (aka 'void ()(unsigned int, int, const char const , const int )') [-Wincompatible-pointer-types] SET_ShaderSource(exec, _mesa_ShaderSource); ^~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7920:82: note: passing argument to parameter 'fn' here static inline void SET_ShaderSource(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, const GLchar const , const GLint )) { ^ main/api_exec.c:653:28: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_UseProgram(exec, _mesa_UseProgram); ^~~~~~~~~~~~~~~~ ./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here static inline void SET_UseProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:655:33: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_ValidateProgram(exec, _mesa_ValidateProgram); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:8184:85: note: passing argument to parameter 'fn' here static inline void SET_ValidateProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { main/dlist.c:9457:26: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_UseProgram(table, save_UseProgramObjectARB); ^~~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here static inline void SET_UseProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) { ^ 1 warning generated. Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 09:18:06 -08:00
Andreas Boll	5d4b20267d	glapi: Build glapi_gentable.c only on Darwin Removes the public symbol _glapi_create_table_from_handle from libGL.so.1.2.0 on all platforms except Darwin. Since the symbol is not used on other platforms it makes sense to build glapi_gentable.c only on Darwin. As a side effect it accelerates the build a bit and reduces the size of libGL.so.1.2.0 as follows: size lib/libGL.so.1.2.0 on my system shows text data bss dec hex filename 469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 before 420988 11240 2720 434948 6a304 lib/libGL.so.1.2.0 after A little bit of history: _glapi_create_table_from_handle was introduced in commit `85937f4c0d` Author: Jeremy Huddleston <jeremyhu@apple.com> Date: Thu Jun 9 16:59:49 2011 -0700 glapi: Add API that can create a _glapi_table from a dlfcn handle Example usage: void *handle = dlopen(opengl_library_path, RTLD_LOCAL); struct _glapi_table *disp = _glapi_create_table_from_handle(handle, "gl"); Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com> and the only user in mesa was added in commit `f35913b96e` Author: Jeremy Huddleston <jeremyhu@apple.com> Date: Thu Jun 9 17:29:51 2011 -0700 apple: Use _glapi_create_table_from_handle to initialize our dispatch table Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com> gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14. v2: Fix typos in commit message Add missing XORG_GLAPI_OUTPUTS += \ into src/mapi/glapi/gen/Makefile.am Add glapi_gentable.c to EXTRA_DIST for inclusion in the release tarball v3: Fix commit message: s/gl_gentable.c/glapi_gentable.c/ Reported-by: Arlie Davis <arlied@google.com> Cc: Jeremy Huddleston <jeremyhu@apple.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-21 15:04:02 +01:00
Arlie Davis	daa775b58e	mesa: Reduce libGL.so binary size by about 15% This patch significantly reduces the size of the libGL.so binary. It does not change the (externally visible) behavior of libGL.so at all. gl_gentable.py generates a function, _glapi_create_table_from_handle. This function allocates a large dispatch table, consisting of 1300 or so function pointers, and fills this dispatch table by doing symbol lookups on a given shared library. Previously, gl_gentable.py would generate a single, very large _glapi_create_table_from_handle function, with a short cluster of lines for each entry point (function). The idiom it generates was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress, and then a store into the dispatch table. Since this function processes a large number of entry points, this code is duplicated many times over. We can encode the same information much more compactly, by using a lookup table. The previous total size of _glapi_create_table_from_handle on x64 was 125848 bytes. By using a lookup table, the size of _glapi_create_table_from_handle (and the related lookup tables) is reduced to 10840 bytes. In other words, this enormous function is reduced by 91%. The size of the entire libGL.so binary (measured when stripped) itself drops by 15%. So the purpose of this change is to reduce the binary size, which frees up disk space, memory, etc. size lib/libGL.so.1.2.0 on my system shows (Andreas) text data bss dec hex filename 565947 11256 2720 579923 8d953 lib/libGL.so.1.2.0 before 469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 after v2: Incorporate Matt's feedback. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-01-21 15:03:53 +01:00
Jordan Justen	b1a7a27d60	nir/spirv: Handle compute shared atomics Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	a7e5b683ca	nir/spirv: Support workgroup (shared) variable translation Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	bc035db3c8	anv/gen8: Set SLM size in interface descriptor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	819cb69434	anv/gen8+9: Invalidate color calc state when switching to the GPGPU pipeline Port `044acb9256` to anv. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	19830031cb	anv/gen8: Enable SLM in L3 cache control register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	97b09a9268	anv/pipeline: Set size of shared variables in prog_data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	86daceb7f2	i965/nir: Lower nir compute shader shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	ca55817fa1	nir: Lower shared var atomics during nir_lower_io Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	36157cd5ea	nir: Add support for lowering load/stores of shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	7a9a54b5c8	nir: Add atomic operations on variables This allows us to first generate atomic operations for shared variables using these opcodes, and then later we can lower those to the shared atomics intrinsics with nir_lower_io. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	10db985fa0	nir: Add compute shader shared variable storage class Previously we were receiving shared variable accesses via a lowered intrinsic function from glsl. This change allows us to send in variables instead. For example, when converting from SPIR-V. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	65a5407931	nir/print: Add space after shader_storage var mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Jordan Justen	9f4a72c9e3	i965/fs/nir: Move shared variable load/store to nir_emit_cs_intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-21 00:31:29 -08:00
Ilia Mirkin	daa0fd7843	nv50/ir: 64-bit splitting fixes Take reading shader outputs into account, and use setFlagsDef for the carry since we rely on having i->flagsDef being set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	c0b66d96d7	gk110/ir: allow carry to be set/read by imad Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	73c9ca7544	gm107/ir: add carry emission to LOP and IADD Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	71a489633b	gm107/ir: add ATOM and CCTL support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	57b0025814	gm107/ir: set LD/ST address width bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	2e533ab74b	gk110/ir: fix double-wide vm address	2016-01-20 19:37:34 -05:00
Ilia Mirkin	8c2dfe05c5	gk110/ir: add OP_CCTL handling	2016-01-20 19:37:33 -05:00
Ilia Mirkin	7d9a97d6be	gk110/ir: add atomic op emission, fix gmem loads Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:33 -05:00
Chad Versace	5ce5a7d021	anv/image: Stop including gen8_pack.h in common file	2016-01-20 15:42:17 -08:00
Chad Versace	8ab527de03	isl: Add a README Most of the file-level comment in isl.h is moved to the README.	2016-01-20 15:24:40 -08:00
Roland Scheidegger	dc8b9bd0aa	llvmpipe: warn about illegal use of objects in different contexts Doing that is clearly a bug. We can't quite assert as st/mesa may hit this, but increase at least visibility of it a bit. (For the non-refcounted objects it would be illegal too, but we can't detect that unless we'd store the context ourselves. Plus, those don't tend to cause random crashes at context or object destruction time... So just sampler views, surfaces and so targets for now.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-21 00:09:55 +01:00
Roland Scheidegger	e925ec8811	llvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info I removed this mistakenly in `2dbc20e456`. I actually thought it should not be necessary and a piglit run didn't show any differences, but this shouldn't have been in there. draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER. The new polygon-mode-facing test indeed shows why this is necessary, there's lots of invalid reads and writes with valgrind (also crashes without valgrind), because the pre-pipeline vertex size doesn't match the post-pipeline vertex size (note this won't help much with stages which don't have the prepare hook which can grow the vertex size, in particular the wide point stage, but this isn't used by llvmpipe). The test still won't pass, of course, but it is only usage of uninitialized values now, which is much less dangerous... (Albeit I'm pretty sure for i915 it really is not needed anymore as it doesn't care about the extra outputs and doesn't call draw_prepare_shader_outputs().) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-21 00:09:55 +01:00
Ilia Mirkin	dc3ac418bf	nv50/ir: don't flip SHL(ADD) into ADD(SHL) if ADD sources have modifiers Fixes: `31fde8fa` (nv50/ir: flip shl(add, imm) into add(shl, imm)) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 18:03:36 -05:00
Kristian Høgsberg Kristensen	7b7a7c2bfc	vk: Make maxSamplerAllocationCount more reasonable We can't allocate 4 billion samplers. Let's go with 64k.	2016-01-20 14:36:52 -08:00
Ilia Mirkin	3a63576168	gk110/ir: fix load from shared memory It was accidentally using the store opcode. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 17:16:09 -05:00
Ilia Mirkin	9f23007a7a	gk110/ir: add partial BAR support This is enough for the plain TGSI BARRIER implementation. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 17:16:09 -05:00
Kristian Høgsberg Kristensen	8ef002dd7a	vk/tests: Add stub for anv_gem_get_bit6_swizzle()	2016-01-20 13:47:40 -08:00
Kristian Høgsberg Kristensen	420e8664cb	vk/tests: Add isl include path	2016-01-20 13:47:40 -08:00
Kenneth Graunke	b76e4458f9	nir/spirv/glsl450: Use fabs not iabs in ldexp. This was just wrong.	2016-01-20 12:18:02 -08:00
Tapani Pälli	f1152c3455	Revert "glsl: move uniform calculation to link_uniforms" This reverts commit `4475d8f916`.	2016-01-20 22:04:46 +02:00
Kristian Høgsberg Kristensen	947ebd9c71	isl: Add isl.h to libisl_la_SOURCES	2016-01-20 12:03:46 -08:00
Jason Ekstrand	21b2d87408	nir/spirv/glsl450: Implement FrexpStruct	2016-01-20 11:36:41 -08:00
Jason Ekstrand	c7896d1868	spirv/nir/glsl450: Use vtn_create_ssa_value to create SSA values	2016-01-20 11:36:26 -08:00
Jason Ekstrand	e45748bade	anv/device: Default to scalar GS on BDW+	2016-01-20 11:16:44 -08:00
Jason Ekstrand	34f9a5f301	nir/spirv: Pull texture dimensionality out of the image when available	2016-01-20 11:11:30 -08:00
Jason Ekstrand	59ef7c6507	anv/meta: fix UpdateBuffer in the case where we do multiple updates	2016-01-20 07:56:48 -08:00
Jason Ekstrand	a0516cfbac	anv/meta: Fix a finishme	2016-01-20 07:33:41 -08:00
Tapani Pälli	4475d8f916	glsl: move uniform calculation to link_uniforms Patch moves uniform calculation to happen during link_uniforms, this is possible with help of UniformRemapTable that has all the reserved locations. Location assignment for implicit locations is changed so that we utilize also the 'holes' that explicit uniform location assignment might have left in UniformRemapTable, this makes it possible to fit more uniforms as previously we were lazy here and wasting space. Fixes following CTS tests: ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array v2: code cleanups, increment NumUniformRemapTable correctly, fix find_empty_block to work properly and add some more comments. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-20 07:24:39 +02:00
Timothy Arceri	0a6a05c8ea	glsl: add missing explicit_image_format flag to has_layout() Fixes piglit regression after fixes to duplicate layout rules. Previously catching multiple layouts was relying on the code meant to catch duplicates within a single layout(...), this change triggers the rules for multiple layouts. Cc: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-01-20 15:45:56 +11:00
Jason Ekstrand	c7203aa621	nir/spirv: Move OpPhi handling to vtn_cfg.c Phi handling is somewhat intrinsically tied to the CFG. Moving it here makes it a bit easier to handle that. In particular, we can now do SSA repair after we've done the phi node second-pass. This fixes 6 CTS tests.	2016-01-19 19:00:00 -08:00
Jason Ekstrand	891564adb9	nir/spirv: Handle OpLine and OpNoLine in foreach_instruction This way we don't have to explicitly handle them everywhere.	2016-01-19 19:00:00 -08:00
Kenneth Graunke	e79f8a4926	nir: Lower ldexp to arithmetic. This is a port of Matt's GLSL IR lowering pass to NIR. It's required because we translate SPIR-V directly to NIR, bypassing GLSL IR. I haven't introduced a lower_ldexp flag, as I believe all current NIR consumers would set the flag. i965 wants this, vc4 doesn't implement this feature, and st_glsl_to_tgsi currently lowers ldexp unconditionally anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-19 18:10:30 -08:00
Kenneth Graunke	b3cc10f3b2	nir: Let nir_opt_algebraic rules contain unsigned constants > INT_MAX. struct.pack('i', val) interprets `val` as a signed integer, and dies if `val` > INT_MAX. For larger constants, we need to use 'I' which interprets it as an unsigned value. This patch makes us use 'I' for all values >= 0, and 'i' for negative values. This should work in all cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-19 18:10:30 -08:00
Jason Ekstrand	eb2a119da2	anv/meta: Implement UpdateBuffer	2016-01-19 16:53:35 -08:00
Jason Ekstrand	0ae1bd321e	anv/meta: Implement CmdFillBuffer	2016-01-19 16:53:35 -08:00
Jason Ekstrand	46eef31311	anv/meta_clear: Call emit_clear directly in ClearImage Using the load op means that we end up with recursive meta. We shouldn't be doing that.	2016-01-19 16:53:35 -08:00
Jason Ekstrand	6325a75011	anv/meta_clear: Do save/restore in actual entry points	2016-01-19 16:53:35 -08:00
Jason Ekstrand	56dbf13045	anv: Add support for VK_WHOLE_SIZE in several places	2016-01-19 16:53:35 -08:00
Kenneth Graunke	549be68258	nir/spirv/glsl450: Implement Frexp.	2016-01-19 16:46:03 -08:00
Roland Scheidegger	b21973acaa	llvmpipe: turn depth clears into full depth/stencil clears for d24x8 formats If we have a d24x8 format, there is no stencil. Therefore, we can always clear these bits too, which means this will be some kind of memset rather than read-modify-write. This is good for some 7% increase or so in gears with huge window size - seems to have a bigger effect if things aren't in caches. Of course, any real app won't spend nearly as much time comparatively in clearing depth buffer in the first place, so the speedup will be much lower. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-20 01:45:56 +01:00
Kenneth Graunke	68c9ca1a94	nir/spirv/glsl450: Blindly implement Atan2. This is untested and probably broken. We already passed the atan2 CTS tests before implementing this opcode. Presumably, glslang or something was giving us a plain Atan opcode instead of Atan2. I don't know why.	2016-01-19 16:14:05 -08:00
Kenneth Graunke	2ab3efa0ad	nir/spirv/glsl450: Implement Atan.	2016-01-19 16:14:05 -08:00
Kenneth Graunke	bc9d9bc2e3	nir/spirv/glsl450: Implement Asin and Acos.	2016-01-19 16:14:05 -08:00
Francisco Jerez	f8ac314cc2	i965: Implement compute sampler state atom. Fixes a number of GLES31 CTS failures and hangs on various hardware: ES31-CTS.texture_gather.plain-gather-depth-2d ES31-CTS.texture_gather.plain-gather-depth-2darray ES31-CTS.texture_gather.plain-gather-depth-cube ES31-CTS.texture_gather.offset-gather-depth-2d ES31-CTS.texture_gather.offset-gather-depth-2darray ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers ES31-CTS.compute_shader.resources-texture Some of them were actually passing by luck on some generations even though we weren't uploading sampler state tables explicitly for the compute stage, most likely because they relied on the cached sampler state left from previous rendering to be close enough. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725 Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-19 16:11:04 -08:00
Francisco Jerez	9e4c8acd78	i965: Trigger CS state reemission when new sampler state is uploaded. This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used on pre-Gen7 hardware) to signal that the sampler state tables have changed in order to make sure that the GPGPU interface descriptor is updated. Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-19 16:11:04 -08:00
Kenneth Graunke	4fc018576b	glsl: Don't abbreviate tessellation shader stage names. I have a patch that writes shaders as .shader_test files, and it uses this function to create the headers (i.e. [vertex shader]). [tess ctrl shader] isn't a valid shader_runner header - it's spelled out as [tessellation control shader]. There's no real reason to abbreviate it, so spell it out. v2: Rebase on Rob's patches to move the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-19 14:57:42 -08:00
Timothy Arceri	11fc7ad62e	mesa: remove link validation that should be done elsewhere Even if re-linking fails rendering shouldn't fail as the previous successfully linked program will still be available. It also shouldn't be possible to have an unlinked program as part of the current rendering state. This fixes a subtest in: ES31-CTS.sepshaderobjs.StateInteraction This change should improve performance on CPU limited benchmarks as noted in commit `d6c6b186cf`. From Section 7.3 (Program Objects) of the OpenGL 4.5 spec: "If a program object that is active for any shader stage is re-linked unsuccessfully, the link status will be set to FALSE, but any existing executables and associated state will remain part of the current rendering state until a subsequent call to UseProgram, UseProgramStages, or BindProgramPipeline removes them from use. If such a program is attached to any program pipeline object, the existing executables and associated state will remain part of the program pipeline object until a subsequent call to UseProgramStages removes them from use. An unsuccessfully linked program may not be made part of the current rendering state by UseProgram or added to program pipeline objects by UseProgramStages until it is successfully re-linked." "void UseProgram(uint program); ... An INVALID_OPERATION error is generated if program has not been linked, or was last linked unsuccessfully. The current rendering state is not modified." V2: apply the rule to both core and compat. Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-20 09:35:04 +11:00
Timothy Arceri	6a660a5f5d	glsl: allow multiple layout qualifiers for a single declaration From the ARB_shading_language_420pack spec: "More than one layout qualifier may appear in a single declaration. If the same layout-qualifier-name occurs in multiple layout qualifiers for the same declaration, the last one overrides the former ones." The parser was already failing correctly when the extension is not available but testing for duplicates within a single layout qualifier was still causing this to fail when available as both cases share the same function for merging. Here we add a parameter to differentiate between the two uses and apply it to the duplicate test. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:50 +11:00
Timothy Arceri	564009986f	glsl: update parser to allow duplicate default layout qualifiers In order to only create a single node for each default declaration we add a new boolean parameter to the in/out merge function to only create one once we reach the rightmost layout qualifier. From the ARB_shading_language_420pack spec: "More than one layout qualifier may appear in a single declaration. If the same layout-qualifier-name occurs in multiple layout qualifiers for the same declaration, the last one overrides the former ones." Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:45 +11:00
Timothy Arceri	a0a93470e3	glsl: move default layout qualifier rules out of the parser Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:40 +11:00
Timothy Arceri	fd612e4547	glsl: split layout_defaults into specific types This will allow merging of duplicate layout qualifiers as allowed by ARB_shading_language_420pack Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:35 +11:00
Timothy Arceri	c8b8c578d1	glsl: allow duplicate layout-qualifier-names This is added by ARB_enhanced_layouts although it doesn't fit into any of the six main changes so we enable this independently. From the ARB_enhanced_layouts spec: "More than one layout qualifier may appear in a single declaration. Additionally, the same layout-qualifier-name can occur multiple times within a layout qualifier or across multiple layout qualifiers in the same declaration" Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:29 +11:00
Matt Turner	866a6bf9f7	i965/vec4: Spaces around operators.	2016-01-19 12:12:38 -08:00
Matt Turner	e734fb0326	i965: Inform compiler of variable range to silence warning. Extends commit `6531ccb70` to silence the warning in release builds as well. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-19 12:08:59 -08:00
Matt Turner	a439788c59	glsl: Restore Mesa-style to shader_enums.c/h.	2016-01-19 12:08:59 -08:00
Jason Ekstrand	5e57a87dcf	anv/pipeline: Fix point size	2016-01-19 12:03:13 -08:00
Daniel Stone	f9ca780ea4	anv/wsi: Mark Wayland buffers as busy We were diligently setting Wayland buffers as non-busy, but nowhere in the code did we set them to busy when submitted to the server. This meant that acquire_next_image would only ever find the same buffer in a loop, over and over. Signed-off-by: Daniel Stone <daniels@collabora.com>	2016-01-19 16:54:55 +00:00
Daniel Stone	ba5ef49dcb	anv/wsi: Avoid stuck Wayland connection In acquire_next_image, we are waiting for a wl_buffer::release to arrive and release one of the buffers in our swapchain. Most compositors don't explicitly flush release events, so we may need to perform a roundtrip instead, to ensure the event arrives. Signed-off-by: Daniel Stone <daniels@collabora.com>	2016-01-19 16:54:55 +00:00
Christian König	f3b067af86	st/va: fix motion adaptive deinterlacing Signed-off-by: Christian König <christian.koenig@amd.com>	2016-01-19 17:28:38 +01:00
Nicolai Hähnle	e6281a2850	util/u_pstipple.c: copy immediates during transformation Apparently, nobody has combined stippling with a fragment shader containing immediates in almost five years... Fixes a bug in Kodi with radeonsi reported by Christian König. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-19 10:52:35 -05:00
Marta Lofstedt	2bcacc69b9	mesa: Move sanity check of BindVertexBuffer for OpenGL ES 3.1 Sanity check of BindVertexBuffer for OpenGL ES in _mesa_handle_bind_buffer_gen breaks OpenGL ES 2 conformance. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93426 Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-19 13:08:42 +01:00
Timothy Arceri	d018619d7f	glsl: fix interface block error message Print the stream value not the pointer to the expression, also use the unsigned format specifier. Cc: 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-19 14:51:31 +11:00
Jason Ekstrand	3276610ea6	genX/state: Set LOD pre-clamp to OpenGL mode This gets us another couple hundred sampler tests	2016-01-18 17:51:35 -08:00
Jason Ekstrand	580b2e85e4	isl/device: Add a flag for bit 6 swizzling	2016-01-18 17:21:05 -08:00
Jason Ekstrand	587842a0ca	anv/gem: Add a helper for getting bit6 swizzling information	2016-01-18 17:21:05 -08:00
Jason Ekstrand	c2a6f4302e	nir/spirv: Patch through image qualifiers	2016-01-18 17:21:05 -08:00
Jason Ekstrand	56c8a5f2b8	nir/spirv: Implement ImageQuerySize for storage images SPIR-V only has one ImageQuerySize opcode that has to work for both textures and storage images. Therefore, we have to special-case that one a bit and look at the type of the incoming image handle.	2016-01-18 17:21:05 -08:00
Jason Ekstrand	bb8cadd169	nir/spirv: Insert movs around image intrinsics Image intrinsics always take a vec4 coordinate and always return a vec4. This simplifies the intrinsics a bit but also means that they don't actually match the incoming SPIR-V. In order to compensate for this, we add swizzling movs for both source and destination to get the right number of components.	2016-01-18 17:21:05 -08:00
Ilia Mirkin	a31819cff8	nv50/ir: swap the least-ref'd source into src1 when both const/imm The whole point of inlining sources is to reduce loads. We can end up in a situation where one value is used a lot of times, and one value is used only once per instruction. The once-per-instruction one is the one that should get inlined, but with the previous algorithm, it was given no preference. This flips things around to preferring putting less-referenced values into src1 which increases the likelihood of them being inlined. While we're at it, adjust the heuristic to not treat 0 as an immediate, as well as (effectively) check for situations where LIMMs can't be loaded. All this yields improvements on nvc0: total instructions in shared programs : 6261157 -> 6255985 (-0.08%) total gprs used in shared programs : 945082 -> 943417 (-0.18%) total local used in shared programs : 30372 -> 30288 (-0.28%) total bytes used in shared programs : 50089256 -> 50047880 (-0.08%) local gpr inst bytes helped 21 822 3332 3332 hurt 0 278 565 565 And more importantly avoids generating really bad code with SSBOs, where we end up checking a lot of different values (usually immediates) against the length. On nv50 we get comparable results, and even improve packing (bytes went down more than instructions): total instructions in shared programs : 6346564 -> 6341277 (-0.08%) total gprs used in shared programs : 728719 -> 725131 (-0.49%) total local used in shared programs : 3552 -> 3552 (0.00%) total bytes used in shared programs : 43995688 -> 43932928 (-0.14%) local gpr inst bytes helped 0 1380 3252 3774 hurt 0 287 1710 1365 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-18 17:52:07 -05:00
Ilia Mirkin	af686e7de3	st/mesa: restore the stObj's size if it was cleared out An issue could still occur if the base level is set, but fixing that would require a lot more logic. This fixes the recently-failing texelFetch 3D tests because the mipmaps were no longer being generated, which in turn caused the copying logic to be hit, which in turn didn't work because of the broken width/height/depth. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-18 17:52:07 -05:00
Jason Ekstrand	6f956b0b22	anv/meta: Improve meta clear cleanup a bit	2016-01-18 14:07:46 -08:00
Jason Ekstrand	45d17fcf9b	anv: Misc allocation scope fixes	2016-01-18 14:04:13 -08:00
Jason Ekstrand	378af64e30	anv/meta: Add a meta allocator that uses SCOPE_DEVICE The Vulkan spec requires all allocations that happen for device creation to happen with SCOPE_DEVICE. Since meta calls into other things that allocate memory, the easiest way to do this is with an allocator.	2016-01-18 14:03:24 -08:00
Rob Clark	805e080ba0	freedreno/a4xx: use smaller threadsize for more registers Once we go past half of the "GPR" register file, it seems like we need to run frag shader with smaller threadsize. (The vertex shader already runs at TWO_QUADS, which is the minimum.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-18 16:58:25 -05:00
Rob Clark	6062941e4d	freedreno: per-generation OUT_IB packet Some a4xx firmware doesn't implement the "PFD" (prefetch-disabled) version of the CP_INDIRECT_BUFFER packet. So allow for PFD vs PFE per generation. Switch a3xx and a4xx over to using prefetch-enabled version (which is also what blob does.. it seems only on a2xx we cannot use PFE). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-18 16:58:25 -05:00
Jason Ekstrand	3dfa6a881c	anv/meta: Initialize a handle to null	2016-01-18 13:05:02 -08:00
Jason Ekstrand	d49298c702	gen8: Fix border color The border color packet is specified as a 64-byte aligned address relative to dynamic state base address. The way the packing functions are currently set up, we need to provide it with (offset >> 6) because it just shoves the bits in where the PRM says they go and isn't really aware that it's an address.	2016-01-18 12:16:31 -08:00
Jason Ekstrand	bfcc744892	genX/pack: Add a __gen_fixed helper and use it for TextureLODBias The __gen_fixed helper properly clamps the value and also handles negative values correctly. Eventually, we need to make the scripts generate this and use it for more things.	2016-01-18 11:35:04 -08:00
Jason Ekstrand	5a67df2546	anv/pack: Make TextureLODBias a proper 4.8 float XXX: We need to update the generators so this doesn't get stomped.	2016-01-18 10:36:53 -08:00
Jason Ekstrand	15e6af0708	nir/spirv: Handle if's where the merge is also a break or continue	2016-01-18 10:10:47 -08:00
Jason Ekstrand	14ebd0fdd7	nir/spirv: Handle continues that use SSA values from the loop body Instead of emitting the continue before the loop body we emit it afterwards. Then, once we've finished with the entire function, we run nir_repair_ssa to add whatever phi nodes are needed.	2016-01-18 09:43:12 -08:00
Jason Ekstrand	61ba97522e	nir/lower_returns: Repair SSA after doing return lowering	2016-01-18 09:43:12 -08:00
Jason Ekstrand	b11825590d	nir: Add a pass to repair SSA form	2016-01-18 09:43:12 -08:00
Jason Ekstrand	a7a5e8a2de	nir/vars_to_ssa: Use the new nir_phi_builder helper The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes.	2016-01-18 09:18:42 -08:00
Jason Ekstrand	8aab4a7bd2	nir: Add a phi node placement helper Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value.	2016-01-18 09:18:42 -08:00
Jason Ekstrand	b1f1200e80	util/bitset: Allow iterating over const bitsets	2016-01-18 09:18:42 -08:00
Emil Velikov	c03f3dd0a5	gallium: bundle the compat header u_pwr8.h in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Emil Velikov	7bc714509b	mapi: include gl.xml in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Emil Velikov	a78e08e88f	i965: adding missing headers to the dist tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Christian König	eaf7ec9cfc	st/va: add motion adaptive deinterlacing v2 v2: minor cleanup Signed-off-by: Christian König <christian.koenig@amd.com>	2016-01-18 10:59:32 +01:00
Michel Dänzer	ad20be1f30	gallium/radeon: Rename do_invalidate_resource to invalidate_buffer And only call it from r600_invalidate_resource for buffer resources. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	0491dd1deb	st/dri: Don't call invalidate_resource for NULL depth/stencil buffers Fixes crash in 4 EGL piglit tests with radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	a9ab7172a6	radeonsi: Avoid warning about LLVM generating R_0286D0_SPI_PS_INPUT_ADDR Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	4297259fc8	radeonsi: Print "LLVM emitted unknown config register" warning only once Say "LLVM" instead of "Compiler" for clarity. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Oded Gabbay	679a654a77	llvmpipe: use vpkswss when dst is signed This patch fixes a bug when building a pack instruction. For POWER (altivec), in case the destination is signed and the src width is 32, we need to use vpkswss. The original code used vpkuwus, which emits an unsigned result. This fixes the following piglit tests on ppc64le: - spec@arb_color_buffer_float@gl_rgba8-drawpixels - shaders@glsl-fs-fogscale I've also corrected some coding style issues in the function. v2: Returned else statements to vmware style Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-18 09:45:25 +02:00
Dave Airlie	119bef9543	glsl: fix subroutine lowering reusing actual parameters One of the oglconform tests was crashing here, and it was due to not cloning the actual parameters before creating the new call. This makes a call clone function that does the right things to make sure we clone all the needed info, and points the callee at it. (It differs from ->clone due to this). This may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722, I had this patch in my cts fixes tree, but hadn't had time to make sure I liked it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-18 15:02:34 +10:00
Timothy Arceri	9258d9f23d	glsl: remove special case for detecting stream duplicates Any duplicates in a single declaration will already fail the generic duplicates test due to the explicit_stream flag being set. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-18 13:09:28 +11:00
Timothy Arceri	eac2cece31	glsl: add missing explicit_stream flag to has_layout() This will allow the ARB_shading_language_420pack rules in glsl_parser.yy for catching duplicate layout qualifiers to be triggered for the stream identifier rather than relying on the code meant to catch duplicates within a single layout(...) Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-18 13:09:16 +11:00
Timothy Arceri	86677f1016	mesa: fix segfault in glUniformSubroutinesuiv() From Section 7.9 (SUBROUTINE UNIFORM VARIABLES) of the OpenGL 4.5 Core spec: "The command void UniformSubroutinesuiv(enum shadertype, sizei count, const uint *indices); will load all active subroutine uniforms for shader stage shadertype with subroutine indices from indices, storing indices[i] into the uniform at location i. The indices for any locations between zero and the value of ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS minus one which are not used will be ignored." V2: simplify NULL check suggested by Jason. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org https://bugs.freedesktop.org/show_bug.cgi?id=93731	2016-01-18 11:53:24 +11:00
Timothy Arceri	50376e0c0e	glsl: fix segfault linking subroutine uniform with explicit location Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org	2016-01-18 11:30:45 +11:00
Ilia Mirkin	4ac1274caa	gm107/ir: don't do indirect frag shader inputs on GM107 Apparently the IPA op decided to stop working with offsets. Need to figure out if we need to do an AL2P situation or something similar. For now just turn it back off. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-17 16:37:04 -05:00
Ilia Mirkin	3281ae96c8	tgsi: initialize Atomic field in tgsi_default_declaration Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-17 16:37:04 -05:00
Ilia Mirkin	5a81b48ad0	nvc0: bsp_bo can't be null We already deref it earlier. And these are all allocated on load. Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-17 16:37:04 -05:00
Oded Gabbay	529aa8249a	llvmpipe: fix arguments order given to vec_andc This patch fixes a classic "confuse the enemy" bug. _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the arguments are opposite. _mm_andnot_si128 performs "r = (~a) & b" while vec_andc performs "r = a & (~b)" To make sure this error won't return in another place, I added a wrapper function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-17 21:07:27 +02:00
Rob Clark	02ac91d717	freedreno/ir3: fix mad 3rd src delay calc In `fad158a0` ("freedreno/ir3: array rework") the src # (n) shifted by one, but missed updating delay-slot calc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-17 12:21:45 -05:00
Rob Clark	2a6ec1e061	freedreno/ir3: better array register allocation Detect arrays which don't conflict with each other and allow overlapping register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:52 -05:00
Rob Clark	6a33c5c0df	freedreno/ir3: array offset can be negative It at least happens with some piglit tests, like $piglit/bin/vp-address-01 VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..7] DCL ADDR[0] 0: ARL ADDR[0].x, IN[1].xxxx 1: MOV_SAT OUT[1], CONST[ADDR[0].x-1] 2: DP4 OUT[0].x, CONST[4], IN[0] 3: DP4 OUT[0].y, CONST[5], IN[0] 4: DP4 OUT[0].z, CONST[6], IN[0] 5: DP4 OUT[0].w, CONST[7], IN[0] 6: END Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:20 -05:00
Rob Clark	ddede497b8	freedreno/ir3: workaround bug/feature Seems like in certain cases, we cannot use c<a0.x+0> as the third src to cat3 instructions. This may be slightly conservative, we may only have this restriction when the first src is also const. This fixes, for example, +24/-0 of the variable-indexing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:22:43 -05:00
Rob Clark	ebd3a1fc17	ttn: use writemask for store_var Only user is freedreno, and after array-rework it can cope. Avoids generating loads for a store. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:52 -05:00
Rob Clark	fad158a0e0	freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:08 -05:00
Rob Clark	cc7ed34df9	freedreno/ir3: refactor/simplify cp If we handle separately the special case of eliminating output mov (which includes keeps and various other cases where we don't have a consuming instruction's src register to collapse things into), we can simplify the logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:46 -05:00
Rob Clark	680664dff9	freedreno/ir3: fix incorrect decoding of mov instructions Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:37 -05:00
Rob Clark	2809c87f90	freedreno/ir3: remove unused tgsi tokens ptr Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:59 -05:00
Rob Clark	fc0d2f7e02	freedreno/ir3: bit of ra refactor Shuffle things slightly, passing instr-data to ra_name() to reduce the number of places where we need to add support for array names. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:47 -05:00
Rob Clark	d430f443de	freedreno/ir3: cosmetic de-indent Collapse two nested if's into one to reduce indent level. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:33 -05:00
Rob Clark	6f0377d651	ttn: add missing writemask on store_output Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-16 13:35:44 -05:00
Rob Clark	683794fd60	nir/print: const_index is signed Noticed this with $piglit/bin/vp-address-01 Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-16 13:35:44 -05:00
Rob Clark	211b0644e6	nir: few missing struct names nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs 'typedef struct nir_foo {} nir_foo'. But missing struct name tags is inconvenient when you need a fwd declaration without pulling in all of nir. So add missing struct name tag for nir_variable, and a couple other spots where it would likely be useful. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-16 13:35:43 -05:00
Ilia Mirkin	32a9fe013b	nv50/ir: add saturate support on ex2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-16 00:10:56 -05:00
Jeff Muizelaar	e5fefe49f2	gallivm: avoid crashing in mod by 0 with llvmpipe This adds code that is basically the same as the code in umod, udiv and idiv. However, unlike idiv we return -1. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-16 03:36:29 +01:00
Kenneth Graunke	d54a70aa18	glsl: Allow implicit int -> uint conversions for bitwise operators (&, ^, \|). The ARB has decided that implicit conversions should be performed for bitwise operators in future language revisions. Implementations of current language revisions may or may not perform them. This patch makes Mesa apply implicit conversions even on current language versions. Applications appear to expect this behavior, and there's really no downside to doing so. Fixes shader compilation in Shadow of Mordor. Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-15 17:53:44 -08:00
Jason Ekstrand	61b0cfd84e	i965/fs: Always set channel 2 of texture headers in some stages In the vertex and fragment stages, the hardware is nice to us and leaves g0.2 zeroed out for us so we can use it for headers. However, in compute, geometry, and tessellation stages, the hardware is not so nice. In particular, for compute shaders on BDW, the hardware places some debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the alpha channel mask. This means that if you use a texturing instruction with a header in a compute shader, you may randomly get the alpha channel disabled. Since channel masks affect the return length of the sampler message, this can lead the GPU to expect a different mlen to the one you specified in the shader and this, in turn, hangs your GPU. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	9870f798be	i965/fs/generator: Take an actual shader stage rather than a string Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	0a6811207f	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	f509a89082	nir/lower_system_values: Lower vertexID to id+base if needed	2016-01-15 16:15:50 -08:00
Jason Ekstrand	6b64dddd71	anv/batch_chain: Remove padding from the BO before emitting BUFFER_END	2016-01-15 15:59:58 -08:00
Jason Ekstrand	67bf74f020	anv/batch_chain: Don't call current_batch_bo() again We call it once at the top of the function and then hold on to the pointer. It shouldn't have changed, so there's no reason to query for it again.	2016-01-15 15:49:32 -08:00
Jason Ekstrand	117cac75d0	nir/spirv: Stop trusting the SPIR-V for the number of texture coordinates	2016-01-15 11:13:51 -08:00
Roland Scheidegger	03f66dfb4b	llvmpipe: ditch additional ref counting for vertex/geometry sampler views The cleaning up was quite a performance hog (making pipe_resource_reference the number two in profilers on the vertex path, and 3rd overall, with its cousin pipe_reference_described not far behind) if there were lots of tiny draw calls (ipers). Now the reason was really that it was blindly calling this for all potential shader views (so 32 each for vs and gs) even though the app never touched a single one which could have been fixed, however I can't come up with a good reason why we refcount these. We've got references, of course, in the sampler views, which should be quite sufficient as we do all vertex and geometry shader execution fully synchronous. (Calling prepare_shader_sampling for all draw calls even if there were no changes looks quite suboptimal too, but generally we don't really expect vs/gs shader sampling to be used much with llvmpipe, and there's even an early exit if there aren't any views to avoid the "null loop" albeit it's now no longer always trying to loop through all 32 slots. Maybe improve another time...). Of course, if we manage to make vertex loads run asynchronously some day, we need references again, but adding that back would be the least of the problems... Also only set LP_NEW_SAMPLER_VIEW for fragment sampler views. Nothing on the vertex side depends on it (I suppose we'd really wanted a separate flag in any case). (Good for a 3% improvement or so in ipers under the right conditions.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Roland Scheidegger	2f9a325b6a	llvmpipe: fix "leaking" textures This was not really a leak per se, but we were referencing the textures for longer than intended. If textures were set via llvmpipe_set_sampler_views() (for fs) and then picked up by lp_setup_set_fragment_sampler_views(), they were referenced in the setup state. However, the only way to unreference them was by replacing them with another texture, and not when the texture slot was replaced with a NULL sampler view. (They were then further also referenced by the scene too which might have additional minor side effects as we limit the memory size which is allowed to be referenced by a scene in a rather crude way.) Only setup destruction (at context destruction time) then finally would get rid of the references. Fix this by noting the number of textures the last time, and unreference things if the new view is NULL (avoiding having to unreference things always up to PIPE_MAX_SHADER_SAMPLER_VIEWS which would also have worked). Found by code inspection, no test... v2: rename var Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Chad Versace	0e420cb67f	anv: Populate SURFACE_STATE more safely genX_image_view_init allocates up to 3 separate SURFACE_STATE structures, and populates each from a single template. Stop mutating the template between each final SURFACE_STATE.	2016-01-15 11:00:22 -08:00
Chad Versace	eab6212efd	anv/meta: Stop leaking renderpass and framebuffer	2016-01-15 10:14:07 -08:00
Chad Versace	482a1f5eab	anv/meta: Reuse code for vkCmdClear{Color,DepthStencil}Image The two function bodies were very similar. Move common code to anv_cmd_clear_image(). Fixes all 'dEQP-VK.renderpass.formats.*' on Skylake.	2016-01-15 07:46:10 -08:00
Chad Versace	1afe33f8b3	anv/gen8: Fix SF_CLIP_VIEWPORT's Z elements SF_CLIP_VIEWPORT does not clamp Z values. It only scales and shifts them. Clamping to VkViewport::minDepth,maxDepth is instead handled by CC_VIEWPORT. Fixes dEQP-VK.renderpass.simple.depth on Broadwell.	2016-01-14 22:53:05 -08:00
Chad Versace	842b424d3b	anv/meta: Implement vkCmdClearDepthStencilImage	2016-01-14 22:53:05 -08:00
Chad Versace	e4b17a2e1a	anv/meta: Implement vkCmdClearAttachments	2016-01-14 22:53:05 -08:00
Chad Versace	0038ae2e4a	anv/meta: Add VkClearRect param to emit_clear() Prepares for vkCmdClearAttachments.	2016-01-14 22:53:05 -08:00
Chad Versace	11f5433715	anv: Distinguish between subpass setup and subpass start vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all set up the command buffer for recording commands for some subpass. But only the first two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass. Therefore, calling anv_cmd_buffer_begin_subpass() inside vkBeginCommandBuffer is misleading. Clarify its purpose by renaming it to anv_cmd_buffer_set_subpass() and adding comments.	2016-01-14 22:53:05 -08:00
Chad Versace	deb8dd89b5	anv: Emit load clears at start of each subpass This should improve cache residency for render targets. Pre-patch, vkCmdBeginRenderPass emitted all the meta clears for VK_ATTACHMENT_LOAD_OP_CLEAR before any subpass began. Post-patch, vkCmdBeginRenderPass and vkCmdNextSubpass emit only the clears needed for that current subpass.	2016-01-14 22:53:05 -08:00
Chad Versace	0679bef49f	anv/meta: Create 8 pipelines for color clears This prepares for moving the clear ops from the start of the render pass into each subpass. Pipeline N will be used to clear color attachment N of the current subpass. Currently meta color clears still create a throwaway subpass with exactly one attachment, so currently only pipeline 0 is used. This is an ugly hack to workaround the compiler's current inability to dynamically set the render target index in the render target write message.	2016-01-14 22:53:05 -08:00
Chad Versace	2997b0da4a	anv: Allow override of pipeline color attachment count Add anv_graphics_pipeline_create_info::color_attachment_count. If non-negative, then it overrides the color attachment count in the pipeline's subpass. Useful for meta. (All the hacks for meta!)	2016-01-14 22:53:05 -08:00
Chad Versace	13610c03a7	anv/meta: Name the nir shaders The names appear in debug output.	2016-01-14 22:53:05 -08:00
Chad Versace	6a1a760e3c	anv: Move MAX_* defs to top of anv_private.h Because I need to use MAX_RTS in struct anv_meta_state.	2016-01-14 22:53:05 -08:00
Chad Versace	4c2bafb9bf	anv: Define zero() macro zero(x) memsets x to zero. Eliminates bugs due to errors in memset's size param.	2016-01-14 22:53:05 -08:00
Chad Versace	f2700d665c	anv/meta: Rename emit_load_*_clear funcs The functions will soon handle clears unrelated to VK_ATTACHMENT_LOAD_OP_CLEAR, namely vkCmdClearAttachments. So remove "load" from their name: emit_load_color_clear -> emit_color_clear emit_load_depthstencil_clear -> emit_depthstencil_clear	2016-01-14 22:53:05 -08:00
Chad Versace	356f952f87	anv/meta: Use anv_cmd_state::attachments for clears Rewrite anv_cmd_buffer_clear_attachments, which emits the top-of-pass clears, to use the data provided in anv_cmd_state::attachments. This prepares for deferring each attachment clear to the first subpass that uses the attachment.	2016-01-14 22:53:05 -08:00
Chad Versace	a4b045ca44	anv: Add anv_cmd_state::attachments This array contains attachment state when recording a renderpass instance. It's populated on each call to anv_cmd_buffer_set_pass. The data is currently set but unused. We'll use it later to defer each attachment clear to the subpass that first uses the attachment.	2016-01-14 22:53:05 -08:00
Samuel Iglesias Gonsálvez	781d2787bc	glsl: restrict consumer stage condition to modify interpolation type Only modify interpolation type for integer-based varyings or when the consumer is known and different than fragment shader. If we are linking separate shader programs and the consumer is unknown, the consumer could be added later and be a fragment shader. If we modify the interpolation type in this case, we could read wrong values in the fragment shader inputs, as shown in bug 93320. Fixes the following CTS test: ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate Fixes the following dEQP tests: dEQP-GLES31.functional.separate_shader.random.102 dEQP-GLES31.functional.separate_shader.random.111 dEQP-GLES31.functional.separate_shader.random.115 dEQP-GLES31.functional.separate_shader.random.17 dEQP-GLES31.functional.separate_shader.random.22 dEQP-GLES31.functional.separate_shader.random.23 dEQP-GLES31.functional.separate_shader.random.3 dEQP-GLES31.functional.separate_shader.random.32 dEQP-GLES31.functional.separate_shader.random.39 dEQP-GLES31.functional.separate_shader.random.64 dEQP-GLES31.functional.separate_shader.random.73 dEQP-GLES31.functional.separate_shader.random.91 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93320 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-15 07:06:41 +01:00
Jason Ekstrand	5d1c2736b6	i965/fs/generator: Change a comment as per jordan's suggestion	2016-01-14 22:03:15 -08:00
Kenneth Graunke	3657cbf24f	i965: Apply add_const_offset_to_base for vec4 VS inputs too. This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	a3500f943e	i965: Make add_const_offset_to_base() work at the shader level. This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	824d82025d	i965: Make an is_scalar boolean in brw_compile_vs(). Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	bb6612f06b	nir/builder: Add a nir_build_ivec4() convenience helper. nir_build_ivec4 is more readable and succinct than using nir_build_imm directly, even if you have C99. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Tapani Pälli	cf96bce0ca	glsl: mark explicit uniforms as explicit in other stages too If shader declares uniform explicit location in one stage but implicit in another, explicit location should be used. Patch marks implicit uniforms as explicit if they were explicit in previous stage. This makes sure that we don't treat them implicit later when assigning locations. Fixes following CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-implicit-in-some-stages3 v2: move check to cross_validate_globals (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-15 07:12:42 +02:00
Jason Ekstrand	6be517b20e	i965/fs: Always set channel 2 of texture headers in some stages	2016-01-14 20:42:47 -08:00
Jason Ekstrand	e1d13cd058	i965/fs/generator: Take an actual shader stage rather than a string	2016-01-14 20:27:56 -08:00
Francisco Jerez	0556b87de4	i965/gen7.5+: Disable resource streamer during GPGPU workloads. The RS and hardware binding tables are only supported on the 3D pipeline and can lead to corruption if left enabled during a GPGPU workload. Disable it when switching to the GPGPU (or media) pipeline and re-enable it when switching back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2016-01-14 19:26:24 -08:00
Francisco Jerez	c8df0e7bf3	i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline. This hardware bug can supposedly lead to a hang on IVB and VLV. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	635be1402c	i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines. AFAIK brw_emit_select_pipeline() is only called once during context init on Gen4-5, at which point the pipeline is likely to be already idle so it may just happen to work by luck regardless of the MI_FLUSH. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	18c76551ee	i965/gen6-7: Implement stall and flushes required prior to switching pipelines. Switching the current pipeline while it's not completely idle or the read and write caches aren't flushed can lead to corruption. Fixes misrendering of at least the following Khronos CTS test: ES31-CTS.shader_image_load_store.basic-allTargets-store-fs The stall and flushes are no longer required on Gen8+. v2: Emit PIPE_CONTROL with non-zero post-sync op before the write cache flush on SNB due to hardware bug. (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	044acb9256	i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline. This hardware bug can cause a hang on context restore while the current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to clearing the valid bit, mark the CC state as dirty to make sure that the CC indirect state pointer is re-emitted when we switch back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	22ac1f6922	i965: Add state bit to trigger re-emission of color calculator state. This will be used on Gen8+ to make sure that the color calculator state pointers are re-emitted when switching back to the 3D pipeline after some GPGPU workload due to a hardware workaround. There are other state bits already defined that could be used to achieve the same effect but they all cause a ton of unrelated state to be re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new one, state bits are cheap. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Jason Ekstrand	47af950df5	anv/apply_pipeline_layout: Stomp texture array size to 1	2016-01-14 18:58:25 -08:00
Jason Ekstrand	6483d3f8fe	nir/spirv: Fix texture return types We were just hard-coding everything to a vec4. This meant we weren't handling shadow samplers at all and integer things were getting the wrong return type.	2016-01-14 18:48:57 -08:00
Ilia Mirkin	fffb559129	nv50/ir: rebase indirect temp arrays to 0, so that we use less lmem space Reduces local memory usage in a lot of Metro 2033 Redux and a few KSP shaders: total local used in shared programs : 54116 -> 30372 (-43.88%) Probably modest advantage to execution, but it's an important prerequisite to dropping some of the TGSI optimizations done by the state tracker. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 20:14:01 -05:00
Ilia Mirkin	e231f59b6d	nv50/ir: only use FILE_LOCAL_MEMORY for temp arrays that use indirection Previously we were treating any indirect temp array usage to mean that everything should end up in lmem. The MemoryOpt pass would clean a lot of that up later, but in the meanwhile we would lose a lot of opportunity for optimization. This helps a lot of Metro 2033 Redux and a handful of KSP shaders: total instructions in shared programs : 6288373 -> 6261517 (-0.43%) total gprs used in shared programs : 944051 -> 945131 (0.11%) total local used in shared programs : 54116 -> 54116 (0.00%) A typical case is for register usage to double and for instructions to halve. A future commit can also optimize local memory usage size to be reduced with better packing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 20:13:59 -05:00
Ilia Mirkin	37b67db6ae	nvc0/ir: be careful about propagating very large offsets into const load Indirect constbuf indexing works by using very large offsets. However if an indirect constbuf index load is const-propagated, it becomes a very large const offset. Take that into account when legalizing the SSA by moving the high parts of that offset into the file index. Also disallow very large (or small) indices on most other instructions. This fixes regressions in ubo_array_indexing/*-two-arrays piglit tests. Fixes: `abd326e81b` (nv50/ir: propagate indirect loads into instructions) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 18:20:27 -05:00
Kristian Høgsberg Kristensen	2eb52198ff	vk: Fix struct field indentation	2016-01-14 15:18:40 -08:00
Chad Versace	5dea9d0039	anv: Document anv_cmd_state::current_pipeline It's the value of PIPELINE_SELECT.PipelineSelection.	2016-01-14 13:18:40 -08:00
Chad Versace	ed33ccde63	anv: Make vkBeginCommandBuffer reset the command buffer If it's the command buffer's first call to vkBeginCommandBuffer, we must initialize the command buffer's state. Otherwise, we must reset its state. In both cases, let's use anv_ResetCommandBuffer. From the Vulkan 1.0 spec: If a command buffer is in the executable state and the command buffer was allocated from a command pool with the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag set, then vkBeginCommandBuffer implicitly resets the command buffer, behaving as if vkResetCommandBuffer had been called with VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT not set. It then puts the command buffer in the recording state.	2016-01-14 13:14:40 -08:00
Chad Versace	ea20389320	anv: Add FIXME for vkResetCommandPool vkResetCommandPool currently destroys its command buffers. The Vulkan 1.0 spec requires that it only reset them: Resetting a command pool recycles all of the resources from all of the command buffers allocated from the command pool back to the command pool. All command buffers that have been allocated from the command pool are put in the initial state.	2016-01-14 13:14:40 -08:00
Chad Versace	20fd816b6b	anv: Remove duplicate func prototype anv_private.h declared anv_cmd_buffer_begin_subpass twice.	2016-01-14 13:14:40 -08:00
Chad Versace	0415dfcfe7	anv/meta: Add FINISHME for clearing multi-layer framebuffers	2016-01-14 13:14:40 -08:00
Jason Ekstrand	32f8bcb84f	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator."	2016-01-14 12:04:25 -08:00
Jason Ekstrand	45349acad0	Merge remote-tracking branch 'mesa-public/master' into vulkan This fixes the bitfieldextract and bitfieldinsert CTS tests	2016-01-14 11:36:27 -08:00
Ilia Mirkin	7a521ddf36	nvc0: allow fragment shader inputs to use indirect indexing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 14:28:04 -05:00
Ilia Mirkin	e94ef885bb	st/mesa: use surface format to generate mipmaps when available This fixes the recently posted mipmap + texture views piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-14 14:28:04 -05:00
Marek Olšák	dc96a18d24	radeonsi: don't miss changes to SPI_TMPRING_SIZE I'm not sure about the consequences of this bug, but it's definitely dangerous. This applies to SI, CIK, VI. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-14 19:55:41 +01:00
Charmaine Lee	6303231a1d	svga: add DXGenMips command support For those formats that support hw mipmap generation, use the DXGenMips command. Otherwise fallback to the mipmap generation utility. Tested with piglit, OpenGL apps (Heaven, Turbine, Cinebench) v2: make sure the texture surface was created with the render target bind flag set relocation flag to SVGA_RELOC_WRITE for the texture surface Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:44:25 -07:00
Charmaine Lee	78e628ae43	svga: add num-generate-mipmap HUD query The actual increment of the num-generate-mipmap counter will be done in a subsequent patch when hw generate mipmap is supported. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:39:53 -07:00
Charmaine Lee	3038e8984d	gallium/st: add pipe_context::generate_mipmap() This patch adds a new interface to support hardware mipmap generation. PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify if this new interface is supported; if not supported, the state tracker will fallback to mipmap generation by rendering/texturing. v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers v3: add format to the generate_mipmap interface to allow mipmap generation using a format other than the resource format v4: fix return type of trace_context_generate_mipmap() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:39:53 -07:00
Brian Paul	b1e11f4d71	st/mesa: declare struct pipe_screen in st_cb_bufferobjects.h To silence a compiler warning. Trivial.	2016-01-14 10:38:18 -07:00
Matt Turner	b82e26a6a4	nir: Lower bitfield_extract. The OpenGL specification for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-14 09:28:01 -08:00
Matt Turner	15640ee77a	nir: Handle <bits>=32 case in bitfield_insert lowering. The OpenGL specification for bitfieldInsert() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 bfi opcode is specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit fixes the lowering of bitfield_insert to handle the trivial case of <bits> = 32 as bitfieldInsert: bits > 31 ? insert : bfi(bfm(bits, offset), insert, base) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2 ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-14 09:27:52 -08:00
Jason Ekstrand	f46f4e4886	nir/spirv: Add initial support for Vertex/Instance index	2016-01-14 09:12:32 -08:00
Jason Ekstrand	3d0fac7aca	vulkan.h: Pull in 1.0.1 header	2016-01-14 08:37:54 -08:00
Jason Ekstrand	24a6fcba77	vulkan-1.0.0: Bump the version to 1.0.0	2016-01-14 08:26:37 -08:00
Jason Ekstrand	c310fb032d	vulkan-1.0.0: Rework memory barriers	2016-01-14 08:09:39 -08:00
Brian Paul	6470435190	st/mesa: add check for color logicop in blit_copy_pixels() We check that a bunch of raster operations are disabled in blit_copy_pixels(). We also need to check that color logicop is disabled. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:08:21 -07:00
Jason Ekstrand	b14a78cfb8	vulkan-1.0.0: No-op WSI changes	2016-01-14 08:02:44 -08:00
Jason Ekstrand	6d3322d0e5	vulkan-1.0.0: Make extents unsigned	2016-01-14 08:00:18 -08:00
Jason Ekstrand	b57c72d964	vulkan-1.0.0: Rework blits to use four offsets	2016-01-14 07:59:37 -08:00
Jason Ekstrand	f6cae99294	vulkan-1.0.0: Split out command buffer inheritance info	2016-01-14 07:45:15 -08:00
Jason Ekstrand	f99f847412	vulkan-1.0.0: Re-order some structs in the header	2016-01-14 07:43:05 -08:00
Jason Ekstrand	aab9517f3d	vulkan-1.0.0: Misc. field and argument renames	2016-01-14 07:41:45 -08:00
Jason Ekstrand	d877095e66	vulkan-1.0.0: Get rid of MIPMAP_MODE_BASE	2016-01-14 07:32:16 -08:00
Jason Ekstrand	7b81637762	vulkan-1.0.0: Convert pPreserveAttachments to a uint32_t	2016-01-14 07:30:46 -08:00
Jason Ekstrand	802f00219a	anv/device: Update features and limits	2016-01-14 07:30:46 -08:00
Jason Ekstrand	08735ba91c	anv/cmd_buffer: Fix setting of viewport/scissor count	2016-01-14 07:30:46 -08:00
Jason Ekstrand	ed4fe3e9ba	anv/state: Respect SamplerCreateInfo.anisotropyEnable	2016-01-14 07:30:46 -08:00
Jason Ekstrand	8a81d136f8	anv/image: Fill out VkSubresourceLayout.arrayPitch	2016-01-14 07:30:46 -08:00
BogDan Vatra	102c74277f	WIP: Partially upgrade to vulkan v0.221.0 TODO, make use of: - VkPhysicalDeviceFeatures.drawIndirectFirstInstance, - VkPhysicalDeviceFeatures.inheritedQueries - VkPhysicalDeviceLimits.timestampComputeAndGraphics - VkSubmitInfo.pWaitDstStageMask - VkSubresourceLayout.arrayPitch - VkSamplerCreateInfo.anisotropyEnable	2016-01-14 07:30:46 -08:00
Nicolai Hähnle	e976860638	gallium/radeon: do not reallocate user memory buffers The whole point of AMD_pinned_memory is that applications don't have to map buffers via OpenGL - but they're still allowed to, so make sure we don't break the link between buffer object and user memory unless explicitly instructed to. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:41:24 -05:00
Nicolai Hähnle	321140d563	gallium/radeon: implement PIPE_CAP_INVALIDATE_BUFFER Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:41:04 -05:00
Nicolai Hähnle	08c71740ad	gallium/radeon: reset valid_buffer_range on PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE This accommodates a streaming pattern where the discard flag is set when the application wraps back to the beginning of the buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:40:00 -05:00
Nicolai Hähnle	70e66c57bb	st/mesa: implement Driver.InvalidateBufferSubData Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:57 -05:00
Nicolai Hähnle	9e2240e892	st/mesa: use pipe->invalidate_resource instead of buffer re-allocation Drivers are expected to avoid unnecessary work when possible in this code path. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:53 -05:00
Nicolai Hähnle	654670b404	gallium: add PIPE_CAP_INVALIDATE_BUFFER It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:38 -05:00
Nicolai Hähnle	6f4ae81005	mesa: add Driver.InvalidateBufferSubData Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-14 09:39:30 -05:00
Nicolai Hähnle	53c77494aa	mesa: fix the checks in _mesa_InvalidateBuffer(Sub)Data Change the check to be in line with what the quoted spec fragment says. I have sent out a piglit test for this as well. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-14 09:39:22 -05:00
Nicolai Hähnle	cbcdef7b40	winsys/radeon: fix warnings about incompatible pointer types Some confusion between pb_buffer and radeon_bo as well as between radeon_drm_winsys and radeon_winsys. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:33:58 -05:00
Neil Roberts	06b526de05	texobj: Check completeness with InternalFormat rather than Mesa format The internal Mesa format used for a texture might not match the one requested in the internalFormat when the texture was created, for example if the driver is internally remapping RGB textures to RGBA. Otherwise it can cause false positives for completeness if one mipmap image is created as RGBA and the other as RGB because they would both have an RGBA Mesa format. If we check the InternalFormat instead then we are directly checking the API usage which I think better matches the intention of the check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93700 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-01-14 12:18:24 +00:00
Jordan Justen	8ce2b0e140	nir/spirv: Add support for ArrayLength op Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-13 23:34:45 -08:00
Jason Ekstrand	4507d8a57a	nir/spirv/alu: Properly implement mod/rem	2016-01-13 16:53:02 -08:00
Jason Ekstrand	7d5ae2d34b	i965: Implement nir_op_irem and nir_op_srem	2016-01-13 16:53:02 -08:00
Ben Widawsky	f4ab7340ca	i965: Remove unused hw_must_use_separate_stencil I spotted this while looking for what needs updating in future platforms. I'm too lazy to go through the git logs, but it was probably missed by Jason when all the brw refactoring happened. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-13 16:41:04 -08:00
Matt Turner	138a7dc826	i965: Drop extra newline from shader compile messages. Ilia changed shader-db's run.c to not expect messages to contain a newline in shader-db commit 51bbc8035.	2016-01-13 16:19:18 -08:00
Jason Ekstrand	cac99fffdb	nir: Add more modulus and remainder opcodes SPIR-V makes a distinction between "modulus" and "remainder" for both floating-point and signed integer variants. The difference is primarily one of which source they take their sign from. The "remainder" opcode for integers is equivalent to the C/C++ "%" operation while the "modulus" opcode is more mathematically correct (at least for an unsigned divisor). This commit adds corresponding opcodes to NIR.	2016-01-13 15:18:36 -08:00
Jason Ekstrand	0079523a0d	nir/spirv: Add support for OpSpecConstantOp	2016-01-13 15:18:36 -08:00
Jason Ekstrand	8c408b9b81	nir/spirv/alu: Factor out the opcode table	2016-01-13 15:18:36 -08:00
Jason Ekstrand	9b7e08118b	anv/pipeline: Pass through specialization constants	2016-01-13 15:18:36 -08:00
Jason Ekstrand	c95c3b2c21	nir/spirv: Add initial support for specialization constants	2016-01-13 15:18:36 -08:00
Matt Turner	74cff779eb	nir: Change bfm's semantics to match Intel/AMD/SM5. Intel/AMD's hardware instructions do not handle arguments of 32. Constant evaluation should not produce a result different from the hardware instruction. The s/1ull/1u/ change is intentional: previously we wanted defined behavior for the "1 << 32" case, but we're making this case undefined so we can make it 1u and save ourselves a 64-bit operation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 11:22:40 -08:00
Matt Turner	a5fcff6628	glsl: Fix undefined shifts. Shifting into the sign bit is undefined, as is shifting by 32. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 11:22:11 -08:00
Matt Turner	966a0dd720	glsl: Handle failure of Python codegen scripts. If a Python codegen script failed, it would write a zero-byte file, which on subsequent invocations of make would trick it into thinking the file was appropriately generated. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	84d6130c21	glsl, nir: Make ir_triop_bitfield_extract a vectorized operation. We would like to be able to combine result.x = bitfieldExtract(src0.x, src1.x, src2.x); result.y = bitfieldExtract(src0.y, src1.y, src2.y); result.z = bitfieldExtract(src0.z, src1.z, src2.z); result.w = bitfieldExtract(src0.w, src1.w, src2.w); into a single ivec4 bitfieldExtract operation. This should be possible with most drivers. This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all three operands will be the same, for simplicity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	b4e198f47f	glsl, nir: Make ir_quadop_bitfield_insert a vectorized operation. We would like to be able to combine result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x); result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y); result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z); result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w); into a single ivec4 bitfieldInsert operation. This should be possible with most drivers. This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all four operands will be the same, for simplicity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	b85a229e1f	glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:35:12 -08:00
Matt Turner	92f1773869	nir: Fix constant evaluation of bfm. NIR's bfm, like Intel/AMD's hardware instructions and GLSL IR's ir_binop_bfm takes <bits> as src0 and <offset> as src1. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:35:12 -08:00
Matt Turner	7dc2e5f940	i965/fs: Skip assertion on NaN. A shader in Unreal4 uses the result of divide by zero in its color output, producing NaN and triggering this assertion since NaN is not equal to itself. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93560 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:32:53 -08:00
Matt Turner	64800933b8	i965/fs: Add debugging to constant combining pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:32:53 -08:00
Brian Paul	9638c03a4e	meta: remove const qualifier on _mesa_meta_fb_tex_blit_begin() To silence a compiler warning about a const/non-const mismatch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 08:02:25 -07:00
Brian Paul	235a299534	st/mesa: fix incorrect buffer token passed to _mesa_BindFramebuffer() I added this code right at the end, and got it wrong. Only used by the WGL_ARB_render_texture code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-13 08:01:56 -07:00
Emil Velikov	2065ffb4cf	docs: add news item and link release notes for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-13 15:27:50 +02:00
Emil Velikov	183b5ff109	docs: add sha256 checksums for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `4b2d9f29e9`)	2016-01-13 15:25:32 +02:00
Emil Velikov	8f16739528	docs: add release notes for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `330aa44a0d`)	2016-01-13 15:25:31 +02:00
Neil Roberts	cda886a485	i965/gen9: Don't allow the RGBX formats for texturing/rendering The RGBX surface formats aren't renderable so we internally remap them to RGBA when rendering. They are retained as RGBX when used as textures. However since the previous patch fast clears are disabled for surfaces that use a different format for rendering than for texturing. To avoid this situation we can just pretend not to support RGBX formats at all. This will cause the upper layers of mesa to pick an RGBA format internally instead. This should be safe because we always override the alpha component to 1.0 for RGBX in the texture swizzle anyway. We could also do this for all gens except that it's a bit more difficult when the hardware doesn't support texture swizzling. Gens using the blorp have further problems because that doesn't implement this swizzle override. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-01-13 12:16:31 +00:00
Marek Olšák	4ea0febcb0	radeonsi: move POSITION and FACE fragment shader inputs to system values And FACE becomes integer instead of float. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Marek Olšák	caf3c2abea	radeonsi: simplify gl_FragCoord behavior It will become a system value, not an input. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Samuel Iglesias Gonsálvez	69c4c75264	glsl: add image_format check in cross_validate_globals() Fixes CTS test: ES31-CTS.shader_image_load_store.negative-linkErrors Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93410 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-13 07:01:55 +01:00
Tapani Pälli	e937fd779f	mesa: do not validate io of non-compute and compute stage Fixes regression on SSO tests that have both non-compute and compute programs in a program pipeline. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93532 Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-13 07:31:57 +02:00
Tapani Pälli	6b0706b2aa	glsl: add packed varyings for outputs with single stage program Commit `8926dc8` added a check where we add packed varyings of output stage only when we have multiple stages, however duplicates are already handled by changes in commit `0508d950` and we want to add outputs also in case where we have only one stage. Fixes regression caused by `8926dc8` for following test: ES31-CTS.program_interface_query.separate-programs-vertex Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-13 07:30:46 +02:00
Roland Scheidegger	38cdcb000d	llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts some compiler was unhappy.	2016-01-13 04:48:41 +01:00
Roland Scheidegger	49ec647c3b	llvmpipe: avoid most 64 bit math in rasterization The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + n * dcdx) == sign((c >> FIXED_ORDER) + n * (dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane) this is never a problem. The idea isn't all that revolutionary, in fact something similar was tried ages ago (`9773722c2b`) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly. This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris). This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128). The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:50:57 +01:00
Roland Scheidegger	16530fdc82	llvmpipe: scale up bounding box planes to subpixel precision Otherwise some planes we get in rasterization have subpixel precision, others not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports with subpixel accuracy, so could even do bounding box calcs with that). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:59 +01:00
Roland Scheidegger	0298f5aca7	llvmpipe: add sse code for fixed position calculation This is quite a few less instructions, albeit it still does the 2 64bit muls with scalar c code (they'd need way more shuffles, plus fixup for the signed mul so it totally doesn't seem worth it - x86 can do 32x32->64bit signed scalar muls natively just fine after all (even on 32bit)). (This still doesn't have a very measurable performance impact in reality, although profiler seems to say time spent in setup indeed has gone down by 10% or so overall. Maybe good for a 3% or so improvement in openarena.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:09 +01:00
Roland Scheidegger	9422999e40	draw: fix key comparison with uninitialized value Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants). v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.	2016-01-13 02:43:04 +01:00
Jason Ekstrand	610aa00cdf	nir/spirv: Add support for OpQuantize	2016-01-12 15:36:38 -08:00
Jason Ekstrand	282a837317	i965: Implement nir_op_fquantize2f16	2016-01-12 15:35:00 -08:00
Jason Ekstrand	15a56459d7	nir: Add a fquantize2f16 opcode This opcode simply takes a 32-bit floating-point value and reduces its effective precision to 16 bits.	2016-01-12 15:33:02 -08:00
Timothy Arceri	6143e2d651	mesa: print the invalid enum when CreateShader fails Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-13 09:46:56 +11:00
Jason Ekstrand	aee970c844	anv/device: Bump the max program size again No one will ever need more than 128K, right?	2016-01-12 13:49:05 -08:00
Kenneth Graunke	c034dbeda8	glsl: Make read_from_write_only_variable_visitor ignore .length(). .length() on an unsized SSBO variable doesn't actually read any data from the SSBO, and is allowed on variables marked 'writeonly'. Fixes compute shader compilation in Shadow of Mordor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-12 12:20:02 -08:00
Kenneth Graunke	9095847c25	i965: Mark TCS URB writes as having side effects. This adds barrier dependencies around TCS_OPCODE_URB_WRITE, preventing reads and writes from being incorrectly scheduled. Fixes rendering in GFXBench 4.0's tessellation demo. For some reason, we haven't ever listed URB writes as having side-effects. This hasn't been a problem because in most stages, we never read from the URB, and only write to each location once. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93526 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-12 12:19:47 -08:00
Kristian Høgsberg Kristensen	d7a193327b	vk: Implement workaround for occlusion queries We have an issue with occlusion queries (PIPE_CONTROL depth writes) after using the pipeline with the VS disabled. We work around it by using a depth cache flush PIPE_CONTROL before doing a depth write. Fixes dEQP-VK.query_pool.*	2016-01-12 11:50:36 -08:00
Jason Ekstrand	6fc278ae4f	anv/UpdateDescriptorSets: Respect write.dstArrayElement	2016-01-12 11:45:12 -08:00
Kristian Høgsberg Kristensen	af422fe9b3	Merge ../mesa into vulkan Merge master again to get the brw_device_info with the correct slice counts for KBL.	2016-01-12 10:54:26 -08:00
Kristian Høgsberg Kristensen	7df20f0c14	vk: Support SpvBuiltInViewportIndex	2016-01-12 10:53:59 -08:00
Kristian Høgsberg Kristensen	2b4bacb84b	vk: Use the correct stride for CC_VIEWPORT structs	2016-01-12 10:53:59 -08:00
Tom St Denis	56fc2986d5	st/omx: Avoid segfault in deconstructor if constructor fails If the constructor fails before the LIST_INIT calls the pointers will be null and the deconstructor will segfault. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-01-12 19:13:19 +01:00
Christian König	6f898f740c	vl: use preferred format for deinterlacing Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:42 +01:00
Christian König	5fdd4a5aef	vl: improve motion adaptive deinterlacer Handle other formats than YV12 as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:39 +01:00
Christian König	e945235aed	st/va: add BOB deinterlacing v2 Tested with MPV. v2: correctly handle compositor deinterlacing as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:35 +01:00
Christian König	3949cf0e02	st/va: add NV12 -> NV12 post processing v2 Useful for mpv and GStreamer. v2: use common functionality for size adjustment. Signed-off-by: Indrajit-kumar Das <Indrajit-kumar.Das@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:28 +01:00
Christian König	9f644295dc	st/va: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:24 +01:00
Christian König	da39637764	st/vdpau: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:21 +01:00
Christian König	52ca9a9b8b	vl/buffers: extract vl_video_buffer_adjust_size helper Useful for the state trackers as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:16 +01:00
Christian König	8479782361	st/va: make the implementation thread safe v2 Otherwise we might crash with MPV. v2: minor cleanups suggested on the list. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-01-12 13:26:24 +01:00
Jason Ekstrand	62e56492c3	nir/spirv: Allow non-block variables with interface types in lists The original objective was to disallow UBO and SSBO variables from the variable lists. This was accidentally broken in `b208620fd` when fixing some other interface issues.	2016-01-12 01:32:19 -08:00
Jason Ekstrand	4141d13de5	nir/spirv: Handle matrix decorations on arrays of matrices Connor's original shallow-copy plan works great except that a couple of the decorations apply to a matrix which may be some levels down in an array. We weren't properly unpacking that. This fixes most of the remaining SSBO and UBO layout tests.	2016-01-12 01:04:44 -08:00
Tapani Pälli	8926dc87af	mesa: use gl_shader_variable in program resource list Patch changes linker to allocate gl_shader_variable instead of using ir_variable. This makes it possible to get rid of ir_variables and ir in memory after linking. v2: check that we do not create duplicate entries with packed varyings v3: document 'patch' bit (Ilia Mirkin) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-12 09:07:10 +02:00
Tapani Pälli	4985159ad6	glsl: track total amount of uniform locations used Linker missed a check for situation where we exceed max amount of uniform locations with explicit + implicit locations. Patch adds this check to already existing iteration over uniforms in linker. Fixes following CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-negative-link-max-num-of-locations v2: use var->type->uniform_locations() (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 07:52:44 +02:00
Jason Ekstrand	b208620fd2	nir/spirv: Allow creating local/global variables from interface types Not sure if this is actually allowed, but it's not that hard to just strip the interface information from the type.	2016-01-11 17:45:54 -08:00
Jason Ekstrand	350bbd3d15	nir/spirv: Allow base derefs in get_vulkan_resource_index	2016-01-11 17:45:24 -08:00
Jason Ekstrand	1c5393d57d	nir/spirv: Allow OpBranchConditional without a merge This can happen if you have a predicated break/continue.	2016-01-11 17:03:52 -08:00
Jason Ekstrand	24523e98a4	nir/spirv/cfg: Allow breaking from the continue block	2016-01-11 17:03:16 -08:00
Jason Ekstrand	c381906bbd	nir/spirv: Stop wrapping carry/borrow in b2i The upstream versions now return an integer like GLSL/SPIR-V want.	2016-01-11 17:02:30 -08:00
Jason Ekstrand	dee09d7393	nir/spirv: Better handle OpCopyMemory	2016-01-11 16:29:38 -08:00
Jason Ekstrand	1ca97cefb0	nir/spirv: Add no-op support for OpSourceContinued	2016-01-11 16:06:11 -08:00
Erik Faye-Lund	395b53dad6	main: get rid of needless conditional We already check if the driver changed the completeness, we don't need to duplicate that check. Let's just early out there instead. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 11:02:16 +11:00
Erik Faye-Lund	2a15dc0dd5	gallium/util: removed unused header-file This hasn't been in use since `c476305` ("gallium/util: pregenerate half float tables"), where the last bit of run-time init using this was killed. So let's just get rid of the pointless header. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 11:02:08 +11:00
Samuel Pitoiset	e67f5cac79	nvc0: do not force re-binding of compute constbufs on Fermi Re-binding compute constant buffers after launching a grid has no effect because they are not currently validated and because dirty_cp is not updated accordingly. This might also prevent weird future behaviours when UBOs will be bound for compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:47:20 +01:00
Ian Romanick	5be700e5cc	meta: Unconditionally set GL_SKIP_DECODE_EXT The path that depends on this will be avoided (by fallback_required) if the extension is not supported. _mesa_set_sampler_srgb_decode does not generate GL errors (by design), so there are no problems there. I kept this change separate and last because it is one of the few in the series that is not a candidate for the stable branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	1799eddb51	meta: Only bind the sampler in one place All of the calls after the first _mesa_bind_sampler call are DSA style calls that don't depend on the current binding. I kept this change separate and last because it is one of the few in the series that is not a candidate for the stable branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	ae50157363	meta/decompress: Don't pollute the sampler object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	b03ee127d8	meta/decompress: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	d4094f64c1	meta/decompress: Track sampler using gl_sampler_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	1998af813a	meta/decompress: Use internal functions for sampler object access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	b85c5fe526	meta/generate_mipmap: Don't pollute the sampler object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	d6782712a1	meta/generate_mipmap: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	36f413209f	meta/generate_mipmap: Track sampler using gl_sampler_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	b94e7f398d	meta/generate_mipmap: Use internal functions for sampler object access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	963065b76c	meta/blit: Don't pollute the sampler object namespace in _mesa_meta_setup_sampler tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	533320e4d1	meta/blit: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. v2: Add a comment explaining why samp_obj_save is set to NULL in _mesa_meta_fb_tex_blit_begin. This came out of review feedback from Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	d796b491cc	meta/blit: Use internal functions for sampler object access This requires tracking the sampler object using the gl_sampler_object* instead of the object name. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	ad5b1b41ae	meta/blit: Group the SamplerParameteri calls with the other sampler operations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	adb4b31bc3	mesa: Refactor _mesa_BindSampler to make _mesa_bind_sampler Pulls the parts of _mesa_BindSampler that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	4cf5c85ec7	mesa: Add _mesa_set_sampler_srgb_decode method Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	ecba76d3c0	mesa: Add _mesa_set_sampler_filters method v2: Add filter enum assertions. Suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	08822b4b43	mesa: Add _mesa_set_sampler_wrap method Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Jason Ekstrand	bb5882e6af	nir/spirv/cfg: Handle unreachable instructions	2016-01-11 15:35:15 -08:00
Jason Ekstrand	fc3f659aa9	nir/vars_to_ssa: Add phi sources for unreachable predecessors It is possible to end up with unreachable blocks if, for instance, you have an "if (...) { break; } else { continue; } unreachable()". In this case, the unreachable block does not show up in the dominance tree so it never gets visited. Instead, we go and visit all of those in follow-on pass.	2016-01-11 15:33:44 -08:00
Samuel Pitoiset	3029d60de7	nvc0: remove useless goto in nvc0_launch_grid() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:19:34 +01:00
Ian Romanick	5318bd351e	mesa: Mark Identity as const I was going to send this as review for `dce1e1a8`, but I missed that window. This saves 64 bytes of unshared data and replaces it with 96 bytes shared text. My guess is that some of the calls to memcpy get optimized to something else. text data bss dec hex filename 7847613 220208 27432 8095253 7b8615 i965_dri.so before 7847709 220144 27432 8095285 7b8635 i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Brian Paul <brianp@vmware.com>	2016-01-11 14:34:38 -08:00
Jason Ekstrand	c974b94578	nir/spirv: Properly handle OpConstantNull	2016-01-11 14:30:46 -08:00
Jason Ekstrand	96683065f2	nir/spirv: Assert that matrix types are valid	2016-01-11 14:30:46 -08:00
Jason Ekstrand	d032ede26f	nir/types: Add an is_error helper	2016-01-11 14:30:46 -08:00
Jason Ekstrand	17cfafd83a	nir/spirv: Handle OpNoLine	2016-01-11 14:30:46 -08:00
Chad Versace	52d4af6a3c	anv/gen7: Remove unheeded helper begin_render_pass() The helper didn't help much. It looks like a leftover from past code-reuse. Now it's called from exactly one location, gen7_CmdBeginRenderPass(). So fold it into its caller.	2016-01-11 14:08:30 -08:00
Oded Gabbay	647d8e95d1	configure.ac: always define __STDC_CONSTANT_MACROS The ISO C99 standard (7.18.4) specifies that C++ implementations should define UINT64_C only when __STDC_CONSTANT_MACROS is defined. Because we now use UINT64_C in our cpp files (since commit `208bfc493d`), we need to add this define. This also solves compilation errors with GCC 4.8.x on ppc64le machines. v2: add this define to SCons build system Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-11 23:28:23 +02:00
Kenneth Graunke	aa6aa39a8f	i965: Upload 3DSTATE_BINDING_TABLE_POINTERS_HS when !TCS on Gen9+. Gen9+ requires us to emit 3DSTATE_BINDING_TABLE_POINTERS_HS for the hull shader push constants to take effect. The passthrough TCS uses push constants for the default tessellation levels. So, when those change, we need to re-upload the binding table as well. Fixes five Piglit tests on Skylake: - spec/arb_tessellation_shader/vs-tes-vertex - spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-quads - spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-tris - spec/arb_tessellation_shader/tes-read-texture - spec/arb_tessellation_shader/tess_with_geometry Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-11 12:10:00 -08:00
Mark Janes	f2c8913536	Add missing platform information for KBL In testing KBL, I found: - urb size was not set for slices gt1.5, gt2, and gt3. The value I used for these slices (384) was taken from an earlier patch authored by Ben Widawsky. - slice count was missing. This field was added by `a403ad4f5a` With this commit, KBL passes piglit at parity with SKL. Note: As requested by Kristian, Sarah modified this patch to drop setting urb size for gt1.5, gt2, and gt3, since the correct default is set in the GEN9 macro by commit `c1e38ad370` "i965/skl: Use larger URB size where available." Signed-off-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2016-01-11 11:24:20 -08:00
Jason Ekstrand	790565b06e	anv/pipeline: Handle output lowering in anv_pipeline instead of spirv_to_nir While we're at it, we delete any unused variables. This allows us to prune variables that are not used in the current stage from the shader.	2016-01-11 11:06:06 -08:00
Jason Ekstrand	b8ec48ee76	anv/pipeline: Only delete functions for SPIR-V shaders We can assume that direct NIR shaders only have one entrypoint	2016-01-11 11:06:06 -08:00
Jason Ekstrand	30883adfb8	nir/spirv: Get rid of a bunch of stage asserts Since we may have multiple entrypoints from different stages, we don't know what stage we are actually in so these asserts are invalid.	2016-01-11 11:06:06 -08:00
Jason Ekstrand	9f4ba499d1	nir/spirv: Take an entrypoint stage as well as a name	2016-01-11 11:06:06 -08:00
Jason Ekstrand	83bf1f752d	nir/dead_variables: Add a mode parameter This allows dead_variables to be used on any type of variable.	2016-01-11 11:06:06 -08:00
Ilia Mirkin	f21df5c513	nv50/ir: the whole point of data array is to hand out regular registers Fixes: `0d3051f75a` (nv50/ir: Fix scratch allocation size and file) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-11 13:01:11 -05:00
Dave Airlie	a9eace326e	mesa/uniform_query: add IROUNDD and use for doubles->ints (v2) For the case where we convert a double to an int, we should round the same as we do for floats. This fixes GL41-CTS.gpu_shader_fp64.state_query v2: add IROUNDD (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-11 02:27:51 +00:00
Timothy Arceri	124c9c2b97	glsl: replace unreachable code path with assert The lower_named_interface_blocks() pass is called before we try assign locations to varyings so this shouldn't be reachable. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:24:05 +11:00
Timothy Arceri	cf757f48ea	Revert "glsl: replace unreachable code path with assert" This reverts commit `98270fd20d`. Something went terribly wrong the commit is not what the commit message says.	2016-01-11 09:20:39 +11:00
Timothy Arceri	98270fd20d	glsl: replace unreachable code path with assert The lower_named_interface_blocks() pass is called before we try assign locations to varyings so this shouldn't be reachable. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:18:51 +11:00
Timothy Arceri	e4c5ace6a9	glsl: combine if blocks Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:18:45 +11:00
Rhys Kidd	7b4f8c827d	mesa: Update todo regarding StencilOp and StencilOpSeparate. OpenGL 2.0 function StencilOp() is in part internally implemented via StencilOpSeparate(). This change happened some time ago, however the accompanying doxygen todo comment was not accordingly updated. Replace the outdated portion of this doxygen todo comment, leaving the remainder unchanged. Also better respect the 80 character suggested line length in this file. v2: Fully remove comment, following code review by t_arceri@yahoo.com.au Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-11 09:10:17 +11:00
Kenneth Graunke	5e3edd4b28	glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable. Currently, opt_vectorize() tries to combine: result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x); result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y); result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z); result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w); into a single ir_quadop_bitfield_insert opcode, which operates on ivec4s. However, GLSL IR's opcodes currently require the bits and offset parameters to be scalar integers. So, this breaks. We want to be able to vectorize this eventually, but for now, just chicken out and make opt_vectorize() bail by marking all the bitfield insert/extract related opcodes as horizontal. This is a relatively uncommon case today, so we'll do the simple fix for stable branches, and fix it properly on master. Fixes assertion failures when compiling Shadow of Mordor vertex shaders on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-09 15:46:37 -08:00
Pierre Moreau	0d3051f75a	nv50/ir: Fix scratch allocation size and file Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-09 12:58:21 -05:00
Kristian Høgsberg Kristensen	a9c0e8f00f	vk: Handle uninitialized FS inputs and gl_PrimitiveID These show up as varying_to_slot[attr] == -1. Instead of storing -1 - 2 in swiz.Attribute[input_index].SourceAttribute, handle it correctly.	2016-01-09 01:03:20 -08:00
Kristian Høgsberg Kristensen	b538ec5409	vk: Support resetting timestamp query pools	2016-01-09 00:51:50 -08:00
Kristian Høgsberg Kristensen	925ad84700	vk: Advertise number of timestamp bits We have 36 bits.	2016-01-09 00:51:14 -08:00
Kristian Høgsberg Kristensen	dae800daa8	vk: Expose correct timestampPeriod for SKL Skylake uses 83.333ns per tick.	2016-01-09 00:50:04 -08:00
Kristian Høgsberg Kristensen	ec8e261208	vk: Mark VkEvent and VkSemaphore as done	2016-01-09 00:48:41 -08:00
Kristian Høgsberg Kristensen	bbb2a85c81	vk: Assert on use of uninitialized surface state This exposes a case where we want to anv_CmdCopyBufferToImage() on an image that wasn't created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT and end up using uninitialized color_rt_surface_state from the meta image view.	2016-01-08 23:51:11 -08:00
Kristian Høgsberg Kristensen	a8cdef3dce	vk: Only begin subpass if we're continuing a render pass If VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT is not set in pBeginInfo->flags, we don't have a render pass or framebuffer. Change the condition that guard looking up render pass and framebuffer to test for VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT instead of VK_COMMAND_BUFFER_LEVEL_SECONDARY. Fixes all remaining crashes in dEQP-VK.api.command_buffers.*.	2016-01-08 23:02:46 -08:00
Kristian Høgsberg Kristensen	7c5e1fd998	vk: Remove unsupported warnings for Skylake and Broxton These are working as well as Broadwell and Cherryview. The recent merge from mesa master brings in Kabylake device info and that should be all we need to enable that.	2016-01-08 22:29:06 -08:00
Kristian Høgsberg Kristensen	f0993f81c7	Merge ../mesa into vulkan	2016-01-08 22:16:43 -08:00
Jason Ekstrand	cfdc955fd5	anv/reloc_list: Make valgrind explicitly check relocation data	2016-01-08 16:44:54 -08:00
Nicolai Hähnle	da5d4583e5	mesa: merge bind_atomic_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:38 -05:00
Nicolai Hähnle	5eb104d6ab	mesa: merge bind_shader_storage_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:38 -05:00
Nicolai Hähnle	e8dd7cc303	mesa: merge bind_uniform_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:37 -05:00
Nicolai Hähnle	b3ca26cded	mesa: merge bind_xfb_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:37 -05:00
Kristian Høgsberg Kristensen	81f7fd3c54	glsl: Don't add nir files to libglsl_la_SOURCES SCons doesn't understand nir yet and doesn't want to compile the glsl to nir pass. Move the files to their own variable so we can add it only for automake. Tested-by: Brian Paul <brianp@vmware.com>	2016-01-08 16:15:49 -08:00
Jason Ekstrand	7a1c4a0ccc	nir/spirv: Add matrix determinants and inverses	2016-01-08 16:02:30 -08:00
Ilia Mirkin	e3706a7118	nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 17:40:52 -05:00
Kristian Høgsberg Kristensen	82ad571abf	glsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c These are used by code that doesn't necessarily link to libglsl.la. Move them to shader_enums.[ch] where we keep similar helpers. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:20 -08:00
Kristian Høgsberg Kristensen	1d25ef6ae7	i965: Move GLSL lowering passes out of libi965_compiler.la The scope of libi965_compiler.la is to be able to take nir shaders and generate i965 EU code. As such, we don't want the GLSL IR lowering passes in the library. With this change, libi965_compiler.la no longer needs to link to libglsl.la. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:16 -08:00
Kristian Høgsberg Kristensen	e97caba1f6	glsl: Move glsl_to_nir files to LIBGLSL_FILES libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for libglsl.la consumers, this is a no-op. libnir.la however no longer uses any GLSL IR infrastructure and can be used without also linking to libglsl.la. Acked-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:12 -08:00
Jordan Justen	1d54ac6c9f	mesa: Use separate indices for UBO & SSBO during binding Previously we were treating the binding index for Uniform Buffer Objects and Shader Storage Buffer Objects as being part of the combined BufferInterfaceBlocks array. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93322 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-08 13:11:31 -08:00
Jordan Justen	cf66a8ffb7	mesa: Map program UBOs and SSBOs to Interface Blocks v2: * Fill UboInterfaceBlockIndex and SsboInterfaceBlockIndex in split_ubos_and_ssbos (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-08 13:10:28 -08:00
Jordan Justen	c7f6e42a7d	anv: Increase dynamic pool block size from 2k to 16k This is needed because compute push constant data is replicated per invocation. For gen7, this can be up to 64. With a push constant data max of 128 bytes, this is 8k of data. We need additional space for local-id payloads, so we are going with 16k for now. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-08 13:03:30 -08:00
Jason Ekstrand	4e15d26e47	nir/spirv: Fix a small bug in row-major matrix loading	2016-01-08 12:27:25 -08:00
Sarah Sharp	5d349fab46	mesa: docs: Add link to planet.freedesktop.org The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or any of the graphics project wikis (including the DRI wiki) on freedesktop.org. Fix that by linking to it from the sidebar. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 12:18:12 -08:00
Ilia Mirkin	dff1caccac	freedreno: add ir3_compiler to gitignore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 15:16:37 -05:00
Ilia Mirkin	90ba06618e	gallium: add a RESQ opcode to query info about a resource Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	ebfb5446c7	gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	266d001261	gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	8cb493acc7	tgsi: update atomic op docs Specify that the operation only applies to the x component, not per-component as previously specified. This is unnecessary for GL and creates additional complications for images which need to support these operations as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	bdef02ff26	tgsi: add a is_store property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	50b8488926	tgsi: provide a way to encode memory qualifiers for SSBO Each load/store on most hardware can specify what caching to do. Since SSBO allows individual variables to also have separate caching modes, allow loads/stores to have the qualifiers instead of attempting to encode them in declarations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	888ddd632d	ureg: add buffer support to ureg Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	8cc9a8aa2a	tgsi: add ureg support for image decls Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Jose Fonseca	208bfc493d	glsl: Ensure 64bits shift is used. I believe that `1u << x`, where x >= 32 yields undefined results according to the C standard. Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)`. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:59 +00:00
Jose Fonseca	e378184d9c	mesa/main: Avoid `void function returning a value` warning. Trivial. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:59 +00:00
Oded Gabbay	6613042c4e	configure.ac: add --enable-profile For profiling mesa's code, especially llvmpipe, PROFILE should be defined. Currently, this define can only be generated if mesa is built using scons. This patch makes it possible to generate this define also when building mesa through automake tools. v2: - Change --enable-llvmpipe-profile to --enable-profile - Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 21:59:47 +02:00
Jason Ekstrand	fe2f44f2a4	nir/spirv: Use create_ssa_value for block_load_store	2016-01-08 11:50:34 -08:00
Jason Ekstrand	8b9dfb4b6d	nir/spirv: Add real support for outer products	2016-01-08 11:38:59 -08:00
Jason Ekstrand	927ef0ea4e	nir/spirv: Add support for add, subtract, and negate on matrices	2016-01-08 11:26:43 -08:00
Jason Ekstrand	393562f47b	nir/spirv: Split ALU operations out into their own file	2016-01-08 11:26:43 -08:00
Marek Olšák	1e463d20ba	nine: allow fragment shader POSITION and FACE to be system values Reported-by: Axel Davy <axel.davy@ens.fr>	2016-01-08 20:07:16 +01:00
Marek Olšák	d0cf66d835	vl: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	69f43c2cc9	util/pstipple: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	8a13ce14fd	st/mesa: add support for POSITION and FACE system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	c00e534283	tgsi/scan: update for POSITION and FACE system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	34738a92de	gallium: add caps for POSITION and FACE system values v2: document the integer behavior Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	24737f2298	program: add a helper for rewriting FP position input to sysval Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:23 +01:00
Marek Olšák	4191c1a57c	glsl: optionally declare gl_FragCoord & gl_FrontFacing as system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:23 +01:00
Marek Olšák	c07cf5f5a9	tgsi/ureg: handle redundant declarations in ureg_DECL_system_value Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Marek Olšák	c886422656	tgsi/ureg: remove index parameter from ureg_DECL_system_value It can be trivially derived from the number of already declared system values. This allows ureg users not to worry about which index to choose. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Marek Olšák	91e8f2b0a5	st/mesa: remove dead code from mesa_to_tgsi These aren't part of ARB_fragment_program. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Edward O'Callaghan	cb513485a0	radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:36 -05:00
Edward O'Callaghan	b42254eff3	gallium/aux: Use TGSI chan name defines in place of literals Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:24 -05:00
Nicolai Hähnle	d6db7ceedf	mesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4 The piglit copyteximage check has recently been augmented to test this, but apparently it hasn't been fixed in Mesa so far. This language also already appears in the OpenGL 2.1 spec (Ian). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 10:58:27 -05:00
Jason Ekstrand	72bff62e7f	nir/spirv: Add support for SSBO atomics	2016-01-07 22:13:46 -08:00
Jason Ekstrand	fe57ad62a6	nir/spirv: Rework UBOs and SSBOs This completely reworks all block load/store operations. In particular, it should get row-major matrices working.	2016-01-07 22:13:46 -08:00
Chad Versace	1818463733	anv/gen9: Fix cube surface state For gen9 SURFTYPE_CUBE, the RENDER_SURFACE_STATE's Depth, MinimumArrayElement, and RenderTargetViewExtent are in units of full cubes and so must be divided by 6. Fixes 'dEQP-VK.pipeline.image.view_type.cube_array.cube_array.'. Now all of 'dEQP-VK.pipeline.image.' passes.	2016-01-07 17:20:25 -08:00
Chad Versace	24d82a3f79	anv/gen8: Refactor genX_image_view_init() Drop the temporary variables for RENDER_SURFACE_STATE's Depth and RenderTargetViewExtent. Instead, assign them in-place. This simplifies the next commit, which fixes gen9 cube surfaces.	2016-01-07 17:20:25 -08:00
Kristian Høgsberg Kristensen	1b1dca75a4	vk: Make sure we emit binding table pointers after push constants SKL needs this to make sure we flush the push constants. It gets a little tricky, since we also need to emit binding tables before push constants, since that may affect the push constants (dynamic buffer offsets and storage image parameters). This patch splits emitting binding tables from emitting the pointers so that we can emit push constants after binding tables but before emitting binding table pointers.	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	a18b5e642c	vk: Implement VK_QUERY_RESULT_WITH_AVAILABILITY_BIT	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	bbf3fc815b	vk: Add missing DepthStallEnable to OQ pipe control	2016-01-07 16:31:57 -08:00
Kristian Høgsberg Kristensen	067dbd7a17	vk: Issue PIPELINE_SELECT before setting up render pass We need to make sure we've selected the 3D pipeline before we start setting up depth and stencil buffers.	2016-01-07 16:31:57 -08:00
Jordan Justen	d24e88b98e	anv/gen7: Setup state to enable barrier() function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 17:11:46 -08:00
Jordan Justen	36a2304686	anv/gen8: Setup state to enable barrier() function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 17:11:46 -08:00
Jason Ekstrand	040e314143	i965/compiler: Enable more lowering in NIR We don't need these for GLSL or ARB, but we need them for SPIR-V Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:42 -08:00
Jason Ekstrand	d00abcc283	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:38 -08:00
Jason Ekstrand	b0d4ee520e	nir/opcodes: Fix up uadd_carry and usub_borrow Both were defined as returning bool but the gpu_shader5 functions are defined to return int. Also, we had the parameters for usub borrow backwards in the folding expression. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:25 -08:00
Ilia Mirkin	67b31b3c59	nvc0: add ARB_indirect_parameters support I chose to make separate macros for this due to the additional complexity and extra scratch usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9a54ccf30a	st/mesa: expose ARB_indirect_parameters when the backend driver allows Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	e1eab5a76f	mesa: add support for ARB_indirect_parameters draw functions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9327e2d312	mesa: add parameter buffer, used for ARB_indirect_parameters Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	b3e2c21fe5	glapi: add ARB_indirect_parameters definitions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	7ca67c752b	nvc0: add support for real ARB_multi_draw_indirect The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d3e43baffe	nvc0: adjust indirect draw macros to handle multiple draws at once These are still invoked one at a time, but the underlying macro can handle multiple draws. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	2860f20859	st/mesa: add support for new mesa indirect draw interface This shifts all indirect draws to go through the new function. If the driver doesn't have support for multi draws, we break those up and perform N draws. Otherwise, we pass everything through for just a single draw call. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d67b9ba9a1	gallium: add caps to expose support for multi indirect draws Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	3e11656694	gallium: add sufficient draw interface to allow new indirect features This makes it possible to support indirect multidraws as well as having the number of such draws to come from a separate GPU resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	60d0cfd429	vbo: create a new draw function interface for indirect draws All indirect draws are passed to the new draw function. By default there's a fallback implementation which pipes it right back to draw_prims, but eventually both the fallback and draw_prim's support for indirect drawing should be removed. This should allow a backend to properly support ARB_multi_draw_indirect and ARB_indirect_parameters. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 18:38:45 -05:00
Roland Scheidegger	2923c7a0ed	llvmpipe: do 64bit plane calculations in the sse path The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	fad283ba9e	llvmpipe: don't store eo as 64bit int eo, just like dcdx and dcdy, cannot overflow 32bit. Store it as unsigned though just in case (it cannot be negative, but in theory twice as big as dcdx or dcdy so this gives it one more bit). This doesn't really change anything, albeit it might help minimally on 32bit archs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	b61b9a377e	llvmpipe: use aligned data for the assembly program in setup Back in the day (before `24678700ed`) the values were not actually in a struct but even then I can't see why we didn't simply align the values. Especially since it's trivial to do so. (Not that it actually matters since the code is pretty much unused for now.) Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	9db7309595	draw: initialize prim header flags when clipping lines Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	64da11f052	draw: fix line stippling with unfilled prims The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:13 +01:00
Timothy Arceri	5cf156c6b4	glsl: replace null check with assert This was added in `54f583a20`; since then, error handling has improved. The test this was added to fix now fails earlier since `01822706ec`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 09:12:45 +11:00
Nicolai Hähnle	051603efd5	i965: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:12 -05:00
Nicolai Hähnle	1b74c02e83	i915: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:09 -05:00
Nicolai Hähnle	8882b46226	radeon: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:03 -05:00
Nicolai Hähnle	1c2187b1c2	st/mesa: use _mesa_delete_buffer_object This is more future-proof than the current code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-07 17:06:58 -05:00
Nicolai Hähnle	6aed083b93	mesa/bufferobj: make _mesa_delete_buffer_object externally accessible gl_buffer_object has grown more complicated and requires cleanup. Using this function from drivers will be more future-proof. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:05:54 -05:00
Chad Versace	4c7f4c25d0	anv/meta: Fix hardcoded format size in anv_CmdCopy* When looping through VkBufferImageCopy regions, for each region we incremented the offset into the VkBuffer assuming the format size was 4. Fixes CTS tests dEQP-VK.pipeline.image.view_type.cube_array.3d.* on Skylake.	2016-01-07 13:56:58 -08:00
Oded Gabbay	f41b6cfb07	llvmpipe: use sse2 conv code for altivec In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increases the FPS on POWER arch across the board by 10%-25%. I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-07 22:07:02 +02:00
Chad Versace	a50c78a5cf	isl: Add missing break statement in array pitch calculation Fixes regression in ed98c374bd3f1952fbab3031afaf5ff4d178ef41.	2016-01-07 11:08:12 -08:00
Chad Versace	d1e6c1b29b	isl/gen9: Fix array pitch of 3d surfaces For tiled 3D surfaces, the array pitch must be aligned to the tile height. From the Skylake BSpec >> RENDER_SURFACE_STATE >> Surface QPitch: Tile Mode != Linear: This field must be set to an integer multiple of the tile height Fixes CTS tests 'dEQP-VK.pipeline.image.view_type.3d.format.r8g8b8a8_unorm.'. Fixes Crucible tests 'func.miptree.r8g8b8a8-unorm.aspect-color.view-3d.'.	2016-01-07 11:04:17 -08:00
Chad Versace	0af77fe5b6	isl: Refactor func isl_calc_array_pitch_sa_rows Update the function to calculate the array pitch in element rows, and rename it accordingly to isl_calc_array_pitch_el_rows.	2016-01-07 11:04:17 -08:00
Jordan Justen	2f0a10149c	isl: Assert that alignments are not 0 for isl_align Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Jordan Justen	4d68c477ad	anv: Assert that alignments are not 0 for align_* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Jordan Justen	be91f23e3b	isl: Fix image alignment calculation The previous code was resulting in an alignment of 0. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-07 10:37:35 -08:00
Marek Olšák	bca18057a3	radeonsi: adjust the parameters of si_shader_dump The function will be extended to dump all binaries shaders will consist of, so si_shader* makes sense here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0a51b010e5	radeonsi: move si_shader_dump call out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	b0df5f4c19	radeonsi: inline si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	c9c031f3d0	radeonsi: move si_shader_dump call out of si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f8b34fe093	radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats Eventually, I'd like to dump stats for several combined binaries, which is why you don't see a binary parameter in si_shader_dump_stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	ccd7d7e13d	radeonsi: add si_shader_destroy_binary Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5c9f104567	radeonsi: don't pass si_shader to si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	54ed83669e	radeonsi: move si_shader_binary_upload out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f20a76a4fd	radeonsi: always keep shader code, rodata, and relocs in memory We won't compile shaders in draw calls, but we will concatenate shader binaries according to states in draw calls, so keep the binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	63345cfc3a	radeonsi: don't pass si_shader to si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	2d3a96448a	radeonsi: don't pass si_shader to si_shader_binary_read_config Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	20b9b5d7f5	radeonsi: add struct si_shader_config There will be 1 config per variant, which will be a union of configs from {prolog, main, epilog}. For now, just add the structure. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	890873d106	radeonsi: move NULL exporting into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	a72ed2f6bc	radeonsi: move MRT color exporting into a separate function This will be used by a fragment shader epilog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0ffe3d3772	radeonsi: use EXP_NULL for pixel shaders without outputs This never happens currently. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	677c65968b	radeonsi: only use LLVMBuildLoad once when updating color outputs at the end without LLVMBuildStore. So: - do LLVMBuildLoad - update the values as necessary - export Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	185267a6fd	radeonsi: export "undef" values for undefined PS outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	1ce659f820	radeonsi: move MRTZ export into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5f3e6b5b0f	radeonsi: simplify setting the DONE bit for PS exports First find out what the last export is and simply set the DONE bit there. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	e00f3f23b1	radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	4e597c25c7	radeonsi: write all MRTs only if there is exactly one output This doesn't fix a known bug, but better safe than sorry. Also, simplify the expression in si_shader.c. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	746a7a7498	radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	2cb8bf90cd	radeonsi: determine DB_SHADER_CONTROL outside of shader compilation because the API pixel shader binary will not emulate alpha test one day, so the KILL_ENABLE bit must be determined elsewhere. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	ff7e77724e	tgsi/scan: set which color components are read by a fragment shader This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	18ec76730a	tgsi/scan: fix tgsi_shader_info::reads_z This has no users in Mesa. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	f3658be108	tgsi/scan: set if a fragment shader writes sample mask This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Kenneth Graunke	3e8f644ed3	glsl: Disallow vectorization of vector_insert/extract. vector_insert takes a vector, a scalar location, and a scalar value, and produces a new vector with that component updated. As such, it can't be vectorized properly. vector_extract takes a vector and a scalar location, and returns that scalar component of the vector. Vectorization doesn't really make any sense. Treating both as horizontal operations makes sure the vectorizer won't try to touch these. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-06 21:22:06 -08:00
Jason Ekstrand	d8cd5e333e	anv/state: Pull sampler vk-to-gen maps into genX_state_util.h	2016-01-06 19:53:45 -08:00
Jason Ekstrand	195c60deb4	nir/spirv: Wrap borrow/carry ops in b2i NIR specifies them as booleans but SPIR-V wants ints.	2016-01-06 17:13:06 -08:00
Jason Ekstrand	000eb00862	nir/spirv/cfg: Only set fall to true at the start of a case Previously, we were setting it to true at the top of the switch statement. However, this causes all of the cases to get executed until you hit a break. Instead, you want to be not executing at the start, start executing when you hit your case, and end at a break.	2016-01-06 17:00:55 -08:00
Roland Scheidegger	8d4039ecdb	softpipe: tell draw about the vertex layout we want This makes it more similar to llvmpipe. It also allows us to let draw emit code handle things like getting zeros for non-existing vs outputs automatically. There probably isn't really any overhead either way, there isn't really any "simply copy everything" code in the emit path it would copy each attrib individually just the same. Likewise, we still do another mapping step in softpipe as the layout may still not match exactly (same as in llvmpipe, should probably nuke the pointless mapping in both drivers). This fixes the piglit arb_fragment_layer_viewport no_gs/no_write tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 02:00:04 +01:00
Roland Scheidegger	8e3a76791f	llvmpipe: use ints not unsigned for slots They can't actually be 0 (as position is there) but should avoid confusion. This was supposed to have been done by `af7ba989fb` but I accidentally pushed an older version of the patch in the end... Also prettify slightly. And make some notes about the confusing and useless fs input "map". Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:59:17 +01:00
Roland Scheidegger	2dbc20e456	draw: nuke the interp parameter from vertex_info draw emit couldn't care less what the interpolation mode is... This somehow looked like it would matter, all drivers more or less dutifully filled that in correctly. But this is only used for emit, if draw needs to know about interpolation mode (for clipping for instance) it will get that information from the vs anyway. softpipe actually used to depend on that interpolation parameter, as it abused that structure quite a bit but no longer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:58:05 +01:00
Roland Scheidegger	892e2d1395	softpipe: don't abuse the draw vertex_info struct for something different softpipe would calculate two "vertex layouts". The second one was however just used for internal purposes, draw would know nothing about it even though it looked exactly the same as the other one we tell draw about. So, store that information separately as this was just confusing. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:57:21 +01:00
Roland Scheidegger	b64d008052	softpipe: fix mapping of "special" vs outputs Unlike llvmpipe, softpipe always tells draw to emit the vertices as-is. The two vertex layouts it calculates are a bit confusing, one which is just used to tell draw to emit vertices as-is, and the other which has draw written all over it but draw is completely unaware of and is used only to look up the correct interpolation info later in setup. Thus, the slots used are different to what llvmpipe does (I'm going to clean up the confusing two layout stuff). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:56:43 +01:00
Roland Scheidegger	01761a38e8	llvmpipe: scratch some special handling of vp_index/layer It was actually slightly buggy (missing initialization / setup not dependent on new vs albeit I didn't see issues), but the case of non-existing attributes is now handled by draw emit code so don't need that anymore. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:55:45 +01:00
Roland Scheidegger	afa035031f	draw: rework handling of non-existing outputs in emit code Previously the code would just redirect requests for attributes which don't exist to use output 0. Rework this to output all zeros instead which seems more useful - in particular some extensions like ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by previous stages. That way, drivers don't have to special case this depending if the vs/gs outputs some attribute or not. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:52:39 +01:00
Jordan Justen	de65d4dcaf	anv: Fix build without VALGRIND Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-06 15:54:51 -08:00
Jason Ekstrand	5bbf060ece	i965/compiler: Enable more lowering in NIR We don't need these for GLSL or ARB, but we need them for SPIR-V	2016-01-06 15:30:53 -08:00
Jason Ekstrand	573351cb0f	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f503603d3	nir/opcodes: Fix the folding expression for usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	22804de110	nir/spirv: Properly implement Modf	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f3593d8a1	nir/builder: Add a helper for storing to a deref	2016-01-06 15:30:53 -08:00
Sarah Sharp	39c41be50d	mesa: Add KBL PCI IDs and platform information. Add PCI IDs for the Intel Kabylake platforms. The IDs are taken directly from the Linux kernel patches, which are under review: http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2 The Kabylake PCI IDs taken from the kernel are rearranged to be in order of GT type, then PCI ID. Please note that if this patch is backported, the following fixes will need to be added before this patch: commit `28ed1e08e8` "i965/skl: Remove early platform support" commit `c1e38ad370` "i965/skl: Use larger URB size where available." Thanks to Ben for fixing a bug around setting urb.size, and being patient with my questions about what the various fields mean. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2) Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2016-01-06 15:11:00 -08:00
Sinclair Yeh	0819287f56	svga: Rename SVGA_HINT_FLAG_DRAW_EMITTED Rename SVGA_HINT_FLAG_DRAW_EMITTED to SVGA_HINT_FLAG_CAN_PRE_FLUSH because preemptive flush can be unblocked by more commands than draw. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:04:45 -07:00
Sinclair Yeh	9ccc716534	svga: allow preemptive flushing on DMA, update, and readback commands The existing code effectively turns off preemptive flushing for all but the regions used for draws. This turns out to be overly restrictive as some memory regions, e.g. GMR, may never get a draw when used as a DMA upload staging area, causing problems for apps that upload a large amount of textures, e.g. Unigine Heaven. This patch fixes the Unigine Heaven memory allocation error and has been verified to not cause a regression in the previous extended retina display issue. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:03:33 -07:00
Charmaine Lee	b074a5b02d	svga: skip vertex attribute instruction with zero usage_mask In emit_input_declarations(), we are skipping declarations for those registers that are not being used. But in emit_vertex_attrib_instructions(), we are still emitting instructions to tweak the vertex attributes even if they are not being used. This causes an assert in the backend because an input register is not declared in the shader. This patch fixes the problem by skipping the instruction if the vertex attribute is not being used. The changes in this patch originated from the code snippet from Jose as suggested in bug 1530161. Tested with piglit, Heaven, Turbine, glretrace. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:01:38 -07:00
Brian Paul	b59fad8478	st/mesa: minor clean-ups in st_atom.c Remove useless comment. Reformat code.	2016-01-06 15:53:47 -07:00
Brian Paul	85444ab08b	st/mesa: replace bitmap size checks with assertion The _mesa_Bitmap() caller already checks for zero-sized bitmaps.	2016-01-06 15:53:47 -07:00
Brian Paul	18038b9fd6	st/mesa: check texture target in allocate_full_mipmap() Some kinds of textures never have mipmaps. 3D textures seldom have mipmaps. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:47 -07:00
Brian Paul	c032ae85ee	st/mesa: move mipmap allocation check logic into a function Better readability and easier to extend. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	0d39b5fc3b	main: s/GLuint/GLbitfield for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c81ddc2092	vbo: s/GLuint/GLbitfield/ for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	3c0521cd0f	st/mesa: use GLbitfield in st_state_flags, add comments Use GLbitfield instead of GLuint to be consistent with other variables. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	4cd1bd46ed	s/GLuint/GLbitfield/ for st_invalidate_state() parameter To match dd_function_table::UpdateState(). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	2cc52801c0	st/mesa: be more careful about state validation in st_Bitmap() If the only dirty state is mesa's _NEW_PROGRAM_CONSTANTS flag, we can skip state validation before drawing a bitmap since that state doesn't affect bitmap rendering. This further increases the performance of the ipers demo on llvmpipe to about what it was before commit `36c93a6fae`. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	b6bcf08641	st/mesa: move bitmap cache flushing out of state validation Just do it where needed (before drawing, clearing, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c28d72a347	st/mesa: check state->mesa in early return check in st_validate_state() We were checking the dirty->st flags but not the dirty->mesa flags. When we took the early return, we didn't clear the dirty->mesa flags so the next time we called st_validate_state() we'd often flush the glBitmap cache. And since st_validate_state() is called from st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap() call. This change seems to recover most of the performance loss observed with the ipers demo on llvmpipe since commit `36c93a6fae`. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c75d00e054	st/mesa: protect debug printf() with a conditional instead of comment	2016-01-06 15:53:46 -07:00
Brian Paul	72d6bbca5b	st/mesa: fix comment indentation in st_flush_bitmap_cache()	2016-01-06 15:53:46 -07:00
Timothy Arceri	e58be8ac0e	glsl: fix varying slot allocation for blocks and structs with explicit locations Previously each member was being counted as using a single slot; count_attribute_slots() fixes the count for array and struct members. Also don't assign a negative value to the unsigned expl_location variable. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 09:44:32 +11:00
Timothy Arceri	47dde2bd45	glsl: don't try adding built-ins to explicit locations bitmask Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:26 +11:00
Timothy Arceri	ac6e2c2056	glsl: fix overlapping of varying locations for arrays and structs Previously we were only reserving a single location for arrays and structs. We also didn't take into account implicit locations clashing with explicit locations when assigning locations for their arrays or structs. This patch fixes both issues. V5: fix regression for patch inputs/outputs in tessellation shaders V4: just use count_attribute_slots() to get the number of slots, also calculate the correct number of slots to reserve for gs and tess stages by making use of the new get_varying_type() helper. V3: handle arrays of structs V2: also fix for arrays of arrays and structs. Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:20 +11:00
Timothy Arceri	5907a02ab6	glsl: create helper to remove outer vertex index array used by some stages This will be used in the following patch for calculating array sizes correctly when reserving explicit varying locations. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:16 +11:00
Timothy Arceri	30991d7389	glsl: remove unused varyings before packing them Previously we would pack varyings before trying to remove them, this relied on the packing pass not packing varyings with a location of -1 to avoid packing varyings that should be removed. However this meant unused varyings with an explicit location would be packed before they could be removed when we enable packing of them in a later patch. V2: fix regression in V1 removing unused varyings in multi-stage SSO, fix regression with single stage programs. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:12 +11:00
Krzysztof Sobiecki	0d7477a289	gallium/r600: Replace ALIGN_DIVUP with DIV_ROUND_UP ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP functionality. Replacing it with DIV_ROUND_UP eliminates this problem. Signed-off-by: Krzysztof A. Sobiecki <sobkas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-06 16:09:12 -05:00
Eric Anholt	bbd29f1375	vc4: Fix driver build from last minute rebase fix. I had the driver all tested for the last series, and in my last build I noticed that get_swizzled_channel was unused now, and removed it... apparently without testing to find that I removed the wrong channel swizzle function.	2016-01-06 12:49:45 -08:00
Eric Anholt	25aa436e86	vc4: Optimize out a comparison for bcsel based on an ALU comparison We routinely have code like: vec1 ssa_220 = fge ssa_104, ssa_61 vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105 and we would compare fge's args and choose between ~0 and 0 to generate ssa_220, then compare ssa_220 to 0 and choose between bcsel's args. Instead, try to notice the pattern and compare between fge's args to select between bcsel's args. total instructions in shared programs: 88019 -> 87574 (-0.51%) instructions in affected programs: 9985 -> 9540 (-4.46%) total estimated cycles in shared programs: 245752 -> 245237 (-0.21%) estimated cycles in affected programs: 17232 -> 16717 (-2.99%)	2016-01-06 12:43:09 -08:00
Eric Anholt	7a9eb76786	vc4: Add missing sRGB decode to texel fetches. We only see txf on MSAA textures, currently, and apparently this didn't impact any of our piglit tests.	2016-01-06 12:43:09 -08:00
Eric Anholt	f01ca9eeda	vc4: Add support for GL_ARB_texture_swizzle. We already had the code supporting it, since it's needed for the depth mode when doing shadow comparisons.	2016-01-06 12:43:09 -08:00
Eric Anholt	12519a972f	vc4: Use NIR texture lowering for texture swizzling. We can't use its other features currently (mostly because we don't want Newton-Raphson on rcps for texture coordinates), but it gets us started. This eliminates some comparisons with constants in GLB2.7 and ETQW traces at the QIR level by moving the comparisons into NIR, where they get constant-folded out. instructions in affected programs: 165 -> 156 (-5.45%) total uniforms in shared programs: 32087 -> 32085 (-0.01%) total estimated cycles in shared programs: 245762 -> 245752 (-0.00%) estimated cycles in affected programs: 461 -> 451 (-2.17%)	2016-01-06 12:43:08 -08:00
Eric Anholt	71db7d3dc5	vc4: Replace the SSA-style SEL operators with conditional MOVs. I'm moving away from QIR being SSA (since NIR is doing lots of SSA optimization for us now) and instead having QIR just be QPU operations with virtual registers. By making our SELs be composed of two MOVs, we could potentially coalesce the registers for the MOV's src and dst and eliminate the MOV. total instructions in shared programs: 88448 -> 88028 (-0.47%) instructions in affected programs: 39845 -> 39425 (-1.05%) total estimated cycles in shared programs: 246306 -> 245762 (-0.22%) estimated cycles in affected programs: 162887 -> 162343 (-0.33%)	2016-01-06 12:39:51 -08:00
Eric Anholt	0a89f307f9	vc4: Don't try the SF coalescing unless it's on a def. If you want the SF of the value of a register produced from a series of packing MOVs or conditional MOVs, we can't just SF on the last MOV into the register.	2016-01-06 12:39:27 -08:00
Chad Versace	8284786c5d	anv/gen9: Teach gen9_image_view_init() about 1D surface qpitch QPitch is usually expressed as rows of surface elements (where a surface element is a compression block or a single surface sample). Skylake 1D is an outlier; there QPitch is expressed as individual surface elements.	2016-01-06 09:38:57 -08:00
Chad Versace	e05b307942	isl: Add isl_surf_get_array_pitch_el() Will be needed to program SurfaceQPitch for Skylake 1D arrays.	2016-01-06 09:38:57 -08:00
Chad Versace	c1e890541e	isl/gen9: Support ISL_DIM_LAYOUT_GEN9_1D	2016-01-06 09:38:57 -08:00
Chad Versace	eea2d4d059	isl: Don't align phys_slice0_sa.width twice It's already aligned to the format's block width. Don't align it again in isl_calc_row_pitch().	2016-01-06 09:38:57 -08:00
Chad Versace	39d043f94a	isl: Fix the documented units of isl_surf::row_pitch It's the pitch between surface elements, not between surface samples.	2016-01-06 09:38:57 -08:00
Chad Versace	dcb9c11dc7	anv/gen9: Fix oob lookup of surface halign, valign For 1D surfaces and for surfaces with Yf or Ys tiling, the hardware ignores SurfaceVerticalAlignment and SurfaceHorizontalAlignment. Moreover, the anv_halign[] and anv_valign[] lookup tables may not even contain the surface's actual alignment values. So don't do the lookup for those surfaces.	2016-01-06 09:38:57 -08:00
Chad Versace	94566d9b68	anv/meta: Teach meta how to blit from a 1D image Meta needed a VkShader with a 1D sampler type.	2016-01-06 09:38:57 -08:00
Edward O'Callaghan	1953cee6d7	gallium/drivers/svga: Use unsigned for loop index Fix a 's/unsigned int/unsigned/' consistency case while here. Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	8e2a8ec731	gallium/drivers/r600: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	76a7d6f412	gallium/drivers/ilo: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	5071c192cc	gallium: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	bfabd5e74a	gallium/drivers: Remove unnecessary semicolons Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	67d4b4b28c	gallium: Remove unnecessary semicolons Fix silly issue with MSVC case fall-through support needing an extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Oded Gabbay	9d59b9d00c	llvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8 This patch converts the SSE-optimized lp_rast_triangle_32_3_16() to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score (Name: Before -> After, Delta): openarena: 16.35 -> 16.7, 2.14%; xonotic: 4.707 -> 4.97, 5.57%. glmark2 didn't show a significant (more than 1%) difference. v2: Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	925c46cfc4	llvmpipe: Optimize BUILD_MASK(_LINEAR) for POWER8 This patch converts the SSE-optimized build_mask_32() and build_mask_linear_32() to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score (Name: Before -> After, Delta): glmark2 (score): 139.8 -> 142.7, 2.07%. openarena and xonotic didn't show a significant (more than 1%) difference. v2: Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	3bbe16ea79	llvmpipe: Optimize do_triangle_ccw for POWER8 This patch converts the SSE optimization done in do_triangle_ccw to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score (Name: Before -> After, Delta): glmark2 (score): 136.6 -> 139.8, 2.34%; openarena: 16.14 -> 16.35, 1.30%; xonotic: 4.655 -> 4.707, 1.11%. v2: - Convert loads to use aligned loads - Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	e99555ef0b	llvmpipe: add POWER8 portability file - u_pwr8.h This file provides a portability layer that will make it easier to convert SSE-based functions to VMX/VSX-based functions. All the functions implemented in this file are prefixed using "vec_". Therefore, when converting from an SSE-based function, one simply needs to replace the "_mm_" prefix of the SSE function being called with "vec_". Having said that, not all functions could be converted as such, due to the differences between the architectures. So, where such a conversion hurt performance, I preferred to implement a more ad-hoc solution. For example, converting the _mm_shuffle_epi32 needed to be done using ad-hoc masks instead of a generic function. All the functions in this file support both little-endian and big-endian but currently the file is built only on POWER8 LE machines. All of the functions are implemented using the Altivec/VMX intrinsics, except one where I needed to use inline assembly (due to a missing intrinsic). v2: - Use vec_vgbbd instead of __builtin_vec_vgbbd - Add an aligned load function - Don't use typeof() - Make file build only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	afe88f66a8	configure.ac: Detect if running on POWER8 arch To determine if we could use special POWER8 assembly directives, we first need to detect whether we are running on POWER8 architecture. This patch adds this detection to configure.ac and adds the necessary compilation flags accordingly. v2: - Add option to disable POWER8 instructions generation - Detect whether building on BE or LE machine and build with -mpower8-vector only on LE machine - Make the printed messages more standard Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Kenneth Graunke	7295f4fcc2	nir: Add a lower_fdiv option, turn fdiv into fmul/frcp. The nir_opt_algebraic rule (('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))), can produce new fdiv operations, which need to be lowered on i965, as we don't actually implement fdiv. (Normally, we handle this in GLSL IR's lower_instructions pass, but in the above case we introduce an fdiv after that point. So, make NIR do it for us.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-05 19:22:11 -08:00
Kenneth Graunke	bd21b54607	i965: Only turn on ARB_compute_shader if we can write registers. Compute shaders require reconfiguring the L3 for shared local memory support. We have to be able to write the L3 registers to do that. This effectively turns off compute shaders prior to Kernel 4.2. (Previously, the extension enable was in an API_OPENGL_CORE conditional. However, that isn't necessary - core Mesa extension handling already restricts it properly. I've moved it out in this patch.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-01-05 18:07:27 -08:00
Kenneth Graunke	25b7e4a01f	i965: Use rcp in brw_lower_texture_gradients rather than 1.0 / x. That's what it's for. Plus, we actually implement rcp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-05 18:07:27 -08:00
Timothy Arceri	3d402d4450	mesa: fix GL_MAX_NAME_LENGTH query for tessellation shaders This fixes some piglit subtests for ARB_program_interface_query. V3: remove some of the unnecessary parentheses V2: fix alignment Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-06 12:01:09 +11:00
Jason Ekstrand	7a069bea5d	nir/spirv: Fix switch statements with duplicate cases	2016-01-05 16:18:01 -08:00
Jason Ekstrand	506a467f16	nir/spirv/cfg: Assert that blocks only ever get added once This effectively prevents infinite loops in cfg_walk_blocks.	2016-01-05 15:56:59 -08:00
Timothy Arceri	e1e1b67878	glsl: don't change the varying type in validation code There is a function dedicated to demoting unused varyings; let's trust it to do its job. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:58 +11:00
Timothy Arceri	21590a307c	glsl: move lowering after matching validation After lowering the matching flag is_unmatched_generic_inout is lost so we need to move this validation before lowering. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:54 +11:00
Timothy Arceri	0508d9504a	glsl: only add outward facing varyings to resource list for SSO An SSO program can have multiple stages and we only want to add the externally facing varyings. The current code was adding both the packed inputs and outputs for the first and last stage of each program. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:48 +11:00
Jason Ekstrand	71a25a0b07	nir/spirv: Simplify phi node handling Instead of trying to crawl through predecessor chains and build phi nodes, we just do a poor-man's out-of-ssa on the spot. The into-SSA pass will deal with putting the actual phi nodes in for us.	2016-01-05 14:59:40 -08:00
Anuj Phogat	4d2a7f5111	i965/gen9: Modify the conditions to use blitter on skl+ Conditions modified allow skl+ to use blitter: - for all tiling formats - to write data to YF/YS tiled surfaces Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	0bf037c0fe	i965/gen9: Return false in place of assert in intelEmitCopyBlit() This allows the fallback paths to handle it correctly. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	5cbe01c83f	i965/gen9: Remove regions overlap check in fast copy blit Overlapping blits are anyway undefined in OpenGL. So no need of overlap check here. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	3c8b97a45b	i965/gen9: Don't use fast copy blit in case of non power of 2 cpp Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Jason Ekstrand	ec899f6b42	anv/pipeline: Lower indirect temporaries and inputs	2016-01-05 13:42:52 -08:00
Jason Ekstrand	bff45dc44e	nir: Add an indirect deref lowering pass	2016-01-05 13:42:52 -08:00
Ian Romanick	ee4676aa57	i915/i965: Fix typo in perf_debug message Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-05 13:18:45 -08:00
Brian Paul	a13e9adbee	st/mesa: minor indentation fixes	2016-01-05 13:04:46 -07:00
Kristian Høgsberg Kristensen	30521fb19e	vk: Implement a basic pipeline cache This is not really a cache yet, but it allows us to share one state stream for all pipelines, which means we can bump the block size without wasting a lot of memory.	2016-01-05 12:03:21 -08:00
Kristian Høgsberg Kristensen	f551047751	vk: Destroy device->mutex when destroying the device	2016-01-05 12:03:21 -08:00
Brian Paul	f4caa7d2fc	draw: minor indentation fix	2016-01-05 13:03:05 -07:00
Brian Paul	dce1e1a8eb	mesa: minor clean-up of some memcpy/sizeof() calls in m_matrix.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:05 -07:00
Brian Paul	95d412181d	util: add debug_dump_ubyte_rgba_bmp() Like debug_dump_float_rgba_bmp() but takes ubyte values. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	f04d7439a0	mesa: check for z=0 in _mesa_Vertex3dv() It's very rare that a GL app calls glVertex3dv(), but one in particular calls it a lot, always with Z = 0. Check for that condition and convert the call into glVertex2f. This reduces VBO memory used and reduces the number of times we have to switch between float[2] and float[3] vertex formats in the svga driver. This results in a small but measurable performance improvement. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	eec8d7e7e0	svga: fix test for SVGA_NEW_STIPPLE We only want to set the SVGA_NEW_STIPPLE dirty flag when the polygon stipple state changes. Before, we only set the flag when we were enabling stipple, but not disabling. We don't really have to add SVGA_NEW_STIPPLE to the dirty FS state set since it's a subset of SVGA_NEW_RAST, but let's be explicit. This doesn't fix any known bugs. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	993b04ee2c	svga: add some comments in svga_state_vs.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	fc07658895	svga: change svga_hw_view_state::dirty to boolean Since it's a true/false value. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	077aa3be93	svga: avoid emitting redundant SetVertexBuffers() commands Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	b11bd20889	svga: check for no-ops in svga_bind_sampler_states() and svga_set_sampler_views(). If there's no change, return early and don't set a SVGA_NEW_x dirty state flag. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Chad Versace	8d6f0a1b80	isl: Don't force linear for 1d surfaces in gen7_filter_tiling() gen7_filter_tiling() should filter out only tiling flags that are incompatible with the surface. It shouldn't make performance decisions, such as forcing linear for 1D; that's the role of the caller.	2016-01-05 11:37:32 -08:00
Chad Versace	8135786605	isl: Document gen7_filter_tiling()	2016-01-05 11:35:13 -08:00
Chad Versace	33f06842be	isl: Prefer linear tiling for 1D surfaces	2016-01-05 11:35:13 -08:00
Chad Versace	98af1cc6d7	isl: Remove isl_format_layout::bpb struct isl_format_layout contained two near-redundant members: bpb (bits per block) and bs (block size). There do exist some hardware formats for which bpb != 8 * bs, but Vulkan does not use them. Therefore we don't need bpb.	2016-01-05 10:00:39 -08:00
Chad Versace	89b68dc8d0	anv: Use isl_format_layout::bs instead of ::bpb For all formats used by Vulkan, 8 * bs == bpb. (bs=block_size_in_bytes, bpb=bits_per_block)	2016-01-05 10:00:39 -08:00
Chad Versace	a1d64ae561	isl: Align isl_surf::phys_level0_sa to the format's compression block	2016-01-05 09:52:07 -08:00
Chad Versace	2172f0e9bb	isl: Fix mis-documented units of isl_surf::phys_level_sa It's in physical surface samples. Hence the _sa suffix.	2016-01-05 09:52:07 -08:00
Ilia Mirkin	6531ccb705	i965: quieten compiler warning about out-of-bounds access gcc 4.9.3 shows the following error: brw_vue_map.c:260:20: warning: array subscript is above array bounds [-Warray-bounds] return brw_names[slot - VARYING_SLOT_MAX]; This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum type. Adding an assert will generate no additional code but will teach the compiler to not complain. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-05 12:07:53 -05:00
Julien Isorce	777d1453f1	build: enable st/va with nouveau driver vainfo fails in vaDriverInit because "dd_create_screen" does not reach the strcmp(driver_name, "nouveau") code. Indeed, when compiling the va target.c, the macro GALLIUM_NOUVEAU is not defined. This patch defines the macro the same way it is done for the dri and vdpau targets. Tested with: ./autogen.sh --enable-glx --enable-gles2 --enable-egl --enable-vdpau --enable-glx-tls=yes --enable-va --with-gallium-drivers=swrast,nouveau --with-dri-drivers=swrast,nouveau --with-egl-platforms=x11 LIBVA_DRIVER_NAME=gallium vainfo Output: vainfo: Driver version: mesa gallium vaapi vainfo: Supported profile and entrypoints VAProfileMPEG2Simple : VAEntrypointVLD VAProfileMPEG2Main : VAEntrypointVLD VAProfileMPEG4Simple : VAEntrypointVLD VAProfileMPEG4AdvancedSimple : VAEntrypointVLD VAProfileVC1Simple : VAEntrypointVLD VAProfileVC1Main : VAEntrypointVLD VAProfileVC1Advanced : VAEntrypointVLD VAProfileH264Baseline : VAEntrypointVLD VAProfileH264Main : VAEntrypointVLD VAProfileH264High : VAEntrypointVLD VAProfileNone : VAEntrypointVideoProc Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	abb30b9c8b	nvc0: add support for st/va - split nvc0_decoder_bsp in begin/next/end - preserve content buffer when calling nvc0_decoder_bsp_next - implement pipe_video_codec::begin_frame/end_frame https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	7ba27f60f7	nouveau: split nouveau_vp3_bsp in begin/next/end This allows calling nouveau_vp3_bsp_next multiple times between one begin/end. It is required to support st/va. https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <j.isorce@samsung.com> [imirkin: create strparm_bsp function, simplified w0 calculation] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	851e7e12aa	st/va: count number of slices The counter was not set but used by the nouveau driver. It is required otherwise visual output is garbage. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com>	2016-01-05 15:02:47 +00:00
Ilia Mirkin	14f21f53d5	i965/wm: use binding size for ubo/ssbo when automatic size is unset This fixes the same tests that commit `8cf2e892f` was attempting to fix: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize as confirmed by Samuel. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-05 01:30:09 -05:00
Ilia Mirkin	a1d664a0b7	Revert "i965/wm: use proper API buffer size for the surfaces." This reverts commit `8cf2e892fc`. It's entirely bogus to attempt to store anything about the binding in the buffer object itself, which might be bound any number of times. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-05 01:29:49 -05:00
Jason Ekstrand	8b403d599b	nir/spirv: Add support for the ControlBarrier instruction	2016-01-04 22:08:24 -08:00
Jason Ekstrand	ba7b5edc26	anv/UpdateDescriptorSets: Use the correct index for the buffer view	2016-01-04 21:36:11 -08:00
Jason Ekstrand	b8f0bea07a	nir/spirv: Implement extended add, sub, and mul	2016-01-04 20:59:16 -08:00
Jason Ekstrand	3a3c4aecf1	nir/spirv: Add support for bitfield operations	2016-01-04 17:37:10 -08:00
Jason Ekstrand	01ba96e059	nir/spirv: Add support for msb/lsb opcodes	2016-01-04 17:37:10 -08:00
Jason Ekstrand	f32370a536	nir/spirv: Add a documenting assert for OpConstantSampler	2016-01-04 17:37:10 -08:00
Jason Ekstrand	0309199802	nir/spirv: Add initial support for ConstantNull	2016-01-04 17:37:10 -08:00
Chad Versace	8cc21d3aea	isl: Align single-level 2D surfaces to compression block This fixes an assertion failure at isl.c:1003. Reported-by: Nanley Chery <nanley.g.chery@intel.com>	2016-01-04 16:48:58 -08:00
Jason Ekstrand	151694228d	anv/formats: Hand out different formats based on tiled vs. linear	2016-01-04 16:08:05 -08:00
Jason Ekstrand	f665fdf0e7	anv/image_view: Separate vulkan and isl formats Previously, anv_image_view had an anv_format pointer that we used for everything. This commit replaces that pointer with a VkFormat enum copied from the API and an isl_format. In order to implement RGB formats, we have to use a different isl_format for the actual surface state than the obvious one from the VkFormat. Separating the two helps us keep things straight.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	ceb05131da	anv_get_isl_format: Support depth+stencil aspect value You just get the depth format in this case.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	a7cc12910d	anv/image: Do more work in anv_image_view_init There was a bunch of common code in gen7/8_image_view_init that we really should be sharing.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	87dd59e578	anv/formats: Rework GetPhysicalDeviceFormatProperties It now calls get_isl_format to get both linear and tiled views of the format and determines linear/tiled properties from that. Buffer properties are determined from the linear format.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	2712c0cca3	anv/formats: Add a tiling parameter to get_isl_format Currently, this parameter does nothing.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	603a3a9439	isl/format: Add some helpers for working with RGB formats	2016-01-04 16:08:05 -08:00
Jason Ekstrand	0639f44d0f	isl: Add a file for format helpers	2016-01-04 16:08:05 -08:00
Jason Ekstrand	5f5fc23e7c	genX/state: Pull some generic helpers into a shared header	2016-01-04 16:08:05 -08:00
Jason Ekstrand	ad9ff4f2b2	meta/blit: Rework how format and aspect choices are made This commit does two things. First, it introduces choose_* functions for choosing formats and aspects. Second, it changes the copy (not blit) code to use appropriately sized UINT formats for everything except depth. There are two main reasons for this: First, it means that compressed and other non-renderable texture upload should "just work" because it won't be tripping over non-renderable formats. Second, it allows us to easily copy an RGB buffer to and from an RGBX image because the formats will get switched over to their UINT variants and the shader will deal with the extra channel for us.	2016-01-04 16:08:05 -08:00
Jason Ekstrand	3200a81a55	anv/image: Add a vk_format field We've been trying to move away from anv_format for a while and this should help with the transition. There are cases (mostly in meta) where we need the original format for the image and not the isl_format. These will be moved over to the new vk_format and everything else will use the isl_format from the particular anv_surface.	2016-01-04 16:08:05 -08:00
Nicolai Hähnle	2123bfcc9c	st/mesa: make KHR_debug output independent of context creation flags (v2) Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback accordingly. Hardware drivers can still use the absence of the callback to skip more expensive operations in the normal case, and users can no longer be surprised by the need to set the debug flag at context creation time. v2: - re-add the proper initialization of debug contexts (Ilia Mirkin) - silence a potential warning (Ilia Mirkin) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-04 18:40:49 -05:00
Chad Versace	0d7614dce6	isl: Document mnemonic in Yf and Ys tiling The 'f' means "four K". The 's' means "sixty-four K".	2016-01-04 15:37:39 -08:00
Kristian Høgsberg Kristensen	0f34a4ec4e	isl: Use isl_align_npot for row_pitch Many formats are not power-of-two bytes per pixels and we need the non-power-of-two align macro here. This reverts the revert from `4f9a211b`, but keeps the change from `a827b553` that fixed the yuv if-else mix-up.	2016-01-04 10:53:47 -08:00
Kristian Høgsberg Kristensen	abc1c9878f	vk: Don't leak pipeline if initialization fails	2016-01-04 10:42:50 -08:00
Kristian Høgsberg Kristensen	fca1c08e34	vk: Allocate subpass attachment in one big block This avoids making a lot of small allocations and handles allocation failure correctly. Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:10 -08:00
Kristian Høgsberg Kristensen	5526c1782a	vk: Handle allocation failures in meta init paths Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:08 -08:00
Kristian Høgsberg Kristensen	b2ad2a20b6	vk: Handle allocation failure in anv_pipeline_init() Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:06:50 -08:00
Kristian Høgsberg Kristensen	3954594eb4	vk: Call vk_error when we generate a VK_ERROR	2016-01-04 10:02:50 -08:00
Kristian Høgsberg Kristensen	75e01c8b2d	vk: Only finish wayland wsi if we created it Failure during instance creation will leave instance->wayland_wsi undefined. When we then try to clean that up we crash. Set instance->wayland_wsi to NULL on failure and only clean it up if it's non-NULL. Fixes part of dEQP-VK.api.object_management.alloc_callback_fail.*	2016-01-04 10:02:50 -08:00
Chad Versace	05c22f2d74	isl: Fix row pitch for linear buffers isl always aligned the row pitch to the surface's image alignment. This was sometimes wrong when the surface backed a VkBuffer. For a VkBuffer, the surface's row pitch is set by VkBufferImageCopy::bufferRowLength, whose required alignment is only that of the VkFormat. In particular, VkBuffer rows are packed in many dEQP and Crucible tests. And packed rows are rarely aligned to the surface's image alignment. Fixes: dEQP-VK.pipeline.image.view_type.2d.format.r8g8b8a8_unorm.size.13x13	2016-01-04 09:57:25 -08:00
Chad Versace	a827b553d9	isl: Fix swapped if-else in isl_calc_row_pitch The YUV case was applied to non-YUV formats. Oops.	2016-01-04 09:57:23 -08:00
Ilia Mirkin	b16c9be4a5	nvc0: scale up inter_bo size so that it's 16M for a 4K video Experimentally, 4M causes corruption and slowness, try to ramp it up with size instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Ilia Mirkin	b5f2f7073f	nv50,nvc0: fix crash when increasing bsp bo size for h264 H264 doesn't have a bitplane bo. We just need a device reference, so use the one from the client. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Samuel Iglesias Gonsálvez	8cf2e892fc	i965/wm: use proper API buffer size for the surfaces. Commit `5bb5eeea` fixes a bug indicating that the surfaces should have the API buffer size. However it picked the wrong value. This patch adds a new variable, which takes into account glBindBufferRange() values. This patch fixes the following CTS regressions: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-04 07:52:24 +01:00
Marek Olšák	86fa48426c	radeonsi: remove unused parameter from si_shader_binary_read_config Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	b6d95248f0	radeonsi: move si_shader_binary_upload out of si_shader_binary_read Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	7fa6bb47e3	gallium/radeon: dump LLVM module outside of radeon_llvm_compile Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fb98acb5a1	gallium/radeon: always add +DumpCode to the LLVM target machine for LLVM <= 3.5 It's the same behavior that we use for later LLVM. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	cd7f252b11	gallium/radeon: r600_can_dump_shader should get TGSI processor type directly Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fd7000bd78	radeonsi: pass TGSI processor type to si_shader_binary_read for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	3ce0a2fd7f	radeonsi: pass TGSI processor type to si_compile_llvm for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	dd79034ca6	radeonsi: rename shader parameter definitions and variables for more clarity Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Ilia Mirkin	34217018c4	nvc0/ir: add support for PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 16:20:52 -05:00
Ilia Mirkin	20dee333f3	st/mesa: use PK2H/UP2H when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:47 -05:00
Ilia Mirkin	e9f43d6333	gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:41 -05:00
Ilia Mirkin	459e4532af	tgsi: update PK2H/UP2H channel behavior info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:27 -05:00
Ilia Mirkin	6eb74b87b8	gallium: document PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:19:57 -05:00
Samuel Pitoiset	0ab2c21b93	st/mesa: fix parameter names for tesseval/tessctrl prototypes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 22:01:18 +01:00
Ilia Mirkin	bf34748b39	nouveau: fix double-const qualifier Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 11:32:15 -05:00
Rob Clark	3684e899ea	freedreno/ir3: use NIR_PASS helper macros Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	317628dbb3	nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-03 09:11:27 -05:00
Rob Clark	23bd6affb2	freedreno/ir3: we require block_index metadata Found during NIR_TEST_CLONE=1 piglit run. We were using block->index but forgetting to require it. Causing things to not work with a cloned shader which didn't preserve block_index. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	74135f804a	freedreno/ir3: refactor NIR IR handling Immediately convert into NIR and do an initial key-agnostic lowering/ optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw- time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	ab4efb19dc	freedreno/ir3: drop unnecessary unreachable() case It will still hit a compile_assert() in emit_tex, which has the advantage of dumping out the offending shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Samuel Pitoiset	6a49fcfb1f	gallium/tests: fix build with clang compiler Nested functions are supported as an extension in GNU C, but Clang doesn't support them. This fixes compilation errors when (manually) building compute.c, or by setting --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-03 12:18:00 +01:00
Samuel Pitoiset	53dddab78c	nv50,nvc0: optimize coherent buffer checking at draw time Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% of cases, because coherent buffers are usually not used much. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 12:17:05 +01:00
Kenneth Graunke	28dea26626	i965: Make TCS precompile use the TES primitive mode when available. If there's a linked TES program, we should just use the actual primitive mode. If not, just guess triangles (as we did before). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	4a1c8a3037	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	b022150d70	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	53a9b6223f	i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Nicolai Hähnle	8f384d07a8	gallium/radeon: send LLVM diagnostics as debug messages Diagnostics sent during code generation and the error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	255ccd1e99	gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2) This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	f8cd11403a	radeonsi: send shader info as debug messages in addition to stderr output The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	4bb1c8dfec	radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2) This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Nicolai Hähnle	b6847062dd	gallium/radeon: implement set_debug_callback Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Jason Ekstrand	f6c4658cde	nir/spirv: Fix group decorations They were completely bogus before. For one thing, OpDecorationGroup created a value of type undef rather than decoration_group. Also OpGroupMemberDecorate didn't properly apply the decoration to the different members of the different groups. It should be correct now but there's no good way to test it yet.	2016-01-02 11:53:36 -08:00
Jason Ekstrand	6b0b57225c	anv/device: Only allocate whole pages in AllocateMemory The kernel is going to give us whole pages anyway, so allocating part of a page doesn't help. And this ensures that we can always work with whole pages.	2016-01-02 07:52:24 -08:00
Marek Olšák	ecb2da1559	u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	37d0aea772	u_upload_mgr: remove alignment parameter from u_upload_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	1bb79c3a7b	u_upload_mgr: pass alignment to u_upload_buffer manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	e0f932846c	u_upload_mgr: pass alignment to u_upload_data manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	020009f7cc	u_upload_mgr: pass alignment to u_upload_alloc manually The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	ffc4716e97	u_upload_mgr: rework the application of alignment The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This is a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	36c93a6fae	st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels (v2) Spotted by luck. The GLSL uniform storage is only associated once in LinkShader and can't be reallocated afterwards, because that would break the association. v2: don't remove st_upload_constants calls, clarify why they're needed Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Marek Olšák	294ed5cd13	program: add _mesa_reserve_parameter_storage The next commit will use this. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Jordan Justen	a2942d8f26	mesa: Fix warning with MESA_VERBOSE=api for BindBufferRange Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-01 17:27:14 -08:00
Ilia Mirkin	c1d14c6817	nv50,nvc0: make sure there's pushbuf space and that we ref the bo early First off, we can't flush in the middle of a command. Secondly requesting the extra push space might cause a flush to happen. If that flush happens, we'd have to do the PUSH_REFN again. So instead do PUSH_REFN after the push space request. This helps avoid rare crashes with supertuxkart in libdrm due to assertion failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-01 19:52:41 -05:00
Ilia Mirkin	33a415310b	st/mesa: sort extensions enablement array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-01 19:50:02 -05:00
Rob Clark	816ddee6b8	nir/lower_clip: add missing writemask on store Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-01 15:32:46 -05:00
Jordan Justen	3dce7bf268	mesa: Add MESA_VERBOSE=api for GL_ARB_program_interface_query v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-01 12:00:51 -08:00
Jordan Justen	36db91c4c4	mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variants v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-01 12:00:51 -08:00
Jason Ekstrand	f076d5330d	anv/device: Handle non-4k-aligned calls to MapMemory As per the spec: minMemoryMapAlignment is the minimum required alignment, in bytes, of host-visible memory allocations within the host address space. When mapping a memory allocation with vkMapMemory, subtracting offset bytes from the returned pointer will always produce a multiple of the value of this limit.	2016-01-01 09:29:29 -08:00
Dave Airlie	b835255992	st/glsl_to_tgsi: fix block movs for doubles While playing with fp64, I disabled varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	d214ce86cf	st/glsl_to_tgsi: handle different attrib size vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	dc7b33c1f3	st/glsl_to_tgsi: readd the double_reg2 for input index mapping Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	84dbf3c4ff	st/glsl_to_tgsi: when doing reladdr get vec4 of correct type This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	d87894b98f	st/glsl_to_tgsi: handle double immediates in matrices properly. This handles matrix initialisation properly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	7351c7684f	st/glsl_to_tgsi: setup writemask for double arrays and matrices. It's important for the double instruction emission code that the writemasks are correct going in for double so it knows which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	14506dcae2	st/glsl_to_tgsi: handle doubles in array shrinking code. This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	aab0c6c9c4	st/glsl_to_tgsi: handle doubles outputs in arrays. This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	fc890d703e	st/glsl_to_tgsi: store if dst is double in array This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Jason Ekstrand	6b5cbdb317	anv/format: Get rid of num_channels	2015-12-31 12:07:43 -08:00
Jason Ekstrand	3fe1f118f8	anv/cmd_buffer: Fix a pointer-cast typo	2015-12-31 12:07:43 -08:00
Chad Versace	86ecb28ec6	isl: Document some isl_surf::phys_level0_sa invariants isl_dim_layout restricts the range of isl_surf::phys_level0_sa.	2015-12-31 12:06:02 -08:00
Jason Ekstrand	5318424d49	anv/pipeline: Better vertex input channel setup First off, it now uses isl formats instead of anv_format. Also, it properly handles integer vs. floating-point default channels and can properly handle alpha-only channels. (Not sure if those are allowed).	2015-12-31 12:02:08 -08:00
Jason Ekstrand	c6364495b2	anv/pipeline: Move vk_to_gen tables into a shared header	2015-12-31 12:02:08 -08:00
Chad Versace	d25cff687b	isl: Better document surface units Logical pixels, physical surface samples, and physical surface elements. Requested-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-31 11:56:13 -08:00
Chad Versace	373fd89e4b	isl: Document the 3D block extent of isl_format	2015-12-31 11:55:48 -08:00
Jason Ekstrand	1ddcbbf05f	nir/spirv: Add a missing break statement in handle_image	2015-12-30 21:57:04 -08:00
Jason Ekstrand	4f9a211b4a	Revert "isl: Fix assertion failure for npot pixel formats" This reverts commit `96d1baa88d`.	2015-12-30 21:01:55 -08:00
Jason Ekstrand	0bb103d010	nir/spirv: Handle push constants after decorations	2015-12-30 20:54:27 -08:00
Jason Ekstrand	3421ba1843	anv/device: Place memory types at heapIndex == 0 Previously, they were at heapIndex == 1 even though we only advertised one heap.	2015-12-30 19:32:43 -08:00
Jason Ekstrand	cf6ce424e0	nir/spirv: Fix constant num_elements and allocation Thanks to the addition of nir_clone, we now have a num_elements field in nir_constant which we weren't setting. Also, constants have to be parented to the variable they initialize, so we have to make a copy.	2015-12-30 18:51:59 -08:00
Jason Ekstrand	601b7d5f98	nir/lower_outputs_to_temporaries: Reparent constant initializers	2015-12-30 18:51:06 -08:00
Jason Ekstrand	7d57528233	nir/clone: Expose nir_constant_clone	2015-12-30 18:44:19 -08:00
Jason Ekstrand	fed98df428	nir/gather_info: Add support for end_primitive_with_counter	2015-12-30 17:45:43 -08:00
Jason Ekstrand	5afac62b28	nir/spirv: Handle OpLine	2015-12-30 17:45:43 -08:00
Jason Ekstrand	149f35bbba	nir/spirv: Let OpEntryPoint act as an OpName	2015-12-30 17:45:43 -08:00
Jason Ekstrand	5f7f88524c	nir/lower_outputs_to_temporaries: Take a nir_function entrypoint	2015-12-30 17:45:43 -08:00
Jason Ekstrand	0fe4580e64	nir/spirv: Add support for multiple entrypoints per shader This is done by passing the entrypoint name into spirv_to_nir. It will then process the shader as if that were the only entrypoint we care about. Instead of returning a nir_shader, it now returns a nir_function.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	e993e45eb1	nir/spirv: Get the shader stage from the SPIR-V Previously, we depended on it being passed in.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	db3a64fcea	nir/spirv: Use shader stage for determining variable locations	2015-12-30 17:45:43 -08:00
Jason Ekstrand	d7ae2200f9	nir/spirv: Get rid of default GS info shaderc has been fixed for a while now.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	d9c9a117dc	nir/spirv: Handle execution modes as decorations They're basically the same thing.	2015-12-30 17:45:43 -08:00
Jason Ekstrand	2b6bcaf91a	nir/spirv: Separate handling of preamble from type/var/const instructions	2015-12-30 17:45:43 -08:00
Chad Versace	96d1baa88d	isl: Fix assertion failure for npot pixel formats When aligning to isl_format_layout::bs (which is the number of bytes in the pixel), use isl_align_npot() instead of isl_align(), because isl_align() works only for power-of-2 alignment. Fixes assertion in dEQP-VK.pipeline.image.view_type.1d.format.r16g16b16_sfloat.size.512x1.	2015-12-30 16:28:19 -08:00
Kenneth Graunke	65d3f85eb3	nvc0: Set winding order regardless of domain. Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	7cdc2b9ca0	glsl: Fix varying struct locations when varying packing is disabled. varying_matches::record tries to compute the number of components in each varying, which varying_matches::assign_locations uses to assign locations. With varying packing, it uses glsl_type::component_slots() to come up with a reasonable value. Without varying packing, it fell back to an open-coded computation that didn't bother to handle structs at all. I believe we can simply use 4 * glsl_type::count_attribute_slots(false), which already handles these cases correctly. Partially fixes rendering in GFXBench 4.0's tessellation benchmark. (NVE0 is almost right after this, but i965 is still mostly garbage.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	4acf71c89b	drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0. Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't specify which fragment shader output is which, so there's at best a 50/50 chance of us guessing it correctly. This is invalid. Unigine fixed this in 4.1 and 1.1 versions over a year and a half ago, but hasn't actually released them for whatever reason. So, add the workaround back so that it works for most people. Fixes Heaven 4.0/Valley 1.0 rendering on Ivybridge. For whatever reason, Broadwell worked. 4.1 and 1.1 have always worked. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-12-30 16:04:12 -08:00
Ilia Mirkin	5ac15f788b	glsl: add GL_ARB_shader_draw_parameters define Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-30 18:59:18 -05:00
Jason Ekstrand	07b4f17aaf	nir/spirv/GLSL450: Add support for SAbs	2015-12-30 14:41:49 -08:00
Kenneth Graunke	e6cd0c0e1c	nir/spirv: Implement IsInf and IsNan built-ins.	2015-12-30 14:10:44 -08:00
Jason Ekstrand	a7e827192b	isl: Tile-align height in image size calculation This fixes a bunch of gpu hangs on the dEQP-VK.glsl.ShaderExecutor.common group of CTS tests.	2015-12-30 14:03:47 -08:00
Ilia Mirkin	517a93b346	nvc0: add ARB_shader_draw_parameters support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 16:55:57 -05:00
Ilia Mirkin	89bda9772d	st/mesa: add GL_ARB_shader_draw_parameters support Hooks up the new system values, passes the drawid in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	daaf0bdf46	gallium: add a drawid to pipe_draw_info This will allow the state tracker to inform the driver where in a broken-up multidraw we currently are. This can then be passed into the vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	87b4e4e29f	gallium: add PIPE_CAP_DRAW_PARAMETERS This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	bb52ea45cc	gallium: add baseinstance/drawid semantics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Kenneth Graunke	9f23116bfa	Revert "nir/spirv: Update to the 1.0 GLSL.std.450 header" This reverts commit `b33f5d3889`, and also removes the (empty) case statements for the new built-ins. It doesn't look like glslang has updated yet, so updating the header just breaks everything, as we no longer agree on opcode numbers.	2015-12-30 13:26:56 -08:00
Jason Ekstrand	e6fc170afb	anv/allocator: Rework state streams again If we're going to have valgrind verify state streams then we need to ensure that once we choose a pointer into a block we always use that pointer until the block is freed. I was trying to do this with the "current_map" thing. However, that breaks down because you have to use the map from the block pool to get to the stream_block to get at current_map. Instead, this commit changes things to track the stream_block by pointer instead of by offset into the block pool.	2015-12-30 11:40:38 -08:00
Jason Ekstrand	28243b2fba	gen7/8/cmd_buffer: Allocate the correct amount for COLOR_CALC_STATE We were allocating 6 bytes when we should have been allocating 6 dwords.	2015-12-30 10:37:57 -08:00
Jason Ekstrand	a0b2829f20	anv/stream_alloc: Properly manage valgrind NOACCESS and UNDEFINED status When I first did the valgrindifying for stream allocators, I misunderstood some things about valgrind's expectations for NOACCESS and UNDEFINED. First off, valgrind expects things to be marked NOACCESS before you allocate out of them. Since our blocks came from a pool backed by a mmapped memfd, they came in as UNDEFINED; we needed to mark them as NOACCESS. Also, I didn't realize that VALGRIND_MEMPOOL_CHANGE only updated the mempool allocation state and didn't actually change definedness; we had to add a VALGRIND_MAKE_MEM_UNDEFINED to get rid of the NOACCESS on the newly allocated portion.	2015-12-30 10:36:19 -08:00
Ilia Mirkin	d50e6128b8	nv50/ir: attempt to do more constant folding on mad -> add conversion The add might actually have a 0 as an argument, which would convert it into a mov. Make sure to detect that. Also avoid the hack of putting the immediate directly into the instruction, instead use a mov to put it into place and let the later LoadPropagation pass place it if possible. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 12:29:07 -05:00
Marta Lofstedt	97685ff10e	i965/gen8: Always use BRW_REGISTER_TYPE_UW for MUL on GEN8+ The imulExtended tests of the shader bitfield tests of the OpenGL ES 3.1 CTS fail on gen8+ when BRW_REGISTER_TYPE_W is used for SHADER_OPCODE_MULH. Also, remove unused helper function: static inline bool type_is_signed(unsigned type) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-30 09:29:14 +01:00
Kristian Høgsberg Kristensen	91d93f7908	nir/spirv: Lower gl_GlobalInvocationID correctly Use nir_intrinsic_load_local_invocation_id, not nir_intrinsic_load_invocation_id (missing 'local'), which is a geometry shader built-in.	2015-12-30 00:03:54 -08:00
Jason Ekstrand	451fe2670c	nir/spirv/cfg: Handle discard	2015-12-29 19:23:25 -08:00
Jason Ekstrand	5693637faa	nir/print: Handle variables with var->name == NULL	2015-12-29 16:58:00 -08:00
Jason Ekstrand	8cc55780fd	nir/inline_functions: Switch to inlining everything	2015-12-29 16:58:00 -08:00
Timothy Arceri	0d4cd045c8	glsl: tidy up struct with a single member There used to be more members but they now share other fields in order to keep memory use low. Also making the naming more generic will allow us to reuse the field for explicit byte offsets within blocks for ARB_enhanced_layouts. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:52:05 +11:00
Emil Velikov	2c1a215409	glsl/linker: annotate static functions as such Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:58 +11:00
Emil Velikov	c704b89fe4	glsl: annotate ast_process_struct_or_iface_block_members() as static Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:51 +11:00
Kenneth Graunke	7cdcee3bed	nir/spirv/glsl450: Enumerate more built-in opcodes.	2015-12-29 16:06:35 -08:00
Kenneth Graunke	ccd84848f0	anv/state: Fix reversed MIN vs. MAX in levelCount handling. The point is to promote a levelCount of 0 to 1 before subtracting 1. This needs MAX, not MIN.	2015-12-29 15:51:14 -08:00
Jason Ekstrand	2a58cb03d0	nir/spirv: Use instr_rewrite_src for updating phi sources You can't just add a new source to a phi because use/def information won't get updated properly. Instead, you have to use one of the core helpers. Some day, we may want to add a nir_phi_instr_add_src helper.	2015-12-29 15:44:39 -08:00
Jason Ekstrand	69d5838aee	nir/validate: Don't validate the return deref for void function calls	2015-12-29 15:35:29 -08:00
Jason Ekstrand	51b04d03d5	nir/dominance: Handle unreachable blocks Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this: loop { if (...) { break; } else { break; } } In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb.	2015-12-29 15:29:27 -08:00
Kenneth Graunke	b4a1c9b506	nir/spirv/glsl450: Implement inverse hyperbolic trig built-ins.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	2ea111664c	nir/spirv/glsl450: Implement Refract built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	74529a2c50	nir/spirv/glsl450: Implement hyperbolic trig built-ins.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	0b1a436ac8	nir/spirv/glsl450: implement Reflect built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	659a3623b0	nir/spirv/glsl450: Implement FaceForward built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	b10af36d93	nir/spirv/glsl450: Implement SmoothStep.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	6a0fa2d758	nir/spirv/glsl450: Implement Cross built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	083fd6ec2a	nir/spirv/glsl450: Implement Clamp/SClamp/UClamp.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	034010924e	nir/spirv/glsl450: Implement the Log built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	ffc5ae7c9e	nir/spirv/glsl450: Implement Exp built-in.	2015-12-29 15:27:03 -08:00
Kenneth Graunke	227e250005	nir/spirv/glsl450: Add a helper for doing fclamp().	2015-12-29 15:27:03 -08:00
Kenneth Graunke	0f801752f2	nir/spirv/glsl450: Add helpers for calculating exp() and log().	2015-12-29 15:27:03 -08:00
Kenneth Graunke	9c9edd1ce8	nir/spirv/glsl450: Add an 'nb' shortcut variable. "nb" is shorter and more convenient than "&b->nb", especially when several operations are composed together into a larger expression tree.	2015-12-29 15:27:03 -08:00
Jason Ekstrand	5f04a61219	nir/lower_returns: Don't just change the type of a jump. It doesn't give core NIR the opportunity to update predecessors and successors. Instead, we have to remove and re-insert the instruction.	2015-12-29 14:51:47 -08:00
Jason Ekstrand	6fa47c9c17	nir/builder: Add a nir_jump helper	2015-12-29 14:48:34 -08:00
Jason Ekstrand	37a38548d4	glsl/types.cpp: Fix function_key_compare	2015-12-29 14:32:10 -08:00
Jason Ekstrand	b33f5d3889	nir/spirv: Update to the 1.0 GLSL.std.450 header	2015-12-29 14:29:03 -08:00
Jason Ekstrand	a33fcc0fd4	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir_builder_init_simple_shader and allows us to delete anv_nir_builder.h entirely.	2015-12-29 13:53:41 -08:00
Jason Ekstrand	0119773ffc	nir/builder: Add an init function that creates a simple shader for you A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-29 13:44:05 -08:00
Jason Ekstrand	5dd4386b92	nir/spirv: Use a C99-style initializer for structure fields This ensures that all unknown fields get zero-initialized so we don't have undefined values floating around.	2015-12-29 13:15:20 -08:00
Jason Ekstrand	e10b0e2b49	anv/pipeline: Use vs_prog_data.inputs_read when computing vb_used	2015-12-29 13:03:01 -08:00
Jason Ekstrand	0a2ab87947	nir/spirv: Move CF emit code into vtn_cfg.c	2015-12-29 12:50:31 -08:00
Jason Ekstrand	4e22cd2e32	nir/spirv: Add support for switch statements	2015-12-29 12:50:31 -08:00
Jason Ekstrand	cf555dc1c2	nir/spirv: A couple simple loop fixes	2015-12-29 12:50:31 -08:00
Jason Ekstrand	303d095f58	nir/spirv: Add an actual CFG data structure The current data structure doesn't handle much that we couldn't handle before. However, this will be absolutely crucial for doing switch statements. Also, this should fix structured continues.	2015-12-29 12:50:31 -08:00
Kristian Høgsberg Kristensen	55ca5b0e74	mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enums GL_ARB_shader_draw_parameters added two new system values. This gets us back to mapping mesa system values to the right TGSI semantics. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-29 12:15:01 -08:00
Ilia Mirkin	724134f683	nv50/ir: float(s32 & 0xff) = float(u8), not s8 Make sure to make conversion unsigned when we're ANDing the high bits away. Fixes corruption in dolphin. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-29 15:08:20 -05:00
Kristian Høgsberg Kristensen	581f81860e	i965: Reemit vertex state between indirect multi draws If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	f9283f2668	nir: Teach nir_opt_algebraic about adding and subtracting the same thing This optimizes a + b - b to just a. Modest shader-db results (BDW): total instructions in shared programs: 7842452 -> 7841862 (-0.01%) instructions in affected programs: 61938 -> 61348 (-0.95%) total loops in shared programs: 2131 -> 2131 (0.00%) helped: 263 HURT: 0 GAINED: 0 LOST: 0 but the optimization turns gl_VertexID - gl_BaseVertexARB into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the i965 hardware supports natively. That means we can avoid using the internal vertex buffer for gl_BaseVertexARB in this case. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	cddfc2cefa	i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	17ebb55a14	i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	b70616f3e7	i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered fs_visitor::emit_vs_system_value() looks like it's trying to handle SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the backend. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	1a59aeaebd	mesa: Add core mesa support for GL_ARB_shader_draw_parameters Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	42dd2c028d	mesa/vbo: Add draw_id field to struct _mesa_prim The drivers will need this for passing in gl_DrawIDARB. For indirect multidraw calls, we get the prim array and prim[i].draw_id == i and is redundant. But for non-indirect calls, we get one primitive at a time and need the draw_id field. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-29 10:39:25 -08:00
Aaron Watry	70d8dbc9a1	nir: Remove function overload in control flow test Fixes make check. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-29 09:42:14 -08:00
Jason Ekstrand	bbf99511d0	gen7/8/pipeline: s/vb_used/elements in emit_vertex_input	2015-12-29 09:40:22 -08:00
Nicolai Hähnle	7b8db37abb	radeonsi: add RADEON_REPLACE_SHADERS debug option This option allows replacing a single shader by a pre-compiled ELF object as generated by LLVM's llc, for example. This can be useful for debugging a deterministically occurring error in shaders (and has in fact helped find the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264). v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:07:04 -05:00
Nicolai Hähnle	7d1fc2cf51	radeonsi: count compilations in si_compile_llvm This changes the count slightly (because of si_generate_gs_copy_shader), but this is only relevant for the driver-specific num-compilations query. It sets the stage for the next commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:07:01 -05:00
Nicolai Hähnle	4711170239	gallium/util: add DEBUG_GET_ONCE_OPTION This is analogous to the already existing macros for BOOL, NUM, and FLAGS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:06:57 -05:00
Grazvydas Ignotas	da0e216e06	r600: fix constant buffer size programming When buffer size is less than 16, zero ends up being programmed as size, which prevents the hardware from fetching the correct values. Fix it by combining shift and align so that the value is always rounded up. Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-12-29 09:05:55 -05:00
Kristian Høgsberg Kristensen	fc03723bcd	vk: Fill out buffer surface state when updating descriptor set We can do this when we update the descriptor set instead of on the fly.	2015-12-28 21:57:56 -08:00
Kristian Høgsberg Kristensen	a00524a216	vk: Unstub VkSemaphore implementation There really is nothing to do for us here, at least with the current kernel interface.	2015-12-28 21:57:56 -08:00
Jason Ekstrand	5fab35d090	gen7/pipeline: Actually use inputs_read from the VS for laying out inputs	2015-12-28 18:21:11 -08:00
Jason Ekstrand	b090f9dce1	gen8/pipeline: Actually use inputs_read from the VS for laying out inputs	2015-12-28 18:21:11 -08:00
Jason Ekstrand	3eb108ef87	anv/meta: Fix the pos_out location for the vertex shader	2015-12-28 18:21:11 -08:00
Jason Ekstrand	b005fd62f9	nir/spirv: Add GLSL.std.450.h It accidentally got removed during the mass rename.	2015-12-28 15:46:22 -08:00
Jason Ekstrand	9c84b6cce0	anv/device: Set device->info sooner in CreateDevice anv_block_pool_init calls anv_block_pool_grow which checks device->info.has_llc to see if it needs to set caching parameters. If we don't set device->info early enough, this reads an undefined value which is probably 0 and not what we want on llc platforms. Found with valgrind.	2015-12-28 13:29:01 -08:00
Jason Ekstrand	763176a3e2	nir/lower_returns: Fix a bug in loop lowering	2015-12-28 13:22:09 -08:00
Kenneth Graunke	dfce9759ab	docs: Mark ARB_tessellation_shader as done on all i965 platforms. We now support all Intel GPUs which can do tessellation. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:08 -08:00
Kenneth Graunke	381a89cf2a	i965: Enable ARB_tessellation_shader on Gen7-7.5. We've resolved all the GPU hangs, and everything seems to be working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:05 -08:00
Kenneth Graunke	bd8ab8dedb	i965: Don't set interleave or complete on TCS EOT message. Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:03 -08:00
Kenneth Graunke	b7793783b3	i965: Release input URB Handles on Gen7/7.5 when TCS threads finish. Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:00 -08:00
Kenneth Graunke	6ceabb72ea	i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail. Gen7 uses bits 15:12 while Gen7.5+ uses bits 16:13. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:57 -08:00
Kenneth Graunke	5898cbae24	i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail. Gen7 uses 22:16 while Gen7.5+ uses 23:17. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:54 -08:00
Kenneth Graunke	1245724f72	i965: Port tessellation evaluation shaders to vec4 mode. This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:48 -08:00
Kenneth Graunke	889d987904	i965: Emit a real 3DSTATE_DS on Gen7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:45 -08:00
Kenneth Graunke	2c240b05e9	i965: Emit a real 3DSTATE_HS on Gen7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:34 -08:00
Kenneth Graunke	74b83fe368	i965: Add the TCS/TES state upload atoms to the gen7_atoms list. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:19 -08:00
Jason Ekstrand	7aaed91581	nir/spirv: Move to its own directory	2015-12-28 11:49:39 -08:00
Jason Ekstrand	d5fa51bdee	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in the removal of nir_function_overload	2015-12-28 10:56:31 -08:00
Jason Ekstrand	d9dcfafacc	nir/spirv: Use nir_build_alu for alu instructions	2015-12-28 10:35:31 -08:00
Jason Ekstrand	237f2f2d8b	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>	2015-12-28 09:59:53 -08:00
Jason Ekstrand	ea77b384e8	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in tessellation and the store_var changes that go with it.	2015-12-27 23:23:05 -08:00
Jason Ekstrand	f948767471	nir/lower_returns: Better algorithm as per connor	2015-12-27 22:50:45 -08:00
Jason Ekstrand	3489f66056	nir: Add a cursor helper for getting a cursor after any phi nodes	2015-12-27 22:50:14 -08:00
Ilia Mirkin	109c348284	nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion Also release the scratch allocation if any. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-27 21:33:36 -05:00
Ilia Mirkin	28e07fdd4a	nv50,nvc0: add a note when converting vertex elements using CPU Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-27 19:49:44 -05:00
Jason Ekstrand	c60456dfaa	nir/gather_info: Handle multi-slot variables in io bitfields	2015-12-24 00:47:20 -08:00
Jason Ekstrand	bbebd2de13	nir: Add a helper for getting the bitmask for a variable's location	2015-12-24 00:47:20 -08:00
Jason Ekstrand	4ff4310a78	nir/types: Expose glsl_type::count_attribute_slots()	2015-12-24 00:47:19 -08:00
Jason Ekstrand	0bc1b0fd23	nir/lower_return: Do it for real this time	2015-12-24 00:47:19 -08:00
Jason Ekstrand	e1b1d58bec	nir/cf: Make extracting or re-inserting nothing a no-op	2015-12-23 23:46:04 -08:00
Jason Ekstrand	eae352e75c	nir: Add a function for comparing cursors	2015-12-23 18:09:42 -08:00
Connor Abbott	41c7912d04	gallium/auxiliary: don't build NIR sources with MSVC2008 flags NIR has never been built with MSVC2008, so we shouldn't add MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get rid of the pragma in tgsi_to_nir.c. Build tested with freedreno. v2: Use MSVC2013_COMPAT_CFLAGS instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-12-23 20:46:48 -05:00
Jason Ekstrand	54c870ff61	nir/spirv: Add support for undefs in vtn_ssa_value()	2015-12-23 14:14:39 -08:00
Jason Ekstrand	2e823d5754	nir/spirv: Properly handle vector times matrix	2015-12-23 13:49:56 -08:00
Jason Ekstrand	452ba4db2b	nir/spirv: Create the correct type if a matrix-vector multiply produces a vector	2015-12-23 13:49:56 -08:00
Jason Ekstrand	5b30132388	nir/spirv: Fix some mem_ctx issues with create_vec	2015-12-23 13:49:56 -08:00
Jason Ekstrand	66168a798b	nir/spirv: Better document vtn_ssa_value.transposed	2015-12-23 13:49:56 -08:00
Jason Ekstrand	3b391892aa	anv/descriptor_set: Use anv_foreach_stage	2015-12-23 13:49:56 -08:00
Jason Ekstrand	72ceb99bab	anv: Mask out invalid stages in foreach_stage	2015-12-23 13:49:56 -08:00
Jason Ekstrand	5644b1cece	nir/spirv: Handle LogicalNot	2015-12-23 13:49:56 -08:00
Jason Ekstrand	6219a69589	nir/spirv: Handle derefs in vtn_ssa_value This is kind of a hack, but it makes vtn_ssa_value insert a load if the value requested is actually a deref. This shouldn't happen normally but, thanks to the impedance mismatch of the NIR function parameter model vs. the SPIR-V model, this can happen for function arguments.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	3ab1b7afa8	nir/spirv: Do boolean fixup on block loads We used to do it for variable loads on things of type "uniform" but that never got ported to block loads.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	af74ce5a19	spirv/nir: Handle non-vector extractions in vtn_composite_extract	2015-12-23 13:49:56 -08:00
Jason Ekstrand	79b8b42081	nir/spirv: Handle function calls	2015-12-23 13:49:56 -08:00
Jason Ekstrand	95990c96cc	nir: Create the params array in function_impl_create	2015-12-23 13:49:56 -08:00
Jason Ekstrand	a7f3e113ad	i965/nir: Remove return handling This was added because we were getting spurious returns coming out of SPIR-V. Now that we're calling lower_returns, we don't need this.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	ac975b73cf	anv/pipeline: Run lower_returns and inline_functions after spirv_to_nir	2015-12-23 13:49:56 -08:00
Jason Ekstrand	8fba4bf79f	nir: Add a function inlining pass	2015-12-23 13:49:56 -08:00
Jason Ekstrand	b21db9cea5	nir/builder: Add a copy_deref_var helper	2015-12-23 13:49:56 -08:00
Jason Ekstrand	23cfa683d5	nir: move nir_copy_var from anv_nir_builder to nir_builder	2015-12-23 13:49:56 -08:00
Jason Ekstrand	4aac03fe61	nir/clone: Add support for cloning a single function_impl This will be useful for things such as function inlining.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	98291b8f2c	nir: Add a helper for creating a "bare" nir_function_impl This is useful if you want to clone a single function_impl if, for instance, you wanted to do function inlining.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	86772c2488	nir/control_flow: Handle relinking top-level blocks This can happen if a function ends in a return instruction and you remove the return.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	1749e667ea	nir: Add a stub function inlining pass All it does is remove the return at the end, but it's good enough for simple functions.	2015-12-23 13:49:56 -08:00
Jason Ekstrand	413a9d3517	nir/print: Factor variable name lookup into a helper Otherwise, we have a problem when we go to print functions with arguments because their names get added to the hash table during declaration which happens after we print the prototype.	2015-12-23 13:49:56 -08:00
Anuj Phogat	52865efc41	i965: Add tr_mode and mip tail information in surface state dump Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-12-23 13:20:45 -08:00
Jordan Justen	8326eb13f2	i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD, says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are 64-byte aligned. This is different from previous gens, that were 32-byte aligned. v2 (Jordan): - CURBE Data Start Address is also 64-byte aligned. - The call to brw_state_batch should also use 64-byte alignment. - Improve PRM reference. v3: * New patch from Jordan. Always align base and size to 64 bytes. Fixes the following SSBO CTS tests on BDW: ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs And many other CS CTS tests as reported by Marta Lofstedt. (Commit message is from Iago, but in v3, code is from Jordan.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-22 23:54:02 -08:00
Rob Clark	843cec6d3a	freedreno/ir3: spelling.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-23 00:28:24 -05:00
Rob Clark	dc21747838	nir/print: print variable constant-initializers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-23 00:28:01 -05:00
Kenneth Graunke	6524897606	docs: Clarify that ARB_tessellation_shader is only done on i965/gen8+. Requested by kisak on IRC.	2015-12-22 20:14:35 -08:00
Kenneth Graunke	209d130dd1	docs: Mark ARB_tessellation_shader as done on i965/gen8+.	2015-12-22 18:50:38 -08:00
Kenneth Graunke	7738f3a988	i965: Enable ARB_tessellation_shader on Gen8+. Everything is in place and I'm not aware of any further issues. Tested with: - Piglit - Tessmark - Unigine Heaven - Shadow of Mordor - GRID Autosport I have patches to backport this to Haswell, Ivybridge, and Baytrail as well (the first Intel hardware to support tessellation), but there are still a lot of GPU hangs left to debug. So that will come later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:14 -08:00
Kenneth Graunke	794eb9d727	i965: Handle mix-and-match TCS/TES with separate shader objects. GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:11 -08:00
Kenneth Graunke	01b1b44d31	i965: Defer input lowering for tessellation stages until specialization. With tessellation shaders and SSO, we won't be able to always decide on VUE map layouts at LinkProgram time. Unfortunately, we have to delay it until shader specialization time. However, uniform lowering cannot be deferred - brw_codegen_*_prog() reads nir->num_uniforms. Fortunately, we don't need to defer it - uniform, system value, atomic, and sampler lowering can safely stay where it is. This patch moves those to brw_lower_nir()'s only caller, renames brw_lower_nir() to brw_nir_lower_io(), and introduces calls to that. For non-tessellation stages, I chose to call brw_nir_lower_io() from brw_create_nir(), so it's still done at the same time. There's no need to defer it, and doing it at LinkProgram time is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:10 -08:00
Kenneth Graunke	8bc073d601	i965: Automatically create a passthrough TCS when needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:09 -08:00
Kenneth Graunke	4ec3f0f4b9	i965: Start program_string_id from 1, not 0. This way, I can safely use brw_tcs_prog_key::program_string_id == 0 to mean "not filled out because no program exists", which avoids the need for adding an extra boolean to that struct. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:08 -08:00
Kenneth Graunke	2432643e89	i965: Create and set a new brw_tcs_prog_data::outputs_written field. When the application hasn't supplied a TCS, and we have to create one, we need to know what VS outputs to copy to TES inputs. To do this, we create a new program key field, and set it to the TES InputsRead bitfield. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:06 -08:00
Kenneth Graunke	239a4bdcd4	i965: Upload HS push constants whenever default tess. levels change. When using tessellation on OpenGL without a TCS, default values for gl_TessLevelOuter/gl_TessLevelInner are provided via the API. Core Mesa will flag ctx->DriverFlags.NewDefaultTessLevels whenever those values change. We add a corresponding BRW_NEW_DEFAULT_TESS_LEVELS flag and hook it up to HS push constants (which will be used to upload these default values to the autogenerated TCS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:05 -08:00
Kenneth Graunke	0d5cb4aef4	i965: Only call _mesa_load_state_parameters if prog exists. With the automatic-TCS creation, we won't have a prog, but still need to upload push constants. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:04 -08:00
Kenneth Graunke	a122af696c	i965: Switch TCS gl_program/gl_shader_program checks over to TES. Tessellation control shaders are optional, but evaluation shaders will always be present when using tessellation. However, we'll always enable the TCS (HS) hardware stage when using tessellation - we'll just create a program on the fly. That program, however, won't have a gl_program or gl_shader_program. So we shouldn't check brw->tess_ctrl_program or shader_prog->_LinkedShaders[MESA_SHADER_TESS_CTRL] - if we want to know whether tessellation is enabled, we should look for a TES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:03 -08:00
Kenneth Graunke	9d35fecfb9	i965: Remove unnecessary brw->tess_ctrl_program assertions. This is trying to enforce the fact that the hardware requires HS, TE, and DS to be enabled or disabled together. But it's kind of an ad-hoc attempt, and not too useful. More importantly, we aren't going to have a gl_shader_program for the TCS which is automatically generated when none is present. (We'll just handle it in the driver backend.) So, these will trip for no reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:02 -08:00
Kenneth Graunke	f46dbfaed9	i965: Consolidate BRW_NEW_TESS_{CTRL,EVAL}_PROGRAM flags. For several reasons, I don't think it's particularly useful to have separate flags: 1. Most of the time, tessellation shaders are paired, so both will be replaced at the same time. 2. The data layout is tightly coupled. Both need to agree on the number of per-patch slots in the VUE map. Even adding extra TCS outputs that aren't read by the TES will trigger the need for recompiles. 3. The TCS is optional from an API perspective, but required by the hardware whenever tessellation is enabled. So, atoms that deal with the TCS must check brw->tess_eval_program (BRW_NEW_TESS_EVAL_PROGRAM?) rather than brw->tess_ctrl_program to tell whether tessellation is enabled. So, not only is it unlikely to be useful, it's a bit confusing to get right. Simply using one flag for both simplifies this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:00 -08:00
Kenneth Graunke	8498cb4a45	i965: Only call brw_upload_tcs/tes_prog when using tessellation. If there's no evaluation shader, tessellation is disabled. The upload functions would just bail. Instead, don't bother calling them. This will simplify the optional-TCS case a bit, as brw_upload_tcs can assume that we're doing tessellation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:21:59 -08:00
Kenneth Graunke	2bcf989407	nir: Add a glsl_vec_type() helper. I need access to glsl_type::vec2_type from C. Wrapping vec() also gives us access to vec3 if we need it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:21:47 -08:00
Kenneth Graunke	0daf51e130	nir: Use writemasked store_vars in glsl_to_nir. Instead of performing the read-modify-write cycle in glsl->nir, we can simply emit a partial writemask. For locals, nir_lower_vars_to_ssa will do the equivalent read-modify-write cycle for us, so we continue to get the same SSA values we had before. Because glsl_to_nir calls nir_lower_outputs_to_temporaries, all outputs are shadowed with temporary values, and written out as whole vectors at the end of the shader. So, most consumers will still not see partial writemasks. However, nir_lower_outputs_to_temporaries bails for tessellation control shader outputs. So those remain actual variables, and stores to those variables now get a writemask. nir_lower_io passes that through. This means that TCS outputs should actually work now. This is a functional change for tessellation control shaders. v2: Relax the nir_validate assert to allow partial writemasks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-22 15:57:59 -08:00
Kenneth Graunke	7d539080c1	nir: Add a writemask to store intrinsics. Tessellation control shaders need to be careful when writing outputs. Because multiple threads can concurrently write the same output variables, we need to only write the exact components we were told. Traditionally, for sub-vector writes, we've read the whole vector, updated the temporary, and written the whole vector back. This breaks down with concurrent access. This patch prepares the way for a solution by adding a writemask field to store_var intrinsics, as well as the other store intrinsics. It then updates all producers to emit a writemask of "all channels enabled". It updates nir_lower_io to copy the writemask to output store intrinsics. Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks by doing a read-modify-write cycle (which is safe, because local variables are specific to a single thread). This should have no functional change, since no one actually emits partial writemasks yet. v2: Make nir_validate momentarily assert that writemasks cover the complete value - we shouldn't have partial writemasks yet (requested by Jason Ekstrand). v3: Fix accidental SSBO change that arose from merge conflicts. v4: Don't try to handle writemasks in ir3_compiler_nir - my code for indirects was likely wrong, and TTN doesn't generate partial writemasks today anyway. Change them to asserts as requested by Rob Clark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3]	2015-12-22 15:57:59 -08:00
Tapani Pälli	50fc4a9256	mesa: update gl_HelperInvocation support status in docs Was enabled for i965 and nvc0 by following commits: `c875e3cdd2` `39f51ec96f` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-12-22 15:14:02 +02:00
Tapani Pälli	f2be5b8ba4	mesa: fix interface matching done in validate_io Patch makes the following changes for interface matching: - do not try to match builtin variables - handle swizzle in input name, for example 'a.z' should match with 'a' - add matching by location - check that the number of inputs and outputs matches These changes make the interface matching tests work in: ES31-CTS.sepshaderobjs.StateInteraction The test still does not pass completely due to errors in rendering output. IMO this is unrelated to interface matching. Note that type matching is not done due to varying packing which changes the type of a variable; this can be added later on, preferably when we have a quicker way to iterate resources and have a complete list of all existing varyings (before packing) available. v2: add spec reference, return true on desktop since we do not have failing cases for it, inputs and outputs amount do not need to match on desktop. v3: add some more spec reference, remove desktop specifics since not used for now on desktop, add match by location qualifier, rename input_stage and output_stage as producer and consumer as suggested by Timothy. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-22 14:50:25 +02:00
Iago Toral Quiroga	5f8bb6fbb1	mesa: add SSBOs to the list of fragment shader side effects The i965 driver uses this function to decide if it can disable the FS unit in the absence of color/depth writes. We don't want to disable the unit in the presence of SSBOs, since the fragment shader could be writing to it. We could go a step further and check not just for the presence of SSBOs but also if the shader code writes to them. Does not look worth the trouble though and we are not doing this for atomic buffers either anyway. v2: put this into a generic _mesa_active_fragment_shader_has_side_effects function instead of having one specific for SSBOs (Jason). Fixes the following CTS test: ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Iago Toral Quiroga	9bbdd0eda4	i965: Ensure FS execution in presence of atomic buffers On Haswell we need to set the UAV_ONLY WM state bit when there are no colour or depth buffer writes and on all hardware we should set the early depth/stencil control field to PSEXEC unless early fragment tests are enabled to make sure that the fragment shader is executed regardless of whether per-fragment tests pass or not as the spec requires. So far we have been doing this for images only, but we should apply the same treatment to all side effectful scenarios. Suggested by Curro. This is not strictly required for compliance with the original ARB_shader_atomic_counters extension, it's only necessary to get the execution semantics specified in GL4.2+ right. v2: - Mark active_fs_has_side_effects as constant. (Curro) - Mention that this is only necessary to get the execution semantics specified in GL4.2+ right. (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Iago Toral Quiroga	1a95b87dad	mesa: Add a _mesa_active_fragment_shader_has_side_effects helper Some drivers can disable the FS unit if there is nothing in the shader code that writes to an output (i.e. color, depth, etc). Right now, mesa has a function to check for atomic buffers and the i965 driver also checks for images. Refactor this logic into a generic function that we can use for any source of side effects in a fragment shader. Suggested by Jason. v2: - Use '_Shader', as suggested by Tapani, to fix the following CTS test: ES31-CTS.shader_atomic_counters.advanced-usage-many-draw-calls2 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Kenneth Graunke	57f7c85dcf	i965: Implement gl_PatchVerticesIn by baking it into brw_tcs_prog_key. The hardware provides us no decent way of getting at the number of input vertices in the patch topology from the tessellation control shader. It's actually very surprising - normally this sort of information would be available in the thread payload. For the precompile, we guess that the number of vertices will be the same for both the input and output patches. This usually seems to be the case. On Gen8+, we could pass in an extra push constant containing this value. We may be able to do that on Haswell too. It's quite a bit trickier on Ivybridge, however. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
Kenneth Graunke	24be658d13	i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
Kenneth Graunke	a5038427c3	i965: Add tessellation evaluation shaders The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
Timothy Arceri	54daffef16	nir: remove field only used in GLSL IR when assigning varying locations This field is used as a flag to optimise out any varyings that don't have a matching varying on the other side of the interface. The value should be the same for all varyings (except for SSO but we can't optimise those) by the time they reach nir and is no longer needed. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-22 17:08:03 +11:00
Ben Skeggs	a8c4747602	nouveau: enable use of new kernel interfaces Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:17 +10:00
Ben Skeggs	5b614b141a	nvc0: remove use of deprecated sw class identifier Also emits a method to properly bind the class to a subchannel, which was missing previously. The kernel currently doesn't care, but this will break if it ever decides to (ie. to support multiple sw classes). Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:13 +10:00
Ben Skeggs	33a3ba8c59	nv50: fix g98+ vdec class allocation The kernel previously exposed incorrect classes for some of the chipsets that this code supports. It no longer does, but the older object ioctls have compatibility to avoid breaking userspace. This needs to be fixed before switching over to the newer interfaces. Rather than hardcoding chipset->class like the rest of the driver does, this makes use of (new) sclass queries to determine what's available. v2. - update to use symbolic class identifier from <nvif/class.h> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:10 +10:00
Ben Skeggs	791a3e1850	nouveau: remove use of deprecated nouveau_device_wrap() Switching to the newer libdrm entry-points tells libdrm that it's OK to make use of newer kernel interfaces. We want to be able to isolate any bugs to either the interfaces changes, or the use of NVIF itself. As such, this commit has a slight hack which forces libdrm to continue using the older kernel interfaces. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:08 +10:00
Ben Skeggs	323d4da372	nouveau: fix screen creation failure paths The winsys layer would attempt to cleanup the nouveau_device if screen init failed, however, in most paths the pipe driver would have already destroyed it, resulting in accesses to freed memory etc. This commit fixes the problem by allowing the winsys to detect whether the pipe driver's destroy function needs to be called or not. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:05 +10:00
Ben Skeggs	6c1bfff66c	nouveau: return nouveau_screen from hw-specific creation functions Kills off a void cast. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:03 +10:00
Ben Skeggs	1a9ec8e062	nouveau: remove use of deprecated nouveau_device::drm_version v2. update for libdrm nouveau_drm::lib_version removal Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:01 +10:00
Ben Skeggs	a458ffacba	nouveau: remove use of deprecated nouveau_device::fd Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:23:59 +10:00
Ben Skeggs	a8abdf2f35	nouveau: bump required libdrm version to 2.4.66 v2. forgot bump for non-gallium driver Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:23:27 +10:00
Dave Airlie	d19106649f	r600: fix viewport clipping handling (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (we don't have enough info to program VPORT_PROVOKE). Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop vport provoke write, drop initial state writing this on evergreen, only program it on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-22 09:09:56 +10:00
Dave Airlie	73e7c5fd7f	radeonsi: fix viewport clipping handling. (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (We don't know if oViewport is constant so we skip this.) Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop writing to provoke disable, drop write in initial state. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:52 +10:00
Dave Airlie	847f91f4e5	r600: drop VTX_CNT_EN write from initial state we always program this in shader stages atom now. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:48 +10:00
Nicolai Hähnle	ea8c0b16ec	gallium/radeon: fix regression in a number of driver queries This rather silly mistake was introduced by commit `01910676`. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-21 15:47:10 -05:00
Ben Widawsky	0865088cca	i965: Only apply CS stall workaround pre-SKL As per the docs. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-21 10:42:42 -08:00
Ilia Mirkin	f7b7145123	glx/dri3: a drawable might not be bound at wait time A trace of Alien Isolation hit this on nouveau. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-21 06:43:58 -05:00
Emil Velikov	37186c43b5	docs: add news item and link release notes for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-21 10:13:17 +00:00
Emil Velikov	1c1994da58	docs: add sha256 checksums for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b9b19162ee`)	2015-12-21 10:11:28 +00:00
Emil Velikov	bb5adf065f	docs: add release notes for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `261daab6b4`)	2015-12-21 10:11:27 +00:00
Kristian Høgsberg Kristensen	220ac9337b	vk: Only require wc bo mmap for !llc GPUs	2015-12-19 22:25:57 -08:00
Kristian Høgsberg Kristensen	b49aaf5de0	vk: Remove stale 48 bit addresses FIXMEs This has worked fine for a long time.	2015-12-19 22:20:45 -08:00
Kristian Høgsberg Kristensen	c4802bc44c	vk/gen8: Implement VkEvent for gen8 We use PIPE_CONTROL for setting and resetting the event from cmd buffers and MI_SEMAPHORE_WAIT in polling mode for waiting on an event.	2015-12-19 22:17:19 -08:00
Dave Airlie	97eee90547	glsl: count attributes for vertex inputs properly. This function deals with vertex inputs and fragment outputs, so we should count the attribute locations correctly for the vertex inputs. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 17:57:43 +10:00
Kenneth Graunke	14193e4643	ralloc: Fix ralloc_adopt() to set the old context's last child's parent. I was cleverly using one iteration to obtain a pointer to the last item in ralloc's singly linked child list, while also setting parents. Unfortunately, I forgot to set the parent on that last item. Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-18 23:30:51 -08:00
Dave Airlie	b476c587e3	glsl: fix transform feedback for 64-bit outputs. This fixes the calculations for transform feedback for doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:26 +10:00
Dave Airlie	64cfacf319	glsl: fix partial marking for fp64 types. This doubles the element width for the types that are greater than 2 elements wide. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:26 +10:00
Dave Airlie	1fc39dae22	glsl: only update doubles inputs for vertex inputs. This doesn't apply to other stages. This is only used in the mesa/st code, which needs further fixes. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:25 +10:00
Kristian Høgsberg Kristensen	8ac46d84ff	vk: Fix check for I915_PARAM_MMAP_VERSION Comparing the wrong thing for < 1.	2015-12-18 17:24:19 -08:00
Eric Anholt	f1fb85e544	vc4: Do instruction scheduling on the QIR to hide texture fetch latency. This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)	2015-12-18 17:12:10 -08:00
Eric Anholt	5278c64de5	vc4: Fix latency handling for QPU texture scheduling. There's only high latency between a complete texture fetch setup and collecting its result, not between each step of setting up the texture fetch request.	2015-12-18 17:09:03 -08:00
Eric Anholt	960f48809f	vc4: Keep sample mask writes from being reordered after TLB writes Fixes a regression I noticed after introducing scheduling on the QIR. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-18 17:09:03 -08:00
Dave Airlie	5dc22cadb5	glsl: fix count_attribute_slots to allow for different 64-bit handling Vertex shader input attributes are handled differently than internal varyings between shader stages: dvec3 and dvec4 only count as one slot for vertex attributes, but for internal varyings, they count as 2. This patch comments all the uses of this API to clarify what we pass in, except one which needs further investigation. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 12:00:00 +11:00
Dave Airlie	69ea66231e	glsl: use dual slot helper in the linker code. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:55 +11:00
Dave Airlie	d97b060e6f	glsl/fp64: add helper for dual slot double detection. The old function didn't work for matrices, and we need this in other places to fix some other problems, so move to a helper in glsl type and fix the one user so far. A dual slot double is one that has 3 or 4 components in its base type. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:49 +11:00
Dave Airlie	9fbcd8e847	glsl: pass stage into mark function Don't use a bool here, as for some 64-bit fixes we need the stage. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:42 +11:00
Rob Herring	b201a6ed9f	freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled Android builds with -Werror=pointer-to-int-cast causing an error on 32-bit builds. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-18 14:01:07 -05:00
Matt Turner	bb9eb59933	i965/vec4: Optimize predicate handling for any/all. For a select whose condition is any(v), instead of emitting cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D mov(8) g7<1>.xUD 0x00000000UD (+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D (+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD we now emit cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D (+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-18 13:20:13 -05:00
Matt Turner	c8a74e3a4e	nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:13 -05:00
Matt Turner	21cd298aec	glsl: Implement all(v) as all_equal(v, true). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:13 -05:00
Matt Turner	2268a50ffd	glsl: Remove ir_unop_any. The GLSL IR to TGSI/Mesa IR paths for any_nequal have the same optimizations the ir_unop_any paths had. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:12 -05:00
Matt Turner	249bb89617	glsl: Implement any(v) as any_nequal(v, false). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:12 -05:00
Nicolai Hähnle	0a6a17b9d7	gallium/radeon: only dispose locally created target machine in radeon_llvm_compile Unify the cleanup paths of the function rather than duplicating code. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-18 12:17:40 -05:00
Jordan Justen	5e82a91324	anv/gen8: Add support for gl_NumWorkGroups Co-authored-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-18 01:45:11 -08:00
Jason Ekstrand	d7f66f9f6f	nir/spirv: Array lengths are constants not literals	2015-12-17 16:36:29 -08:00
Roland Scheidegger	61e5f8d073	gallium/util: (trivial) include p_shader_tokens.h in u_simple_shaders.h as it uses definition from it (enum tgsi_return_type).	2015-12-18 01:02:16 +01:00
Roland Scheidegger	6743c68a11	draw: fix clip test with NaNs NaNs mean it should be clipped, otherwise the NaNs might get passed to the next stages (if clipping didn't happen for another reason already), which might cause all kind of problems. The llvm path got this right already (possibly by luck), but this isn't used when there's a gs active. Found by code inspection, verified with some hacked piglit test and some more hacked debug output. (Note the clipper can still itself incorrectly generate NaN and INF position values in its output prims (at least after w divide / viewport transform) even if the inputs weren't NaNs, if the position data of the vertices is "sufficiently bad".) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-18 00:57:07 +01:00
Roland Scheidegger	44e87b7b7b	draw: fix pstipple and aaline stages wrt sampler_views/samplers Those stages only really work for OGL-style texturing (so number of samplers and views mostly the same, certainly for the max values). These get often set up all at once, thus there might be max number of both even if all of them are just NULL. We must not set the max number of samplers and views to the same value since that will lead to terrible things if a driver supports more views than samplers (and the state tracker set up all the views). (This will not make these stages magically work if a shader uses dx10-style texturing, they might still replace an actually used sview in that case.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-18 00:55:35 +01:00
Jason Ekstrand	1473a8dc6f	anv/formats: Add more 64-bit formats	2015-12-17 13:51:09 -08:00
Jason Ekstrand	167809365b	anv/formats: Add more PACK32 formats	2015-12-17 13:44:50 -08:00
Miklós Máté	6723b61753	swrast: move two global defines to the only place where they are used Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	555f67c3d7	mesa: improve debug log in atifragshader Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	5150d56ec4	program: fix comment about the fog formula Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	7279453da5	mesa: Don't leak ATIfs instructions in DeleteFragmentShader Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-17 12:09:58 -08:00
Jason Ekstrand	952bf05897	anv/image: Properly report buffer features	2015-12-17 11:52:31 -08:00
Jason Ekstrand	3395ca17d1	isl: Add a is_storage_image_format helper	2015-12-17 11:45:04 -08:00
Jason Ekstrand	b1325404c5	anv/device: Handle zero-sized memory allocations	2015-12-17 11:00:38 -08:00
Oded Gabbay	6e44bbe0f5	configure.ac: fix test for SSE4.1 assembler support This patch modifies the SSE4.1 test in configure.ac to use a global variable to initialize vector variables. In addition, we now return the value of the computation instead of 0. This is done so gcc 4.9 (and lower) won't optimize the SSE4.1 assembly instructions (when using -O1 and higher), because then the configure test might incorrectly pass even though the assembler doesn't support the SSE4.1 instructions (the test will pass because the compiler does support the intrinsics). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jonathan Gray	4ef44bb484	configure: check for python2.7 for PYTHON2 Check for a 'python2.7' binary, 'python' and 'python2' are not provided by the OpenBSD python 2.7.x packages. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jonathan Gray	7f585a6a98	configure.ac: use pkg-config for libelf Use PKG_CHECK_MODULES to get the flags to link libelf v2: keep AC_CHECK_LIB as a fallback for elfutils provided libelf that doesn't install a pkg-config file. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jordan Justen	e97b207654	i965/screen: Allow OpenGLES 3.1 for gen8+ OpenGLES 3.1 cannot be enabled for gen 7 (Ivy Bridge, Haswell) since they are still missing ARB_stencil_texturing. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:37:40 -08:00
Jordan Justen	3b5d442661	i965: Enable compute shaders in more cases for OpenGLES 3.1 Previously we were checking the desktop OpenGL ARB_compute_shader requirements, but for OpenGLES 3.1, the requirements are lower. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:37:23 -08:00
Jordan Justen	3e8a6e468b	main/version: Don't require ARB_compute_shader for OpenGLES 3.1 The OpenGL ARB_compute_shader extension specification requires at least 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1 only required 128. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 20:36:16 -08:00
Jordan Justen	a9d934726e	main: Allow compute shaders to be compiled with OpenGLES 3.1 Previous OpenGLES 3.1 testing had been done when ARB_compute_shader was overridden to enabled. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:35:55 -08:00
Jordan Justen	3507d0b7f9	main: Add MESA_VERBOSE=api for LinkProgram & UseProgram Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-16 20:35:51 -08:00
Matt Turner	257fb76403	ir_to_mesa: Skip useless comparison instructions. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 19:59:05 -08:00
Kenneth Graunke	4a5cff24d7	glsl: Remove inverse() from GLSL 1.20 and 1.30. I apparently regressed this when rewriting the built-ins using ir_builder, in `76d2f73643`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93387 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-16 19:32:21 -08:00
Samuel Pitoiset	695ae816da	nv50: free memory allocated by the prog which reads MP perf counters This fixes a memory leak introduced in `6a9c151` ("nv50: add compute-related MP perf counters on G84+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 21:52:43 -05:00
Brian Paul	f992d02ba2	st/osmesa: add OSMesaCreateContextAttribs() function As with the previous commit, except for gallium. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:39:05 -07:00
Brian Paul	a34e7612dc	osmesa: add new OSMesaCreateContextAttribs function This allows specifying a GL profile and version so one can get a core- profile context. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:38:51 -07:00
Brian Paul	c2c0983215	svga: don't use debug code in update_state() in release builds Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:38:15 -07:00
Jason Ekstrand	c643e9cea8	anv/state: Allow levelCount to be 0 This can happen if the client is creating an image view of a textureable surface and they only ever intend to render to that view.	2015-12-16 17:34:57 -08:00
Samuel Pitoiset	aeee7f2a4d	nv50,nvc0: free memory allocated by performance metrics The destroy_query() helper was actually never called. This fixes a memory leak while monitoring performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 23:03:08 +01:00
Samuel Pitoiset	9aca60bfb0	nvc0: free memory allocated by the prog which reads MP perf counters This fixes a long-standing memory leak (even before all my query related changes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 22:00:57 +01:00
Samuel Pitoiset	8022c7480e	nvc0: fix metric-achieved_occupancy calculation on Kepler The maximum number of resident warps per multiprocessor is 64 on Kepler instead of 48 on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-16 22:00:57 +01:00
Christian König	a87a1420d6	st/va: remove fence handling v3 It's nonsense to drain the pipeline like this. v2: keep the drain for DMA-buf exports. v3: flush before the export and after compositing and add TODO comment. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-16 21:13:42 +01:00
Neil Roberts	61cdb7665f	Revert "i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals" This reverts commit `839793680f`. The patch was breaking DRI3 because driGLFormatToImageFormat does not handle MESA_FORMAT_B8G8R8X8_SRGB which ended up making it fail to create the renderbuffer and it would later crash. It's not trivial to add this format because there is no __DRI_IMAGE_FORMAT nor __DRI_IMAGE_FOURCC define for the format either. I'm not sure how difficult adding this would be and whether adding a new format would require some sort of new version for DRI. Seeing as this might take a while to fix I think it makes sense to just revert the patch in the meantime in order to avoid regressing master. It is also not handled in intel_gles3_srgb_workaround and there may be other cases where it breaks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93388 Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-16 17:35:33 +00:00
Neil Roberts	8c5310da9d	i965: Fix crash when calling glViewport with no surface bound If EGL_KHR_surfaceless_context is used then glViewport can be called with NULL for the draw and read surfaces. This was previously causing a crash because the i965 driver tries to use this point to invalidate the surfaces and it was dereferencing the NULL pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2015-12-16 16:39:29 +00:00
Neil Roberts	4c7c9e4602	mesa/blit: Don't require the same format for multisample blits Previously the GL spec required that whenever glBlitFramebuffer is used with either buffer being multisampled, the internal formats must match. However the GL 4.4 spec was later changed to remove this restriction. In the section entitled “Changes in the released Specification of July 22, 2013” it says: “Relax BlitFramebuffer in section 18.3.1 so that format conversion can take place during multisample blits, since drivers already allow this and some apps depend on it.” If most drivers already allowed this in earlier versions I think it's safe to assume that this is a spec bug and it should also be allowed in all versions. This patch just removes the restriction on desktop GL. For GLES there are conformance tests that assert the previous behaviour so it is probably safer to leave it in. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92706 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 16:20:36 +00:00
Julien Isorce	89eb342def	st/va: retrieve size from the temporary img variable "image" is not ready yet since it will be set at the end of the function by: image = img; Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-16 14:12:31 +00:00
Roland Scheidegger	8e195a6251	draw: handle edge flags in llvm path We just ignored them altogether. While this feature is rather old-fashioned supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). v2: comment fixes, and make the use of the edgeflag in clipmask consistent with when it's actually there (should be impossible to hit a case where the difference would actually matter but still...) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:25 +01:00
Roland Scheidegger	13c0b1c780	draw: don't set start_instance and instance id for pt emit This just adds confusion, these parameters are used when fetching vertices by translate, but certainly not when emitting hw vertices for drivers, they make no sense there (setting them has no consequences otherwise since there won't be any elements with instance_divisor set). So just set them to 0 (the draw_pipe_vbuf code for emitting vertices when the draw pipeline is run already does exactly that). Also while here do some whitespace cleanup. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:14 +01:00
Jason Ekstrand	b2fe8b4673	nir/spirv: Add a missing break statement	2015-12-15 17:24:18 -08:00
Jason Ekstrand	1c51d91bfe	anv/pipeline: Allow the user to pass a null MultisampleCreateInfo According to section 5.2 of the Vulkan spec, this is allowed for color-only rendering pipelines.	2015-12-15 16:26:10 -08:00
Jason Ekstrand	d61ff1ed08	anv/descriptor_set: Initialize immutable_samplers to NULL Previously this wasn't a problem. However, with the new API update, descriptor sets can now be sparse so the client doesn't have to provide an entry for every binding. This means that it's possible for a binding to be uninitialized other than the memset. In that case, we want to have a null array of immutable samplers.	2015-12-15 16:24:22 -08:00
Jason Ekstrand	d7cb1634d2	nir/lower_system_values: Refactor and use the builder. Now that we have a helper in the builder for system values and a helper in core NIR to get the intrinsic opcode, there's really no point in having things split out into a helper function. This commit "modernizes" this pass to use helpers better and look more like newer passes. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Jason Ekstrand	f6910f072a	nir/builder: Add a load_system_value helper While we're at it, go ahead and make nir_lower_clip use it. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Jason Ekstrand	ca5be008bc	nir/lower_system_values: Stop supporting non-SSA The one user of this (i965) only ever calls it while in SSA form. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Samuel Pitoiset	276837cbe4	nvc0: remove old comment related to metric calculations I forgot to remove it when I refactored all performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-15 22:49:37 +01:00
Eric Anholt	3858722740	vc4: Add support for dumping executed commands to a file. The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools. For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.	2015-12-15 12:05:48 -08:00
Eric Anholt	07570edb98	vc4: Import updated vc4_drm.h with hang state.	2015-12-15 12:02:54 -08:00
Eric Anholt	c5b886b028	vc4: Only update vc4->msaa when the framebuffer changes. Any update here should have been the same as in vc4_set_framebuffer_state(), except for the point where vc4_blit.c temporarily sets different state for its different buffers.	2015-12-15 12:02:53 -08:00
Eric Anholt	f2cf2a63f1	vc4: Don't consider nr_samples==1 surfaces to be MSAA. This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.	2015-12-15 12:02:53 -08:00
Eric Anholt	da92f16c50	vc4: Fix min() wrapper definition for the simulator's kernel code.	2015-12-15 12:02:53 -08:00
Eric Anholt	02bcb443ee	vc4: Warn instead of abort()ing on exec ioctl failures. It's really harsh to abort() the X Server because of a momentary failure (particularly -ENOMEM). I don't see a way to pass an -ENOMEM up the stack from here, but we can at least log to stderr before proceeding on. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-15 12:02:44 -08:00
Jason Ekstrand	28c4ef9d6c	anv/device: Bump the size of the instruction block pool Some CTS test shaders were failing to compile. At some point soon, we really need to make a real pipeline cache and stop using a block pool for this.	2015-12-15 11:49:28 -08:00
Jason Ekstrand	306abbead3	anv/pipeline: Properly set IncludeVertexHandles in 3DSTATE_GS	2015-12-15 11:37:18 -08:00
Jason Ekstrand	2d4b7eda23	nir/spirv: Add support for more CS intrinsics	2015-12-15 10:20:23 -08:00
Jason Ekstrand	1035108a7f	nir/lower_system_values: Add support for computed builtins. In particular, this commit adds support for computing gl_GlobalInvocationID and gl_LocalInvocationIndex from other intrinsics.	2015-12-15 10:20:23 -08:00
Jason Ekstrand	630b9528b3	shader_enums: Add enums for gl_GlobalInvocationID and gl_LocalInvocationIndex	2015-12-15 10:20:23 -08:00
Jason Ekstrand	7ebd84fa4b	nir/lower_system_values: Refactor and use the builder. Now that we have a helper in the builder for system values and a helper in core NIR to get the intrinsic opcode, there's really no point in having things split out into a helper function. This commit "modernizes" this pass to use helpers better and look more like newer passes.	2015-12-15 10:20:23 -08:00
Jason Ekstrand	c26e889a44	nir/builder: Add a load_system_value helper While we're at it, go ahead and make nir_lower_clip use it. Cc: Rob Clark <robclark@gmail.com>	2015-12-15 10:20:23 -08:00
Jason Ekstrand	de67456d6d	nir/lower_system_values: Stop supporting non-SSA The one user of this (i965) only ever calls it while in SSA form.	2015-12-15 10:20:23 -08:00
Andreas Boll	a2140b0571	docs: Replace sourceforge logo with a text link Fixes the following Lintian (Debian package checker) error: privacy-breach-logo usr/share/doc/mesa-common-dev/contents.html (http://sourceforge.net/sflogo.php?group_id=3&type=1) usr/share/doc/mesa-common-dev/thanks.html (http://sourceforge.net/sflogo.php?group_id=3&type=1) The extended description of this tag is: This package creates a potential privacy breach by fetching a logo at runtime. Before using a local copy you should check that the logo is suitable for main. You can get help with determining this by posting a link to the logo and a copy of, or a link to, the logo copyright and license information to the debian-legal mailing list. Please replace any scripts, images, or other remote resources with non-remote resources. It is preferable to replace them with text and links but local copies of the remote resources are also acceptable as long as they don't also make calls to remote services. Please ensure that the remote resources are suitable for Debian main before making local copies of them. Severity: serious, Certainty: possible Check: files, Type: binary, udeb Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-15 17:57:25 +01:00
Chad Versace	64f0ee73e0	isl: Add func isl_surf_get_image_offset_sa The function calculates the offset to a subimage within the surface, in units of surface samples. All unit tests pass with `make check`. (Admittedly, though, there are too few unit tests).	2015-12-15 08:46:09 -08:00
Chad Versace	53504b884e	isl: Fix calculation of array pitch for layout GEN4_2D The height of the miptree's right half was not large enough. Found by `make check` in test_isl_surf_get_offset, which is added in the next commit.	2015-12-15 08:46:09 -08:00
Chad Versace	f7e36f9f66	isl: Move it to a standalone directory The plan all along was to eventually move isl out of the Vulkan directory, because I intended i965 and anvil to share it. A small problem I encountered when attempting to write unit tests for isl precipitated the move. I discovered that it's easier to get isl unit tests to build if I remove the extra, unneeded dependencies injected by src/vulkan/Makefile.am. And the easiest way to remove those unneeded dependencies is to move isl out of src/vulkan. (Unit tests come in subsequent commits).	2015-12-15 08:45:49 -08:00
Nicolai Hähnle	c8d9d289ff	radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts The incorrectly computed register count caused lockups. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Nicolai Hähnle	149d049676	gallium/radeon: remove unnecessary test in r600_pc_query_add_result This test is a left-over of the initial development. It is unneeded and misleading, so let's get rid of it. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Nicolai Hähnle	819543adb4	mesa/main: use BITSET_FOREACH_SET in perf_monitor_result_size This should make the code both faster and slightly clearer. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-15 11:23:40 -05:00
Emil Velikov	9c0773958e	docs: add news item and link release notes for 11.1.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-15 15:07:03 +00:00
Emil Velikov	b8394ef3df	docs: add sha256 checksums for 11.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `525f3c2c28`)	2015-12-15 15:07:02 +00:00
Emil Velikov	5497e119a5	docs: Update 11.1.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5a616125ac`)	2015-12-15 15:07:02 +00:00
Rob Clark	e677b3047b	freedreno/a4xx: fix fragcoord.z + fragdepth It seems like disabling earlyz on a4xx also, by default, disables fragcoord.z to the FS. For frag shaders that both read fragcoord(.z) and write fragdepth, we need to set some extra bits to prevent a lockup. This lets us get rid of the hack of disabling fragcoord.z (which prevented 0ad from lockups, but resulted in rendering corruption). Also fixes fbo-depth-sample-compare. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:40:54 -05:00
Rob Clark	cad0920d11	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00
Rob Clark	249b2be3bc	freedreno/ir3/cmdline: don't dump nir by default By default we only want the disasm dumped, which we get anyways. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00
Christian König	10b7a7c344	st/va: remove nonsense HEVC picture id handling The picture id in this case is a VA-API surface handle, checking for a certain value can't be correct. Signed-off-by: Christian König <christian.koenig@amd.com>	2015-12-15 11:25:02 +01:00
Chris Forbes	af5ca43f26	i965: Allocate URB space for HS and DS stages when required. v2: (by Ken, incorporating feedback from Matt Turner): - Rewrite the push constant allocation code to be clearer. - Only apply the minimum VS entries workaround on Gen 8. v3: (by Ken) - Fix a bug in v2 where we failed to allocate the full push constant space when the number of enabled stages didn't divide the available push constant space evenly. (Any left over space is now allocated to the PS, as it was in v1.) - Fix an off-by-one error in v2's number of enabled stages calculation. - Use DIV_ROUND_UP for nicer formatting. - Line wrapping fixes. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-15 02:16:14 -08:00
Jason Ekstrand	8224571ef8	vec4/generator: Actually pass the sampler into generate_tex This is an artifact of the way the separate samplers/textures series ended up getting sent out and rebased. This should fix a number of CTS tests involving geometry shaders.	2015-12-14 21:13:52 -08:00
Jordan Justen	7edcc59a7b	anv: Rename gs_vec4 to gs_kernel The code generated may be vec4 or simd8 depending on how we start the compiler. To run the GS in SIMD8, set the INTEL_SCALAR_GS environment variable. This was added in: commit `36fd653817` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Wed Mar 11 23:14:31 2015 -0700 i965: Add scalar geometry shader support. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 18:23:14 -08:00
Jordan Justen	a3c5c339a8	nir/spirv_to_nir: Use a minimum of 1 for GS invocations glslang is giving us 0, which causes the SIMD8 GS compile to hit an assert. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 18:23:14 -08:00
Timothy Arceri	8c0963f9d3	docs: mark input/output block locations as DONE Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:51 +11:00
Timothy Arceri	0aeb9b3e5e	glsl: add support for explicit locations inside interface blocks This change also adds explicit location support for structs and interfaces which is currently missing in Mesa but is allowed with SSO and GLSL 1.50+. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:44 +11:00
Timothy Arceri	183c606066	glsl: simplify interface matching This makes the code easier to follow, should be more efficient and will make it easier to add matching via explicit locations in the following patch. This patch also replaces the hash table with the newer resizable hash table, which should be more suitable as the table is likely to only contain a small number of entries. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:39 +11:00
Roland Scheidegger	8e264765a4	draw: remove clip_vertex from vertex header vertex header had both clip_pos and clip_vertex. We only really need one (clip_pos) because the draw llvm shader would overwrite the position output from the vs with the viewport transformed. However, we don't really need the second one, which was only really used for gl_ClipVertex - if the shader didn't have that the values were just duplicated to both clip_pos and clip_vertex. So, just use this from the vs output instead when we actually need it. Also change clip debug to output both the data from clip_pos and the clipVertex output (if available). Makes some things more complex, some things less complex, but seems more easy to understand what clipping actually does (and what values it uses to do its magic). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	1775400a20	draw: use clip_pos, not clip_vertex for the fake guardband xy point clipping Seems obvious now this should use the data from position and not clip_vertex (albeit might not really make a difference). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	8575ddb644	draw: rename vertex header members clip -> clip_vertex and pre_clip_pos -> clip_pos. Looks more obvious to me what these values actually represent (so use something resembling the vs output names). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	1b22815af6	draw: don't pretend have_clipdist is per-vertex This is just for code cleanup, conceptually the have_clipdist really isn't per-vertex state, so don't put it there (just dependent on the shader). Even though there wasn't really any overhead associated with this, we shouldn't store random shader information in the vertex header. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	9e3f2af3c3	draw: use position not clipVertex output for xyz view volume clipping I'm pretty sure this should use position (i.e. pre_clip_pos) and not the output from clipVertex. Albeit piglit doesn't care. It is what we use in the clip test, and it is what every other driver does (as they don't even have clipVertex output and lower the additional planes to clip distances). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Jason Ekstrand	f46544dea1	anv: Fix CUBE storage images	2015-12-14 16:59:59 -08:00
Jason Ekstrand	783a21192c	anv: Add support for storage texel buffers	2015-12-14 16:51:12 -08:00
Jason Ekstrand	1f98bf8da0	anv: Pass an isl_format into fill_buffer_surface_state	2015-12-14 16:14:20 -08:00
Jason Ekstrand	091b6156dd	i965/fs: Push small uniform arrays Unfortunately, this also means that we need to use a slightly different algorithm for assign_constant_locations. The old algorithm worked based on the assumption that each read of a uniform value read exactly one float. If it encountered a MOV_INDIRECT, it would immediately bail and push the whole thing. Since we can now read ranges using MOV_INDIRECT, we need to be able to push a series of floats without breaking them up. To do this, we use an algorithm similar to the one in split_virtual_grfs.	2015-12-14 15:58:10 -08:00
Jason Ekstrand	63c313de84	i965/fs: Rename demote_pull_constants to lower_constant_loads	2015-12-14 15:58:10 -08:00
Jason Ekstrand	75f33a6420	i965/vec4: Get rid of the uniform_size array	2015-12-14 15:58:09 -08:00
Jason Ekstrand	eb76f226cf	i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD	2015-12-14 15:58:09 -08:00
Jason Ekstrand	a487f0284f	i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants This commit moves us to an instruction based model rather than a register-based model for indirects. This is more accurate anyway as we have to emit instructions to resolve the reladdr. It's also a lot simpler because it gets rid of the recursive reladdr problem by design. One side-effect of this is that we need a whole new algorithm in move_uniform_array_access_to_pull_constants. This new algorithm is much more straightforward than the old one and is fairly similar to what we're already doing in the FS backend.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	46f5396846	i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9c36c40845	i965/fs: Get rid of the param_size array	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9024353db3	i965/fs: Stop relying on param_size in assign_constant_locations Now that we have MOV_INDIRECT opcodes, we have all of the size information we need directly in the opcode. With a little restructuring of the algorithm used in assign_constant_locations we don't need param_size anymore. The big thing to watch out for now, however, is that you can have two ranges overlap where neither contains the other. In order to deal with this, we make the first pass just flag what needs pulling and defer assigning pull constant locations until later.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	9f46af9e41	i965/fs: Get rid of reladdr We aren't using it anymore.	2015-12-14 15:58:09 -08:00
Jason Ekstrand	a3cd95a884	i965/fs: Use MOV_INDIRECT for all indirect uniform loads Instead of using reladdr, this commit changes the FS backend to emit a MOV_INDIRECT whenever we need an indirect uniform load. We also have to rework some of the other bits of the backend to handle this new form of uniform load. The obvious change is that demote_pull_constants now acts more like a lowering pass when it hits a MOV_INDIRECT.	2015-12-14 15:58:09 -08:00
Jordan Justen	c4219bc6ff	anv/cmd_buffer: Gen 8 requires 64 byte alignment for push constant data See MEDIA_CURBE_LOAD, CURBE Data Start Address & CURBE Total Data Length Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 15:39:07 -08:00
Jason Ekstrand	f0313a5569	anv: Add initial support for cube maps This fixes 486 cubemap CTS tests.	2015-12-14 15:36:30 -08:00
Kenneth Graunke	77cc2666b1	i965: Use DIV_ROUND_UP() in gen7_urb.c code. This is a newer convention, which we prefer over ALIGN(x, n) / n. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-14 14:56:14 -08:00
Kenneth Graunke	9f0944d15b	i965: Make TES inputs match TCS outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:29 -08:00
Kenneth Graunke	4fac950010	i965: Force VS -> TCS varyings to use the SSO VUE map layout. The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:18 -08:00
Kenneth Graunke	bee42cc1f7	i965: Handle TCS outputs and TES inputs. TCS outputs and TES inputs both refer to a common "patch URB entry" shared across all invocations. First, there are some number of per-patch entries. Then, there are per-vertex entries accessed via an offset for the variable and a stride times the vertex index. Because these calculations need to be done in both the vec4 and scalar backends, it's simpler to just compute the offset calculations in NIR. It doesn't necessarily make much sense to use per-vertex intrinsics afterwards, but that at least means we don't lose the per-patch vs. per-vertex information. v2: Use is_input/is_output helpers (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:13 -08:00
Kenneth Graunke	31140d097a	i965: Handle TCS inputs and TES outputs. TES outputs work exactly like VS outputs, so we can simply add a case statement for those. TCS inputs are very similar to geometry shaders - they're arrays of per-vertex data. We use the same method I used for the scalar GS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:07 -08:00
Kenneth Graunke	1f46163acb	i965: Add tessellation shader VUE map code. Based on a patch by Chris Forbes, but largely rewritten by Ken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:01 -08:00
Kenneth Graunke	9f3917bf37	i965: Fix partial variable access for geometry shaders in SSO mode. Without varying packing, if a VS writes a compound variable, and the GS only reads part of it, the base location of the variable may not actually be in the VUE map. To cope with this, we do lowering in terms of varying slots, add any constant offsets to the base, and then do the VUE map remapping. This ensures we only look up VUE map entries for slots which actually exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:39:38 -08:00
Kenneth Graunke	8c4deb10df	i965: Separate base offset/constant offset combining from remapping. My tessellation branch has two additional remap functions. I don't want to replicate this logic there. v2: Handle inputs/outputs separately (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:39:34 -08:00
Jason Ekstrand	7ba70b1b51	nir: Add another index to load_uniform to specify the range read	2015-12-14 14:28:31 -08:00
Jason Ekstrand	4115648a6b	i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT	2015-12-14 14:28:31 -08:00
Jason Ekstrand	2f1455dbb0	i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware While we're at it, we also add support for the possibility that the indirect is, in fact, a constant. This shouldn't happen in the common case (if it does, that means NIR failed to constant-fold something), but it's possible so we should handle it.	2015-12-14 14:28:31 -08:00
Jason Ekstrand	4be9a1c7bb	i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr The subnr field is in bytes so we don't need to multiply by type_sz.	2015-12-14 14:28:31 -08:00
Jason Ekstrand	653d8044ab	i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions It should work fine without it and the visitor can set it if it wants.	2015-12-14 14:28:30 -08:00
Jason Ekstrand	2c90f08bf7	i965/fs: Add support for doing MOV_INDIRECT on uniforms	2015-12-14 14:28:30 -08:00
Kenneth Graunke	106c3a8a48	nir: Fix number of indices on shared variable store intrinsics. Shared variables and input reworks landed around the same time. Presumably, this was some sort of mistake in rebase conflict resolution. This really only affects the num_indices field in nir_intrinsic_infos, which is rarely used. However, it's used by the printer. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:27:38 -08:00
Jason Ekstrand	dba28da075	anv/buffer_view: Store a bo + offset instead of buffer pointer This is what image_view does. Also, we really need to do this so that we can properly handle the combined offsets from the buffer and from pCreateInfo. This fixes some of the nonzero offset buffer view CTS tests.	2015-12-14 14:10:40 -08:00
Ian Romanick	96dc732ed8	meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since _mesa_meta_begin hasn't been called yet, we have to work-around API difficulties. The whole reason that GL_DRAW_FRAMEBUFFER is used instead of GL_FRAMEBUFFER is that the read framebuffer may be different. This is moot in OpenGL ES 1.x. I have another patch series that would also fix this (by removing the calls to _mesa_BindFramebuffer and friends), but it's not quite ready yet... and I think it may be a bit heavy for some stable branches. Consider this a stop-gap fix. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93215 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-14 13:09:15 -08:00
Chad Versace	ee57062e1e	anv: Remove anv_image::surface_type When building RENDER_SURFACE_STATE, the driver set SurfaceType = anv_image::surface_type, which was calculated during anv_image_init(). This was bad because the value of anv_image::surface_type was taken from a gen-specific header, gen8_pack.h, even though the anv_image structure is used for all gens. Replace anv_image::surface_type with a gen-specific lookup function, anv_surftype(), defined in gen${x}_state.c. The lookup function contains some useful asserts that caught some nasty bugs in anv meta, which were fixed in the previous commit.	2015-12-14 10:46:27 -08:00
Samuel Pitoiset	71135e275f	nvc0: check return value of nvc0_program_validate() Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:42 +01:00
Samuel Pitoiset	54f58210c2	nv50: check return value of nouveau_object_new() When ret == 0, obj is not NULL. Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:39 +01:00
Samuel Pitoiset	3f7462b792	nv50,nvc0: make use of unreachable() when invalid texture target happens Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:25 +01:00
Chad Versace	f0d11d5a81	anv/meta: Fix VkImageViewType Meta unconditionally used VK_IMAGE_VIEW_TYPE_2D in the functions below. This caused some out-of-bound memory accesses. anv_CmdCopyImage anv_CmdBlitImage anv_CmdCopyBufferToImage anv_CmdClearColorImage Fix it by adding a new function, anv_meta_get_view_type().	2015-12-14 09:03:58 -08:00
Chad Versace	0bebaeacd7	isl: Rename s/lod_align/image_align/ for consistency Regarding the subimages within a surface, sometimes isl called them "images" and sometimes "LODs". This patch makes isl consistently refer to them as "images". I chose the term "image" over "LOD" because LOD is a misnomer when applied to 3D surfaces. The alignment applies to each individual 2D subimage, not to the LOD as a whole. This patch changes no behavior. It's just a manually performed, case-insensitive replacement s/lod/image/ that maintains correct indentation.	2015-12-14 09:01:51 -08:00
Chad Versace	85a6384014	anv/tests: gitignore block_pool_no_free	2015-12-14 09:00:28 -08:00
Chad Versace	0da776b733	anv: Fix build for unit tests Clearly no one has been running `make check`, because the unit test build has been broken for a long time. After this build fix, all tests now pass.	2015-12-14 09:00:28 -08:00
Christian König	8b52fa71ac	st/va: handle default post process regions Avoid referencing NULL pointers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:55 +01:00
Christian König	f6dd31c1cf	st/va: fix unused variable warning Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:55 +01:00
Christian König	025d97381e	st/va: clean up post process includes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:54 +01:00
Christian König	27a276f625	st/va: cleanup filter color standard handling Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:54 +01:00
Tapani Pälli	8b79258cfe	meta: clear_state structure cleanup Remove unused variables from clear_state and use a hardcoded location for color uniform to get rid of 2 more variables. Modify shaders to use explicit location for vertex attribute too as extension is enabled. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-14 08:01:49 +02:00
Ilia Mirkin	eca8f38dcf	glsl: assign varying locations to tess shaders when doing SSO GRID Autosport uses SSO shaders. When a tessellation evaluation shader is passed through this, it triggers assertion failures down the line with unassigned varying locations. Make sure to do this when the first shader in the pipeline is not a vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-13 11:35:28 -05:00
Neil Roberts	839793680f	i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals Previously if the visual didn't have an alpha channel then it would pick a format that is not sRGB-capable. I don't think there's any reason not to always have an sRGB-capable visual. Since `28090b30` there are now visuals advertised without an alpha channel which means that games that don't request alpha bits in the config would end up without an sRGB-capable visual. This was breaking supertuxkart which assumes the winsys buffer is always sRGB-capable. The previous code always used an RGBA format if the visual config itself was marked as sRGB-capable regardless of whether the visual has alpha bits. I think we don't actually advertise any sRGB-capable visuals (but we just use sRGB formats anyway) so it shouldn't make any difference. However this patch also changes it to use RGBX if an sRGB-capable visual is requested without alpha bits for consistency. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:42 +00:00
Neil Roberts	43f4be5f06	i965: Add B8G8R8X8_SRGB to the alpha format override brw_init_surface_formats overrides the render format for RGBX formats which aren't supported for rendering so that they internally use RGBA instead. However, B8G8R8X8_SRGB was missing so it wasn't marked as a renderable format. This patch just adds it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:41 +00:00
Neil Roberts	c769efda93	i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format This will be used in a subsequent patch as the format for RGB visuals. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:38 +00:00
Jason Ekstrand	c56186026f	anv: Add initial support for texel buffers	2015-12-12 16:11:23 -08:00
Jason Ekstrand	fd944197f2	i965/nir: Provide a default LOD for buffer textures Our hardware requires an LOD for all texelFetch commands even if they are on buffer textures. GLSL IR gives us an LOD of 0 in that case, but the LOD is really rather meaningless. This commit allows other NIR producers to be more lazy and not provide one at all.	2015-12-12 16:09:54 -08:00
Ilia Mirkin	7752bbc44e	gk104/ir: simplify and fool-proof texbar algorithm With the current algorithm, we only look at tex uses. However there's a write-after-write hazard where we might decide to, on some path, not use a texture's output at all, but instead to write a different value to that register. However without the barrier, the texture might complete later and overwrite that value. This fixes Unreal Elemental demo on GK110/GK208, flightgear on GK10x, and likely other random-looking failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	d35695096d	nv50/ir: combine sequences of conversions In some cases shaders want non-default rounding when converting float to integer. This can be done in one go, so merge the two ops. This comes up in the packUnorm4x8 & co functions, as well as a few random shaders. Overall shader-db impact is minimal, helping a handful of witcher2 and other misc shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	dbca0f3eba	nv50/ir: manually optimize multiplication expansion logic The conversion of 32-bit integer multiplies into 16-bit ones happens after the regular optimization loop. However it's fairly common to multiply by a small integer, rendering some of the expansion pointless. Firstly, propagate immediates when possible into mul ops, secondly just remove the ops when they are unnecessary. Including the change to generate imad immediates, the effect is: total instructions in shared programs : 6365463 -> 6351898 (-0.21%) total gprs used in shared programs : 728684 -> 728684 (0.00%) total local used in shared programs : 9904 -> 9904 (0.00%) total bytes used in shared programs : 44001576 -> 44036120 (0.08%) local gpr inst bytes helped 0 0 3288 4 hurt 0 0 0 842 It's easy for this to hurt bytes since we end up always generating the 8-byte form, while we can't always get rid of the immediate in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	3af83c4bc7	nv50/ir: fix imul emission in the presence of an immediate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	a0b5d5beed	nv50/ir: teach post-ra immediate folding into mad about integers There will usually be a split before the mad op, peer through that and pick out the right word of the immediate. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	ab70ea1353	nv50/ir: add short imad support Support emission of the short imad, but also include it in the various logic that tries to make it possible to emit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	6aca7fecb7	nv50/ir: can't have predication and immediates Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	69e8b476d0	nv50/ir: fix texture grad for cubemaps We were ignoring the partial derivatives on the last dim. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	a27548400e	nv50/ir: fix assumption that prog->maxGPR is in 32-bit reg units On NV50, we use 16-bit reg units (to make it all work with half-regs). A few places assumed that it was always in 32-bit units. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Nicolai Hähnle	d640f179d3	gallium/ddebug: regularly log the total number of draw calls This helps in the use of GALLIUM_DDEBUG_SKIP: first run a target application with skip set to a very large number and note how many draw calls happen before the bug. Then re-run, skipping the corresponding number of calls. Despite the additional run, this can still be much faster than not skipping anything. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-12 15:23:50 -05:00
Nicolai Hähnle	b86d5ccae2	gallium/ddebug: add GALLIUM_DDEBUG_SKIP option When we know that hangs occur only very late in a reproducible run (e.g. apitrace), we can save a lot of debugging time by skipping the flush and hang detection for earlier draw calls. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-12 15:23:34 -05:00
Roland Scheidegger	af7ba989fb	llvmpipe: fix layer/vp input into fs when not written by prior stages ARB_fragment_layer_viewport requires that if a fs reads layer or viewport index but it wasn't output by gs (or vs with other extensions), then it reads 0. This never worked for llvmpipe, and is surprisingly non-trivial to fix. The problem is the mechanism to handle non-existing outputs in draw is rather crude, it will simply redirect them to whatever is at output 0, thus later stages will just get garbage. So, rather than trying to fix this up (which looks non-trivial), fix this up in llvmpipe setup by detecting this case there and output a fixed zero directly. While here, also optimize the hw vertex layout a bit - previously if the gs outputted layer (or vp) and the fs read those inputs, we'd add them twice to the vertex layout, which is unnecessary. And do some minor cleanup, slots don't require that many bits, there was some bogus (but harmless) float/int mixup for psize slot too, make the slots all unsigned (we always put pos at pos zero thus everything else has to be positive if it exists), and make sure they are properly initialized (layer and vp index slot were not which looked fishy as they might not have got set back to zero when changing from a gs which outputs them to one which does not). This fixes the failures in piglit's arb_fragment_layer_viewport group (3 each for layer and vp). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-12 01:59:15 +01:00
Brian Paul	27d5be0b8f	svga: avoid emitting redundant SetSamplers() commands This greatly reduces the number of SetSamplers() commands for some applications. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-11 16:54:58 -07:00
Brian Paul	1291e910d5	svga: avoid emitting redundant SetIndexBuffer commands Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-11 16:54:44 -07:00
Brian Paul	71f19dd201	st/mesa: trivial indentation fix	2015-12-11 16:53:20 -07:00
Brian Paul	c877f1aeef	util/blitter: minor formatting fixes	2015-12-11 16:53:20 -07:00
Jason Ekstrand	1c605c8dfa	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in a shared local memory fix.	2015-12-11 14:29:13 -08:00
Jason Ekstrand	b8425bb1e8	i965/fs: Use the correct source for local memory load offsets The offset for loads is in src[0]. This was a copy+paste error in the nir_intrinsic_load/store refactoring. This commit fixes a segfault in ES31-CTS.compute_shader.work-group-size. I have no idea how piglit failed to catch this... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93348 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:56:34 -08:00
Jason Ekstrand	d12ea21dd5	gen8/pipeline: Support vec4 vertex shaders In order to actually get them, you need INTEL_DEBUG=vec4.	2015-12-11 13:25:17 -08:00
Kenneth Graunke	fadf378497	i965: Add Gen8+ tessellation control shader state (3DSTATE_HS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	b3c32f5f34	i965: Add Gen7+ tessellation engine state (3DSTATE_TE). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	37b0b11cef	i965: Add Gen8+ tessellation evaluation shader state (3DSTATE_DS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	86a6eda9bc	i965: Add tessellation shader push constant support. Based on a patch by Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	c59d1b1fd1	i965: Add tessellation shader sampler support. Based on code by Chris Forbes and Fabian Bieler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	f34c04fda6	i965: Add tessellation shader surface support. This is brw_gs_surface_state.c copy and pasted twice with search and replace. brw_binding_table.c code is similarly copy and pasted. v2: Drop dword_pitch related fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	82455e5396	i965: Make fs_visitor::emit_urb_writes set EOT for TES as well. Tessellation evaluation shaders work almost identically to vertex shaders - we have a set of URB writes at the end of the program, and the last one should terminate it. Geometry shaders really are the special case, where multiple EmitVertex() calls trigger URB writes in the middle of the program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	7e0c22d461	i965: Don't hardcode g1 for URB handles in fs_visitor::emit_urb_writes(). Tessellation evaluation shaders will use g4 instead. For now, make an fs_reg called urb_handle and use that in place of hardcoding g1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	77b338d63b	i965: Make brw_set_message_descriptor() non-static. I want to use this directly from brw_vec4_generator.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kristian Høgsberg Kristensen	e803276148	Revert "i965/HACK: Build brw_cs into libcompiler" This reverts commit `6df7963531`.	2015-12-11 13:09:42 -08:00
Kristian Høgsberg Kristensen	21d5e52da8	Merge ../mesa into vulkan	2015-12-11 13:09:06 -08:00
Kristian Høgsberg Kristensen	c51f133197	i965: Move brw_cs_fill_local_id_payload() to libi965_compiler This is a helper function for setting up the local invocation ID payload according to the cs_prog_data generated by the compiler. It's intended to be available to users of libi965_compiler so move it there.	2015-12-11 13:07:25 -08:00
Eric Anholt	076551116e	vc4: Add quick algebraic optimization for clamping of unpacked values. GL likes to saturate your incoming color, but if that color's coming from unpacking from unorms, there's no point. Ideally we'd have a range propagation pass that cleans these up in NIR, but that doesn't seem to be going to land soon. It seems like we could do a one-off optimization in nir_opt_algebraic, except that doesn't want to operate on expressions involving unpack_unorm_4x8, since it's sized. total instructions in shared programs: 87879 -> 87761 (-0.13%) instructions in affected programs: 6044 -> 5926 (-1.95%) total estimated cycles in shared programs: 349457 -> 349252 (-0.06%) estimated cycles in affected programs: 6172 -> 5967 (-3.32%) No SSPD on openarena (which had the biggest gains, in its VS/CSes), n=15.	2015-12-11 12:36:16 -08:00
Eric Anholt	e3efc4b023	vc4: When doing algebraic optimization into a MOV, use the right MOV. If there were src unpacks, changing to the integer MOV instead of float (for example) would change the unpack operation.	2015-12-11 12:21:22 -08:00
Eric Anholt	2591beef89	vc4: Fix handling of src packs in qir_follow_movs(). The caller isn't going to expect it from a return, so it would probably get misinterpreted. If the caller had an unpack in its reg, that's fine, but don't lose track of it.	2015-12-11 12:21:22 -08:00
Eric Anholt	b70a2f4d81	vc4: Add missing progress note in opt_algebraic.	2015-12-11 12:21:22 -08:00
Eric Anholt	5989ef2b0f	vc4: Add debugging of the estimated time to run the shader to shader-db.	2015-12-11 12:21:22 -08:00
Eric Anholt	53b2523c6e	vc4: Fix handling of sample_mask output. I apparently broke this in a late refactor, in such a way that I decided its tests were some of those interminable ones that I should just blacklist from my testing. As a result, the refactors related to it were totally wrong.	2015-12-11 12:21:22 -08:00
Edward O'Callaghan	53609de762	softpipe: enable GL_ARB_viewport_array support, update GL3.txt doc Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-11 20:09:21 +01:00
Edward O'Callaghan	00f97ad5de	softpipe: implement some support for multiple viewports Mostly related to making sure the rasterizer can correctly pick out the correct scissor box for the current viewport. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-11 20:09:21 +01:00
Roland Scheidegger	6c2c1e0ffe	draw: don't assume fixed offset for data in struct vertex_info Otherwise, if struct vertex_info is changed, you're in for some surprises... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-11 20:09:21 +01:00
Neil Roberts	583a5778f4	i965/gen9: Don't do fast clears when GL_FRAMEBUFFER_SRGB is enabled When GL_FRAMEBUFFER_SRGB is enabled any single-sampled renderbuffers are resolved in intel_update_state because the hardware can't cope with fast clears on SRGB buffers. In that case it's pointless to do a fast clear because it will just be immediately resolved. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	0033c81344	i965/gen9: Allow fast clears for non-MSRT SRGB buffers SRGB buffers are not marked as losslessly compressible so previously they would not be used for fast clears. However in practice the hardware will never actually see that we are using SRGB buffers for fast clears if we use the linear equivalent format when clearing and make sure to resolve the buffer as a linear format before sampling from it. This is an important use case because by default the window system framebuffers are created as SRGB so without this fast clears won't be used there. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	82d459a423	i965/gen9: Resolve SRGB color buffers when GL_FRAMEBUFFER_SRGB enabled SKL can't cope with the CCS buffer for SRGB buffers. Normally the hardware won't see the SRGB formats because when GL_FRAMEBUFFER_SRGB is disabled these get mapped to their linear equivalents. In order to avoid relying on the CCS buffer when it is enabled this patch now makes it flush the renderbuffers. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	eb291d7013	i965/gen8+: Don't upload the MCS buffer for single-sampled textures For single-sampled textures the MCS buffer is only used to implement fast clears. However the surface always needs to be resolved before being used as a texture anyway so the MCS buffer doesn't actually achieve anything. This is important for Gen9 because in that case SRGB surfaces are not supported for fast clears and we don't want the hardware to see the MCS buffer in that case. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	44902ed1fa	i965/meta-fast-clear: Disable GL_FRAMEBUFFER_SRGB during clear Adds MESA_META_FRAMEBUFFER_SRGB to the meta save state so that GL_FRAMEBUFFER_SRGB will be disabled when performing the fast clear. That way the render surface state will be programmed with the linear equivalent format during the clear. This is important for Gen9 because the SRGB formats are not marked as losslessly compressible so in theory they aren't supported for fast clears. It shouldn't make any difference whether GL_FRAMEBUFFER_SRGB is enabled for the fast clear operation because the color is not actually written to the framebuffer so there is no chance for the hardware to apply the SRGB conversion on it anyway. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Marek Olšák	369afdb7b6	winsys/amdgpu: clear the buffer cache on mmap failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	84a38bfc29	winsys/radeon: clear the buffer cache on mmap failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	eb1e1af676	winsys/amdgpu: clear the buffer cache on allocation failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	f9d6fe8001	winsys/radeon: clear the buffer cache on allocation failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	cf811faeff	gallium/radeon: remove radeon_winsys_cs_handle "radeon_winsys_cs_handle cs_buf" is now equivalent to "pb_buffer buf". Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	cf422d20ff	winsys/radeon: use pb_cache instead of pb_cache_manager This is a prerequisite for the removal of radeon_winsys_cs_handle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	ebc9497fcb	winsys/radeon: use radeon_bomgr less Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	a450f96ba9	winsys/radeon: rename radeon_bomgr_init_functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	38ac20f7dd	winsys/radeon: move variables from radeon_bomgr to radeon_drm_winsys radeon_bomgr is going away. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	3d090223ef	winsys/radeon: remove redundant radeon_bomgr::va Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	1e05812fcd	winsys/amdgpu: don't use the "rws" abbreviation for amdgpu_winsys Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	6f4e74d165	winsys/amdgpu: use pb_cache instead of pb_cache_manager This is a prerequisite for the removal of radeon_winsys_cs_handle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	3fbf250dfa	gallium/pb_bufmgr_cache: use the new pb_cache module Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	2b396eeed9	gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager This simplified (basically duplicated) version of pb_cache_manager will allow removing some ugly hacks from radeon and amdgpu winsyses and flatten and simplify their design. The difference is that winsyses must manually add buffers to the cache in "destroy" functions and the cache doesn't know about the buffers before that. The integration is therefore trivial and the impact on the winsys design is negligible. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	1a24f443b4	radeonsi: implement fast stencil clear Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	8ee96ce834	radeonsi: re-enable Hyper-Z for stencil Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	99e63338fb	r600g: remove a Hyper-Z workaround that's likely not needed anymore FORCE_OFF == 0, no need to set that Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	96e8d38ac4	r600g: re-enable Hyper-Z for stencil on Evergreen & Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	d3c08309ab	gallium/radeon: fix Hyper-Z hangs by programming PA_SC_MODE_CNTL_1 correctly This is the recommended setting according to hw people and it makes Hyper-Z stable. Just the two magic states. This fixes Evergreen, Cayman, SI, CI, VI (using the Cayman code). Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	7c29bf26bb	radeonsi: don't use the CP DMA workaround on Fiji and newer Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	787ada6bf6	radeonsi: apply the streamout workaround to Fiji as well Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	62d82193b8	radeonsi: also print hexadecimal values for register fields in the IB parser Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	de887ba90c	radeonsi: implement RB+ for Stoney (v2) v2: fix dual source blending Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	0f9519b938	radeonsi: don't call u_prims_for_vertices for patches and rectangles Both caused a crash due to a division by zero in that function. This is an alternative fix. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	51603af390	radeonsi: use tgsi_shader_info::colors_written Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	b5b87c4ed1	r600g: write all MRTs only if there is exactly one output (fixes a hang) This fixes a hang in piglit/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 on REDWOOD. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	eb4813a952	tgsi/scan: add flag colors_written This is a prerequisite for the following r600g fix. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	37208c4fd7	Revert "radeonsi: disable DCC on Stoney" This reverts commit `32f05fadbb`. It turned out the problem with Stoney was caused by incorrect handling of a non-power-two VRAM size in the kernel driver. This is an optional BIOS setting and can be worked around by choosing a different VRAM size in the BIOS. Cc: 11.1 <mesa-stable@lists.freedesktop.org>	2015-12-11 15:25:11 +01:00
Timothy Arceri	4b9a79b7b8	nir: silence uninitialized warning Reviewed-by: Rob Clark <robdclark@gmail.com>	2015-12-11 19:26:20 +11:00
Jason Ekstrand	6ae4e59fac	anv/pipeline: Get rid of the no kernel input parameters hack Previously, meta would pass null shaders in for the VS when it intended to disable the VS. However, this meant that we didn't know what inputs we had and would dead-code things in the FS. In order to solve this, we hard-coded a number. Now meta passes in a VS even if it plans to disable the stage so this is no longer needed.	2015-12-10 22:37:30 -08:00
Jason Ekstrand	bd0e25d41e	anv/apply_pipeline_layout: Multiply uniform sizes by 4 This is because uniforms are now in terms of bytes everywhere.	2015-12-10 22:36:49 -08:00
Jason Ekstrand	6df7963531	i965/HACK: Build brw_cs into libcompiler We need it for CS push constants	2015-12-10 22:36:07 -08:00
Dave Airlie	18ad641c3b	mesa/shader: return correct attribute location for double matrix arrays If we have a dmat2[4], then dmat2[0] is at 17, dmat2[1] at 19, dmat2[2] at 21 etc. The old code was returning 17,18,19. I think this code is also wrong for float matrices. There is now a piglit for the float case. This partly fixes: GL41-CTS.vertex_attrib_64bit.limits_test [airlied: update with Tapani suggestion to clean it up]. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-11 16:28:29 +10:00
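The location arithmetic in the dmat2[4] commit above can be sketched as follows. This is only an illustration, not the Mesa code: `array_element_location` and its parameters are hypothetical names, and the sketch assumes each matrix column occupies one attribute location, which is what the 17, 19, 21 sequence in the message implies.

```c
#include <assert.h>

/* Hypothetical helper: location of element `index` of a matrix array
 * whose first element sits at `base`, assuming one attribute location
 * per matrix column. */
unsigned
array_element_location(unsigned base, unsigned columns, unsigned index)
{
   assert(columns > 0);
   /* Each array element advances by the number of columns, not by 1. */
   return base + index * columns;
}
```

With `base = 17` and `columns = 2`, indices 0, 1, 2 give 17, 19, 21; stepping by 1 per element would instead reproduce the 17, 18, 19 sequence the message describes as wrong.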
Jason Ekstrand	21cf55ab54	gen8/cmd_buffer: Don't push CS constants if there aren't any Issuing MEDIA_CURB_LOAD with a size of zero causes GPU hangs on BDW.	2015-12-10 18:56:27 -08:00
Jason Ekstrand	3893e11f4b	anv: Use 4 instead of sizeof(gl_constant_value) We no longer have access to gl_constant_value and, really, it's 4 because our uniform layout code works entirely in dwords.	2015-12-10 18:55:16 -08:00
Jason Ekstrand	13d1dd465c	nir/spirv: Put SSBO store writemasks in the right index It moved with the nir_intrinsic_load/store update.	2015-12-10 18:54:44 -08:00
Jason Ekstrand	d5c9955d3e	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir_intrinsic_load/store changes and the switch of all uniforms in i965 to bytes. This accounts for the Vulkan changes.	2015-12-10 18:29:36 -08:00
Roland Scheidegger	64c59b0624	draw: fix clipping with linear interpolated values and gl_ClipVertex Discovered this when working on other clip code, apparently didn't work correctly - the combination of linear interpolated values and using gl_ClipVertex produced wrong values (failing all such combinations in piglits glsl-1.30 interpolation tests, named interpolation-noperspective-XXX-vertex). Use the pre-clip-pos values when determining the interpolation factor to fix this. No one really understands this code well, but everybody agrees this looks sane... This fixes all those failing tests (10 in total) both with the llvm and non-llvm draw paths, with no piglit regressions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-12-11 02:21:39 +01:00
Dave Airlie	5362e53a06	r600: add missing return value check. Pointed out by coverity scan. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-11 09:37:20 +10:00
Jason Ekstrand	8beea9d45b	anv/icd: Advertise the right ABI version	2015-12-10 12:27:13 -08:00
Jason Ekstrand	78b81be627	nir: Get rid of _indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of _indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com>	2015-12-10 12:25:16 -08:00
Jason Ekstrand	f3970fad9e	i965/fs_nir: Refactor store_output, load_input, and load_uniform There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-10 12:25:16 -08:00
Patrick Rudolph	79bff488bc	gallium/util: return correct number of bound vertex buffers In case a state tracker unbinds every slot by a separate pipe->set_vertex_buffers() call, starting from slot zero, the number of bound buffers would not reach zero at all. The current algorithm does not account for pre-existing holes in the buffer list. Unbinding all buffers at once or starting at the top-most slot results in correct behaviour. Calculating the correct number of bound buffers fixes a NULL pointer dereference in nvc0_validate_vertex_buffers_shared(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-10 13:55:53 -05:00
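One way to picture the hole-aware counting described in the commit above is the sketch below. It is not the gallium/util implementation: `struct slot`, `set_slots` and `MAX_SLOTS` are hypothetical names, and the sketch simply assumes the bound count is "highest bound slot + 1", recomputed after every update.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8  /* hypothetical limit for this sketch */

/* Hypothetical stand-in for a vertex buffer binding; NULL means unbound. */
struct slot {
   const void *resource;
};

/* Bind (or unbind, when src is NULL) `count` slots starting at `start`,
 * then rescan the whole list so that holes left by earlier calls are
 * taken into account. Returns the number of bound slots, defined here
 * as the index of the highest bound slot plus one. */
unsigned
set_slots(struct slot *dst, unsigned start, unsigned count,
          const struct slot *src)
{
   unsigned i, bound = 0;

   assert(start + count <= MAX_SLOTS);

   for (i = 0; i < count; i++)
      dst[start + i].resource = src ? src[i].resource : NULL;

   for (i = 0; i < MAX_SLOTS; i++) {
      if (dst[i].resource)
         bound = i + 1;
   }
   return bound;
}
```

Unbinding slot 0 and then slot 1 in separate calls ends at a count of zero here, which is the sequence the commit message says previously never reached zero.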
Neil Roberts	ba67739b66	blit: Don't take into account the Mesa format when checking MSRT blit According to the GLES3 spec, blitting between multisample FBOs with different internal formats should not be allowed. The compatible_resolve_formats function implements this check. Previously it had a shortcut where if the Mesa formats of the two renderbuffers were the same then it would assume the blit is ok. However some drivers map different internal formats to the same Mesa format, for example it might implement both GL_RGB and GL_RGBA textures with MESA_FORMAT_R8G8B8A_UNORM. The function is used to generate a GL error according to what the GL spec requires so the blit should not be allowed in that case. This patch just removes the shortcut so that it only ever looks at the internal format. Note that I posted a related patch to disable this check altogether for desktop GL. However this function is still used on GLES3 because there are conformance tests that require this behaviour so this patch is still useful. Cc: Marek Olšák <maraeo@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-10 11:03:58 +00:00
Neil Roberts	3f10774cba	i965: Check base format to determine whether to use tiled memcpy The tiled memcpy doesn't work for copying from RGBX to RGBA because it doesn't override the alpha component to 1.0. Commit `2cebaac479` added a check to disable it for RGBX formats by looking at the TexFormat. However a lot of the rest of the code base is written with the assumption that an RGBA texture can be used internally to implement a GL_RGB texture. If that is done then this check breaks. This patch makes it instead check the base format of the texture which I think more directly matches the intention. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	9a31d9870b	i965/gen8: Allow rendering to B8G8R8X8 Since Gen8 this is allowed as a rendering target so we don't need to override it to B8G8R8A8. This is helpful on Gen9+ where using this override causes fast clears not to work. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	d151338594	i965/gen9: Allow fast clear for MSRT formats matching render Previously fast clear was disallowed on Gen9 for MSRTs with the claim that some formats don't work but we didn't understand why. On further investigation it seems the formats that don't work are the ones where the render surface format is being overridden to a different format than the one used for texturing. The one used for texturing is not actually a renderable format. It arguably makes sense that the sampler hardware doesn't handle the fast color correctly in these cases because it shouldn't be possible to end up with a fast cleared surface that is non-renderable. This patch changes the limitation to prevent fast clear for surfaces where the format for rendering is overridden. Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	e1a16b901b	i965/gen9/fast-clear: Handle linear→SRGB conversion If GL_FRAMEBUFFER_SRGB is enabled when writing to an SRGB-capable framebuffer then the color will be converted from linear to SRGB before being written. There is no chance for the hardware to do this itself because it can't modify the clear color that is programmed in the surface state so it seems pretty clear that the driver should be handling this itself. Note that this wasn't a problem before Gen9 because previously we were only able to do fast clears to 0 or 1 and those values are the same in linear and SRGB space. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-10 11:03:49 +00:00
Jordan Justen	83e8e07a2b	docs: Add ARB_compute_shader to 11.2.0 release notes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	1c0d059c02	docs: Mark ARB_compute_shader as done for i965 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d04612b60d	i965: Enable ARB_compute_shader extension on supported hardware Enable ARB_compute_shader on gen7+, on hardware that supports the OpenGL 4.3 requirements of a local group size of 1024. With SIMD16 support, this is limited to Ivy Bridge and Haswell. Broadwell will work with a local group size up to 896 on SIMD16 meaning programs that use this size or lower should run when setting MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	e288b4a133	i965/nir: Implement shared variable atomic operations v3: * Update based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d584b2313e	nir: Add nir intrinsics for shared variable atomic operations v3: * Update min/max based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	fc21a7c26e	glsl: Disable several optimizations on shared variables Shared variables can be accessed by other threads within the same local workgroup. This prevents us from performing certain optimizations with shared variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	f821a3ec4f	glsl: Buffer atomics are supported for compute shaders Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	7333593cf3	glsl: Translate atomic intrinsic functions on shared variables When an intrinsic atomic operation is used on a shared variable, we translate it to a new 'shared variable' specific intrinsic function call. For example, a call to __intrinsic_atomic_add when used on a shared variable will be translated to a call to __intrinsic_atomic_add_shared. v3: * Fix stale comments copied from SSBOs (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	614ad9b40b	glsl: Check for SSBO variable in check_for_ssbo_store The compiler probably already blocks this earlier on, but we should be checking for an SSBO here. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	c2e6cfbd78	glsl: Check for SSBO variable in SSBO atomic lowering When an atomic function is called, we need to check to see if it is for an SSBO variable before lowering it to the SSBO specific intrinsic function. v2: * is_in_buffer_block => is_in_shader_storage_block (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	a108e14d1c	glsl: Replace atomic_ssbo and ssbo_atomic with atomic The atomic functions can also be used with shared variables in compute shaders. When lowering the intrinsic in lower_ubo_reference, we still create an SSBO specific intrinsic since SSBO accesses can be indirectly addressed, whereas all compute shader shared variables live in a single shared variable area. v2: * Also remove the _internal suffix from ssbo atomic intrinsic names (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	23da6aeb17	glsl: Allow atomic functions to be used with shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d3625d4071	i965: Lower shared variable references to intrinsic calls Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	b1fe3af0da	i965: Enable shared local memory for CS shared variables v3: * Check shared variable size at link time Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	faddb301ff	i965/fs: Handle nir shared variable store intrinsic v4: * Apply similar optimization for shared variable stores as `0cb7d7b4b7`. This was causing an OpenGL ES 3.1 CTS failure, but `867c436ca8` fixes that. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	8613206bd3	i965/fs: Handle nir shared variable load intrinsic v3: * Remove extra #includes (Iago) * Use recently added GEN7_BTI_SLM instead of BRW_SLM_SURFACE_INDEX (curro) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	e128a62318	i965: Disable vector splitting on shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	aa12a92626	nir: Translate glsl shared var store intrinsic to nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	03b0439938	nir: Translate glsl shared var load intrinsic to nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	1078d712d7	glsl: Add lowering pass for shared variable references In this lowering pass, shared variables are decomposed into intrinsic calls. v2: * Send mem_ctx as a parameter (Iago) v3: * Shared variables don't have an associated interface block (Iago) * Always use 430 packing (Iago) * Comment / whitespace cleanup (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Iago Toral Quiroga	f22ab2e8b3	glsl: Don't assert on shared variable matrices with 'inherited' layout We use column-major for shared variable matrices. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	66eaef7737	glsl: Don't lower_variable_index_to_cond_assign for shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	c43a7e605e	glsl: Remove mem_ctx as member variable in lower_ubo_reference_visitor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	ee005df2f9	glsl ubo/ssbo: Move common code into lower_buffer_access::setup_buffer_access This code will also be usable by the pass to lower shared variables. Note that const_offset is adjusted by setup_buffer_access so it must be initialized before calling setup_buffer_access. v2: Add comment for lower_buffer_access::setup_buffer_access Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	99c8196458	glsl ubo/ssbo: Move is_dereferenced_thing_row_major into lower_buffer_access Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	afa4129cf6	glsl ubo/ssbo: Add lower_buffer_access class This class has code that will be shared by lower_ubo_reference and lower_shared_reference. (lower_shared_reference will be used to support compute shader shared variables.) v2: * Add lower_buffer_access.h to makefile (Emil) * Remove static is_dereferenced_thing_row_major from lower_buffer_access.cpp. This will become a lower_buffer_access method in the next commit. * Pass mem_ctx as parameter rather than using a member variable (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	ad3c65e792	glsl ubo/ssbo: Split buffer access to insert_buffer_access This allows the code in emit_access to be generic enough to also be for lowering shared variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	05667ecc52	glsl ubo/ssbo: Use enum to track current buffer access type v2: * Rename ssbo_get_array_length to ssbo_unsized_array_length_access (Iago) * Always use this-> when referencing buffer_access_type (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Tapani Pälli	8cc372b6d9	glsl: do not lose always_active_io when packing varyings Otherwise packed and inactive varyings get optimized away. This needs to be prevented when using separate shader objects where interface needs to be preserved. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-10 07:51:55 +02:00
Tapani Pälli	2377db2c4e	mesa: invalidate pipeline status after glUseProgramStages This will cause validation to run during next draw, this is done because possible changes in used stages and programs can cause invalid pipeline state. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-10 07:51:40 +02:00
Dave Airlie	21abaad8fe	mesa/varray: set double arrays to non-normalised. Doesn't have any effect in practice I don't think, but CTS reads back using GetVertexAttrib. This fixes: GL41-CTS.vertex_attrib_64bit.get_vertex_attrib Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-10 13:51:44 +10:00
Michel Dänzer	b4a03e7f8f	clover: Fix build against LLVM 3.8 SVN >= r255078 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-10 10:45:29 +09:00
Chad Versace	5ba9121fe8	anv/image: Remove some vkCreateImage validation Don't validate the baseArrayLayer and layerCount of cube images. This allows us to remove a bloated lookup table and an unneeded struct definition (anv_image_view_info).	2015-12-09 16:33:23 -08:00
Chad Versace	9a9c551f3e	anv/image: Drop unused halign, valign lookup tables	2015-12-09 15:36:39 -08:00
Brian Paul	e1815bcc47	mesa: fix ID usage for buffer warnings We need a different ID pointer for each call site. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-09 16:06:35 -07:00
Brian Paul	de5bb7fe78	docs: remove stray <ul> tag from 11.0.5.html file to fix indentation	2015-12-09 15:55:11 -07:00
Serge Martin	2b930327e8	freedreno: little clean up in fd_create_surface in order to avoid returning an invalid address if CALLOC_STRUCT returns NULL. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:32:41 -05:00
Serge Martin	0149e7a944	freedreno: change to goto fail in fd_resource_transfer_map, like the other error cases Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:31:16 -05:00
Serge Martin	e63fec29a1	freedreno: fix bind_sampler_states when hwcso is NULL src/gallium/tests/trivial/compute.c expects samplers to be cleaned when the samplers list is NULL. Like in radeon, the function behaves as if the number of samplers parameter were set to 0. [small s/hwsco/hwcso/ typo fix] Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:30:58 -05:00
Edward O'Callaghan	f32f80e19d	gallium/util: Make u_prims_for_vertices() safe Let us avoid trapping in hardware from a SIGFPE and instead assert on a zero divisor. Hint: This can occur if a PIPE_PRIM_? is not handled in u_prim_vertex_count() that results in ' info ' not being initialized in the expected manner. Further, we also fix a possibly NULL pointer dereference from ' info ' being NULL from a u_prim_vertex_count() call. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-09 22:51:56 +01:00
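The guard described in the u_prims_for_vertices() commit above might look roughly like the sketch below. It is an illustration under assumptions, not the gallium/util source: `struct prim_info` and `prims_for_vertices` are hypothetical names, and `min` and `incr` are assumed to mean the minimum vertex count of a primitive and the vertex increment per additional primitive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-primitive-type description. */
struct prim_info {
   unsigned min;   /* vertices needed for the first primitive */
   unsigned incr;  /* extra vertices for each further primitive */
};

/* Number of whole primitives that `num` vertices produce. Asserts on a
 * NULL info pointer and on a zero increment instead of letting the
 * division below trap. */
unsigned
prims_for_vertices(const struct prim_info *info, unsigned num)
{
   assert(info != NULL);
   assert(info->incr != 0);

   if (num < info->min)
      return 0;

   return 1 + (num - info->min) / info->incr;
}
```

For a triangle list (`min = 3`, `incr = 3`) six vertices give two primitives; for a triangle strip (`min = 3`, `incr = 1`) five vertices give three.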
Andreas Boll	63fe600c7a	docs: add news item for mesa-demos 8.3.0 release Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2015-12-09 22:44:52 +01:00
Jason Ekstrand	46bcf9d777	vulkan: Pull in the 0.210.1 vk_platform header Somehow this got missed in the API update.	2015-12-09 11:55:38 -08:00
Jordan Justen	47e5fb52f4	gen8/compute: Setup push constants and local ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 11:04:30 -08:00
Jordan Justen	f8d5fb4293	anv: Add anv_cmd_buffer_cs_push_constants Similar to anv_cmd_buffer_push_constants, but handles the compute pipeline, which requires different setup from the other stages. This also handles initializing the compute shader local IDs. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 11:02:20 -08:00
Patrick Rudolph	432a798cf5	nv50,nvc0: fix use-after-free when vertex buffers are unbound Always reset the vertex bufctx to make sure there's no pointer to an already freed pipe_resource left after unbinding buffers. Fixes use after free crash in nvc0_bufctx_fence(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 Signed-off-by: Patrick Rudolph <siro@das-labor.org> [imirkin: simplify nvc0 fix, apply to nv50] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-09 13:38:15 -05:00
Andreas Boll	f876346cdd	mesa: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:24 +01:00
Andreas Boll	0560e835f3	glx: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:21 +01:00
Andreas Boll	9246df2280	st/osmesa: Fix a typo in a comment s/suport/support/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:18 +01:00
Andreas Boll	7af9930ab4	meta: Fix a typo in a print message s/Unkown/Unknown/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:15 +01:00
Andreas Boll	c83e161c91	mesa: Fix typos in print messages s/inconsistant/inconsistent/ s/occurences/occurrences/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:11 +01:00
Andreas Boll	5c27cb3da3	glsl: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:26:47 +01:00
Brian Paul	aa9af32752	svga: initialize pipe_driver_query_info entries with a macro To be safe, set all the fields in case the enums ordering/values ever change. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-09 09:43:47 -07:00
Brian Paul	ab0651ccfd	mesa: detect inefficient buffer use and report through debug output When a buffer is created with GL_STATIC_DRAW, its contents should not be changed frequently. But that's exactly what one application I'm debugging does. This patch adds code to try to detect inefficient buffer use in a couple places. The GL_ARB_debug_output mechanism is used to report the issue. NVIDIA's driver detects these sort of things too. Other types of inefficient buffer use could also be detected in the future. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-09 09:43:47 -07:00
Emil Velikov	7d3df58125	docs: add news item and link release notes for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-09 16:12:32 +00:00
Emil Velikov	61b91d0811	docs: add sha256 checksums for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f9715bc449`)	2015-12-09 16:11:12 +00:00
Emil Velikov	d432be32e2	docs: add release notes for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bec983b738`)	2015-12-09 16:11:11 +00:00
Francisco Jerez	595c818071	i965: Resolve color and flush for all active shader images in intel_update_state(). Fixes arb_shader_image_load_store/execution/load-from-cleared-image.shader_test. Couldn't reproduce any significant FPS regression in CPU-bound benchmarks from the Finnish benchmarking system on either VLV or BSW after 30 runs with 95% confidence level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92849 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 15:12:59 +02:00
Francisco Jerez	3dc97a1586	i965: Document inconsistent units the URB size is represented in. Every other gen the representation of the URB size was changed and previous ones weren't updated. I'd be willing to write a series normalizing this to be KB on all generations if anybody else cares.	2015-12-09 14:00:30 +02:00
Francisco Jerez	228d5a3f75	i965: Hook up L3 partitioning state atom. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:59:03 +02:00
Francisco Jerez	1fc797e8e4	i965: Work around L3 state leaks during context switches. This is going to require some rather intrusive kernel changes to fix properly, in the meantime (and forever on at least pre-v4.1 kernels) we'll have to restore the hardware defaults at the end of every batch in which the L3 configuration was changed to avoid interfering with the DDX and GL clients that use an older non-L3-aware version of Mesa. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> v2: Optimize look-up of the default configuration by assuming it's the first entry of the L3 config array in order to avoid an FPS regression in GpuTest Triangle and SynMark OglBatch2-7 on most affected platforms. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 13:57:40 +02:00
Francisco Jerez	09d9638dd0	i965: Add debug flag to print out the new L3 state during transitions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	acc77947ca	i965: Implement L3 state atom. The L3 state atom calculates the target L3 partition weights when the program bound to some shader stage is modified, and in case they are far enough from the current partitioning it makes sure that the L3 state is re-emitted. v2: Fix for inconsistent units the context URB size is expressed in. Clamp URB size to 1008 KB on SKL due to FF hardware limitation. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	95ad0bd33b	i965: Calculate appropriate L3 partition weights for the current pipeline state. This calculates a rather conservative partitioning of the L3 cache based on the shaders currently bound to the pipeline and whether they use SLM, atomics, images or scratch space. The result is intended to be fine-tuned later on based on other pipeline state. Note that the L3 partitioning calculated for VLV in the non-SLM non-DC case differs from the hardware defaults in that it doesn't include a DC partition and has twice as much RO cache space -- This is an intentional functional change that improves performance in several bandwidth-bound benchmarks on VLV (5% significance): SynMark OglTexFilterAniso by 14.18%, SynMark OglTexFilterTri by 7.15%, Unigine Heaven by 4.91%, SynMark OglShMapPcf by 2.15%, GpuTest Fur by 1.83%, SynMark OglDrvRes by 1.80%, SynMark OglVsTangent by 1.71%, and a few other benchmarks from the Finnish system by less than 1%. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	fa1300f75e	i965: Implement selection of the closest L3 configuration based on a vector of weights. The input of the L3 set-up code is a vector giving the approximate desired relative size of each partition. This implements logic to compare the input vector against the table of validated configurations for the device and pick the closest compatible one. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	353abb294b	i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	fa043698d2	i965/hsw: Enable L3 atomics. Improves performance of the arb_shader_image_load_store-atomicity piglit test by over 25x (which isn't a real benchmark it's just heavy on atomics -- the improvement in a microbenchmark I wrote a while ago seemed to be even greater). The drawback is one needs to be extra-careful not to hang the GPU (in fact the whole system). A DC partition must have been allocated on L3, the "convert L3 cycle for DC to UC" bit may not be set, the MOCS L3 cacheability bit must be set for all surfaces accessed using DC atomics, and the SCRATCH1 and ROW_CHICKEN3 bits must be kept in sync. A fairly recent kernel is required for the command parser to allow writes to these registers. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	6907175a4f	i965: Implement programming of the L3 configuration. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	b22bebe966	i965: Import tables enumerating the set of validated L3 configurations. It should be possible to use additional L3 configurations other than the ones listed in the tables of validated allocations ("BSpec » 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*] » L3 Allocation and Programming"), but it seems sensible for now to hard-code the tables in order to stick to the hardware docs. Instead of setting up the arbitrary L3 partitioning given as input, the closest validated L3 configuration will be looked up in these tables and used to program the hardware. The included tables should work for Gen7-9. Note that the quantities are specified in ways rather than in KB, this is because the L3 control registers expect the value in ways, and because by doing that we can re-use a single table for all GT variants of the same generation (and in the case of IVB/HSW and CHV/SKL across different generations) which generally have different L3 way sizes but allow the same combinations of way allocations. v2: Use slice count from the devinfo structure instead of the gt number to implement get_l3_way_size(). Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	a403ad4f5a	i965: Add slice count to the brw_device_info structure. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	c8ff045fdb	i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC flush is set. According to the hardware docs a DC flush is sufficient to make CS_STALL happy, there's no need to add STALL_AT_SCOREBOARD whenever it's present. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	2405b75bc9	i965: Define state flag to signal that the URB size has been altered. This will make sure that we recalculate the URB layout anytime the URB size is modified by the L3 partitioning code. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	4841cab01a	i965: Keep track of whether LRI is allowed in the context struct. This stores the result of can_do_pipelined_register_writes() in the context struct so we can find out later whether LRI can be used to program the L3 configuration. v2: * Split change of gen check in can_do_pipelined_register_writes (jljusten) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	50c2713726	i965: Adjust gen check in can_do_pipelined_register_writes Allow for pipelined register writes for gen < 7. v2: * Split from another patch and adjust comment (jljusten) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	5912da45a6	i965: Define symbolic constants for some useful L3 cache control registers. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Dave Airlie	e307cfa7d9	radeonsi: handle loading doubles as geometry shader inputs. This adds the double code to the geometry shader input handling. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 17:04:04 +10:00
Dave Airlie	8c9e40ac22	radeonsi: handle doubles in lds load path. This handles loading doubles from LDS properly. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 17:03:38 +10:00
Dave Airlie	cce3864046	r600: handle geometry dynamic input array index This fixes: glsl-1.50/execution/geometry/dynamic_input_array_index.shader_test my profanity. We need to load the AR register with the value from the index reg Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:53 +10:00
Dave Airlie	38542921c7	r600g: fix geom shader input indirect indexing. This fixes: gs-input-array-vec4-index-rd The others run out of gprs unfortunately. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:47 +10:00
Dave Airlie	e97ac006d7	r600g: fix outputting to non-0 buffers for stream 0. This fixes: arb_transform_feedback3-ext_interleaved_two_bufs_gs arb_transform_feedback3-ext_interleaved_two_bufs_gs_max transform-feedback-builtins If we are only emitting one ring, then emit all output buffers on it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:01 +10:00
Edward O'Callaghan	1f61447ce1	r600: Add ARB_copy_image support [airlied: update relnotes] Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 14:41:46 +10:00
Edward O'Callaghan	d13ac27200	r600g: allow copying between compatible un/compressed formats See: `commit e82c527f1fc2f8ddc64954ecd06b0de3cea92e93` which is where a block in src maps to a pixel in dst and vice versa. e.g. DXT1 <-> R32G32_UINT DXT5 <-> R32G32B32A32_UINT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 14:40:32 +10:00
Ilia Mirkin	f920f8eb02	nv50/ir: fix cutoff for using r63 vs r127 when replacing zero The only effect here is a space savings - 822 programs in shader-db affected with the following overall change: total bytes used in shared programs : 44154976 -> 44139880 (-0.03%) Fixes: `641eda0c` (nv50/ir: r63 is only 0 if we are using less than 63 registers) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	44260d9080	nv50/ir: prefer to color mad def and src2 with the same color This allows us to use the short encoding, and potentially fold immediates in later on. total instructions in shared programs : 6379731 -> 6367861 (-0.19%) total gprs used in shared programs : 728502 -> 728683 (0.02%) total local used in shared programs : 9904 -> 9904 (0.00%) total bytes used in shared programs : 44661008 -> 44154976 (-1.13%) local gpr inst bytes helped 0 51 7267 20306 hurt 0 232 125 274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	c1c1248b94	nv50/ir: reduce degree limit on ops that can't encode large reg dests Operations that take immediates can only encode registers up to 64. This fixes a shader in a "Powered by Unity" intro. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	99581ca393	nv50/ir: only unspill once ahead of a group of instructions We already semi-did this but the list of uses was unsorted, so it was unreliable. Sort the uses by bb and serial, and don't unspill for each instruction in a sequence. (And also don't unspill multiple times for a single instruction that uses the value in question multiple times.) This causes a minor reduction in generated instructions for shader-db (as few programs spill) but more importantly it brings determinism to each run's output. On SM10: total instructions in shared programs : 6387945 -> 6379359 (-0.13%) total gprs used in shared programs : 728544 -> 728544 (0.00%) total local used in shared programs : 9904 -> 9904 (0.00%) local gpr inst bytes helped 0 0 322 322 hurt 0 0 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	0f647bd65b	nv50/ir: check if the target supports the new offset before inlining Fixes: `abd326e81b` (nv50/ir: propagate indirect loads into instructions) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93300 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Dave Airlie	a13b14930d	llvmpipe: fix fp64 inputs to geom shader. This fixes the fetching of fp64 inputs to the geometry shader, this fixes the recently posted piglit's arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 13:56:39 +10:00
Jordan Justen	974bdfa9ad	i965: Move brw_cs_fill_local_id_payload to brw_compiler.h Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-08 18:09:31 -08:00
Jordan Justen	d28df86c87	anv/compute: Fix thread width max off by 1 See corresponding code in: commit `8d87070af2` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Thu Aug 28 14:47:19 2014 -0700 i965/cs: Implement brw_emit_gpgpu_walker Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-08 18:09:31 -08:00
Matt Turner	3a7f95b3aa	nir: Optimize useless comparisons against true/false. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> [v1] v2: Move new rule to Boolean simplification section Add a a@bool != true simplification Suggested-by: Neil Roberts <neil@linux.intel.com>	2015-12-08 15:41:08 -08:00
Matt Turner	9e9e6fc8f1	glsl: Switch opcode and avail parameters to binop(). To make it match unop(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-08 15:39:47 -08:00
Matt Turner	dd3c16c94b	glsl_to_tgsi: Skip useless comparison instructions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 15:38:03 -08:00
Matt Turner	eca846e7ae	glsl: Relax qualifier ordering restriction in ES 3.1. ... and allow the "binding" qualifier in ES 3.1 as well. GLSL ES 3.1 incorporates only a few features from the extension ARB_shading_language_420pack: the relaxed qualifier ordering requirements and the binding qualifier. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-08 15:36:57 -08:00
Matt Turner	79da7220db	glsl: Use has_420pack(). These features would not have been enabled with #version 420 otherwise. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 15:36:57 -08:00
Matt Turner	c200e606f7	glsl: Allow binding of image variables with 420pack. This interaction was missed in the addition of ARB_image_load_store. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93266 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-08 15:36:57 -08:00
Jose Fonseca	a9a0c693e5	appveyor: Cache winflexbison archive. Unfortunately the Appveyor -> SourceForge connection seems a bit unreliable, causing frequent build failures while downloading winflexbison (approx once every 2 days). Fetching winflexbison archive into Appveyor's cache should eliminate these. Fetching Python modules from PyPI doesn't seem to be a problem, so they are left alone for now, though they could eventually get the same treatment.	2015-12-08 22:49:38 +00:00
Chad Versace	db66424218	anv: Remove unused anv_image_view_info_for_vk_image_view_type()	2015-12-08 14:25:28 -08:00
Eric Anholt	f61ceeb3fd	vc4: Enable MSAA. We still have several failures in the newly enabled tests in simulation: sRGB downsampling is done as if it was just linear, stencil blits are not supported on MSAA either, and derivatives are still not supported (breaking some MSAA simulation shaders). So, other than sRGB downsampling quality, things seem to be in good shape.	2015-12-08 10:09:52 -08:00
Eric Anholt	fc4a1bfb88	vc4: Add support for mapping of MSAA resources. The pipe_transfer_map API requires that we do an implicit downsample/upsample and return a mapping of that.	2015-12-08 09:49:56 -08:00
Eric Anholt	6b4dfd53ae	vc4: Add support for texel fetches from MSAA resources. This is the core of ARB_texture_multisample. Most of the piglit tests for GL_ARB_texture_multisample require GL 3.0, but exposing support for this lets us use the gallium blitter for multisample resolves. We can sometimes multisample resolve using just the RCL, but that requires that the blit is 1:1, unflipped, and aligned to tile boundaries.	2015-12-08 09:49:55 -08:00
Eric Anholt	a97b40dca4	vc4: Add support for multisample framebuffer operations. This includes GL_SAMPLE_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, and GL_SAMPLE_ALPHA_TO_COVERAGE. I haven't implemented a dithering function yet, and gallium doesn't give me a good chance to do so for GL_SAMPLE_COVERAGE.	2015-12-08 09:49:54 -08:00
Eric Anholt	edc3305de7	vc4: Add a workaround for HW-2905, and an additional failure I saw with MSAA. I only stumbled on this while experimenting due to reading about HW-2905. I don't know if the EZ disable in the Z-clear is actually necessary, but go with it for now.	2015-12-08 09:49:54 -08:00
Eric Anholt	edfd4d853a	vc4: Add support for drawing in MSAA.	2015-12-08 09:49:53 -08:00
Eric Anholt	e7c8ad0a6c	vc4: Add kernel RCL support for MSAA rendering.	2015-12-08 09:49:53 -08:00
Eric Anholt	568d3a8e32	vc4: Rename color_ms_write to color_write. I was thinking this was the only MSAA resolve thing, so it should be noted separately, but actually load/store general also do MSAA resolve.	2015-12-08 09:49:52 -08:00
Eric Anholt	bf92017ace	vc4: Allow RCL blits to the edge of the surface. The recent unaligned fix successfully prevented RCL blits that weren't aligned inside of the surface, but we also want to be able to do RCL blits for the whole surface when the width or height of the surface aren't aligned (we don't care what renders inside of the padding).	2015-12-08 09:49:52 -08:00
Eric Anholt	fb4877dbab	vc4: Add disabled debug printf for describing blits. I keep typing variants of this while debugging RCL blits for MSAA.	2015-12-08 09:49:51 -08:00
Eric Anholt	2792d118f1	vc4: Fix check for tile RCL blits with mismatched y. This was a typo in `3a508a0d94` that didn't show up in testcases at that moment.	2015-12-08 09:49:51 -08:00
Eric Anholt	1529f138ff	vc4: Fix compiler warning from size_t change. I missed this when bringing over the kernel changes.	2015-12-08 09:49:50 -08:00
Olivier Pena	a5256012ef	scons: support for LLVM 3.7. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-08 13:53:31 +00:00
Dave Airlie	bd47fcd57b	docs/GL3.txt: consolidate r600 GL4.1. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-08 20:13:14 +10:00
Jason Ekstrand	18069dce4a	i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	813f0eda8e	i965/nir_uniforms: Replace comps_per_unit with an is_scalar boolean Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	22c273de2b	i965/nir: Remove unused indirect handling The one and only place where the FS backend allows reladdr is on uniforms. For locals, inputs, and outputs, we lower it away before the backend ever sees it. This commit gets rid of the dead indirect handling code. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	abb569ca18	i965/state: Get rid of dword_pitch arguments to buffer functions Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	05bdc21f84	i965/vec4: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	13ad8d03f2	i965/fs: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	e3e70698c3	i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	f4aee5d82f	gen8/cmd_buffer: Flush push constants after descriptor sets This is because, if storage images are used, flushing descriptor sets can cause push constants to become dirty.	2015-12-07 21:45:43 -08:00
Jason Ekstrand	43ac954e25	anv: Add initial support for pushing image params The helper to fill out the image params data-structure is still a dummy, but this puts the infrastructure in place.	2015-12-07 21:08:26 -08:00
Jason Ekstrand	1eb731d9fe	anv/descriptor_set: Add support for storage images in layouts	2015-12-07 21:08:26 -08:00
Jason Ekstrand	ff05f634f6	anv/image: Add a separate storage image surface state Thanks to hardware limitations, storage images may need a different surface format and/or other bits in the surface state.	2015-12-07 21:08:22 -08:00
Jason Ekstrand	8f83222d37	isl: Add initial support for storage images	2015-12-07 21:08:08 -08:00
Ben Widawsky	6ef8149bcd	i965: Fix texture views of 2d array surfaces It is legal to have a texture view of a single layer from a 2D array texture; you can sample from it, or render to it. Intel hardware needs to be made aware when it is using a 2d array surface in the surface state. The texture view is just a 2d surface with the backing miptree actually being a 2d array surface. Because of this, the previous code would not set the right bit in the surface state since it wasn't considered an array texture. I spotted this early on in debug but brushed it off because it is clearly not needed on other platforms (since they all pass). I have no idea how this works properly on other platforms (I think gen7 introduced the bit in the state, but I am too lazy to check). As such, I have opted not to modify gen7, though I believe the current code is wrong there as well. Thanks to Chris for helping me debug this. v2: Just use the underlying mt's target type to make the array determination. This replaces a bug in the first patch which was incorrectly relying only on non-zero depth (not sure how that had no failures). (Ilia) Cc: Chris Forbes <chrisf@ijw.co.nz> Reported-by: Mark Janes <mark.a.janes@intel.com> (Jenkins) References: https://www.opengl.org/registry/specs/ARB/texture_view.txt Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92609 Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-07 18:47:04 -08:00
Nicolai Hähnle	d5a5dbd71f	radeonsi: last_gfx_fence is a winsys fence Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-07 21:15:59 -05:00
Jason Ekstrand	42b4417031	HACK/i965: Disable assign_var_locations on uniforms This conflicts with the way we're doing uniforms in Vulkan.	2015-12-07 17:19:55 -08:00
Jason Ekstrand	cd75ff5d17	anv/pipeline: Only apply a pipeline layout if we have one	2015-12-07 16:56:02 -08:00
Ilia Mirkin	f97f755192	nvc0/ir: fix up mul+add -> mad algebraic opt, enable for integers For some reason this has been disabled for integers ever since codegen was merged, despite there being emission code for IMAD. Seems to work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-07 18:49:28 -05:00
Ilia Mirkin	1d708aacb7	gk110/ir: fix imad sat/hi flag emission for immediate args According to nvdisasm both the immediate and non-imm cases use the same bits. Both of these flags are quite rarely set though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 18:49:28 -05:00
Kenneth Graunke	87a1166310	i965: Add brw_device_info::min_ds_entries field. From the 3DSTATE_URB_DS documentation: "Project: IVB, HSW If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 10 URB entries." "Project: BDW+ If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 34 URB entries." When the HS is run in SINGLE_PATCH mode (the only mode we support today), there is no minimum for HS - it's just zero. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	42ca675cc9	i965: Add state bits for tess stages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	80ea18d1a1	i965: Add backend structures for tess stages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	5340f37902	i965: Set core tessellation-related limits Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Kenneth Graunke	a9e6a56a02	i965: Request lowering of gl_TessLevel* from float[] to vec4s. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Kenneth Graunke	7a17356800	i965: Create new files for HS/DS/TE state upload code. For now, this just splits the existing code to disable these stages into separate atoms/files. We can then replace it with real code. v2: Bump the render atoms in this patch so it compiles (in my branch, I'd bumped it in an earlier patch). 61 seems to be the minimum that works, which doesn't match the old value + the number of atoms I added in this patch, so apparently we had some slop before. v3: Actually disable the DS unit on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Ilia Mirkin	63b850403c	gk104/ir: sampler doesn't matter for txf We actually leave the sampler unset for OP_TXF, which caused the GK104+ logic to treat some texel fetches as indirect. While this works, it's incredibly wasteful. This only happened when the texture was > 0 (since sampler remained == 0). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 16:22:54 -05:00
Marek Olšák	32f05fadbb	radeonsi: disable DCC on Stoney Cc: 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-07 22:01:08 +01:00
Sonny Jiang	2618886600	winsys/amdgpu: addrlib - port a Fiji bug fix Fiji: Fixed tiled resource failures Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> v2: fix a compile failure (typo) - Marek	2015-12-07 21:58:42 +01:00
Sonny Jiang	338d7bf053	winsys/amdgpu: addrlib - port Checks mip 0 for czDispCompatible Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-07 21:58:42 +01:00
Sonny Jiang	676bc25140	winsys/amdgpu: addrlib - port fix error for workaround for 1D tiling Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-07 21:58:42 +01:00
Christian König	a2c5200a4b	st/va: disable MPEG4 by default v2 The workarounds are too hacky to enable them by default and otherwise MPEG4 doesn't work reliably. v2: add docs/envvars.html, CC stable and fix typos Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Cc: "11.1.0" <mesa-stable@lists.freedesktop.org>	2015-12-07 20:34:17 +01:00
Christian König	ca3e2b76c0	st/va: move HEVC functions into separate file v2 v2: actually copy all of it Signed-off-by: Christian König <christian.koenig@amd.com>	2015-12-07 20:34:17 +01:00
Alejandro Piñeiro	3d260cc653	mesa: remove _mesa_tex_target_is_array _mesa_is_array_texture provides the same functionality and: 1. it returns bool instead of GLboolean 2. it's not related to the texture format (texformat.c) 3. the name's a little shorter v2: remove _mesa_tex_target_is_array instead (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-07 20:31:20 +01:00
Alejandro Piñeiro	b16e0ff34e	i965: use _mesa_is_array_texture instead of _mesa_tex_target_is_array Both methods provide the same functionality, so one would be removed. v2: use _mesa_is_array_texture and not the other way (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-07 20:30:24 +01:00
Ilia Mirkin	db072d2086	gk110/ir: fix imul hi emission with limm arg The elemental demo hits this case. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 13:30:17 -05:00
Chad Versace	9098e0f074	anv/image: Refactor anv_image_make_surface() Reduce the number of function parameters. Deduce the anv_image::*_surface from the parameters instead of requiring the caller to do that.	2015-12-07 09:28:14 -08:00
Chad Versace	3d85a28e90	anv: Assert the success of isl_surf_init()	2015-12-07 08:54:59 -08:00
Chad Versace	64e8af69b1	anv: Use isl_tiling_flags in anv_image_create_info Replace anv_image_create_info::force_tiling anv_image_create_info::tiling with the bitmask anv_image_create_info::isl_tiling_flags This allows us to drop the function anv_image.c:choose_isl_tiling_flags().	2015-12-07 08:50:28 -08:00
Chad Versace	c97d8af9aa	anv: Fix anv_gem_set_tiling to respect tiling param Function anv_gem_set_tiling() ignored its 'tiling' parameter. It unconditionally set the bo's tiling to I915_TILING_X.	2015-12-07 08:42:11 -08:00
Chad Versace	01e2932d6a	anv: Remove unused anv_format_s8_uint This is no longer needed after migrating to isl.	2015-12-07 08:40:14 -08:00
Brian Paul	32a6e081c3	svga: use the debug callback to report issues to the state tracker Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn, can report these issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. v2: remove conditionals around pipe_debug_message() calls since the check is now done in the macro itself. v3: remove unneeded dummy %s substitutions Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>, Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-07 08:57:49 -07:00
Brian Paul	5effc3ae74	gallium/util: check callback pointers for non-null in pipe_debug_message() So the callers don't have to do it. v2: also check cb!=NULL in the macro Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-07 08:56:51 -07:00
Abdiel Janulgue	b19546abf3	i965: Add defines for gather push constants v2 (Francisco Jerez): - Rename HSW_GATHER_CONSTANTS_RESERVED to HSW_GATHER_POOL_ALLOC_MUST_BE_ONE. - Rename BRW_GATHER_* prefix to HSW_GATHER_CONSTANT_*. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-12-07 14:58:12 +02:00
Timothy Arceri	9214664aed	mesa: move GLES checks for SSO input/output validation This function is unfinished; there are a bunch more validation rules that need to be applied here. We will still want to call it for desktop GL, we just don't want to validate precision, so move the ES check to reflect this. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-07 21:41:14 +11:00
Timothy Arceri	ad02621854	mesa: move GL_INVALID_OPERATION error to rendering call The validation api doesn't trigger this error so just move it to the code called during rendering. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:41:09 +11:00
Timothy Arceri	4dd096d741	mesa: move pipeline input/output validation inside _mesa_validate_program_pipeline() This allows validation to be done on rendering calls also. Fixes 3 dEQP-GLES31.functional.separate tests. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:41:05 +11:00
Timothy Arceri	da1a01361b	glsl: re-validate program pipeline after sampler change Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> https://bugs.freedesktop.org/show_bug.cgi?id=93180	2015-12-07 21:41:00 +11:00
Dave Airlie	41e82f4f96	r600: apply SIMD workaround to cayman also. At least on ARUBA this is required to stop tessellation hanging in heaven. This removes one of the SIMDs from use by the HS/LS. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Tested-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 18:57:34 +10:00
Dave Airlie	6bf6bdbc2b	r600: fix regression introduced with ring emit changes. This was adding one after a CUT which broke end primitive	2015-12-07 05:44:55 +00:00
Dave Airlie	fc276bda22	r600: remove stale tessellation comment pointed out by Marek. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 11:04:48 +10:00
Dave Airlie	5ca9825758	docs: consolidate r600 entry in GL3.txt Though fp64 emulation still needs to be done for a lot of the evergreen hw.	2015-12-07 10:06:44 +10:00
Dave Airlie	7fa2914b06	docs: update with r600 tessellation status. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	33404f1415	r600: enable tessellation for evergreen/cayman (v2) This enables tessellation for evergreen/cayman, This will need changes before committing depending on what hw works etc. working are CAYMAN/REDWOOD/BARTS/TURKS/SUMO/CAICOS v2: only enable on evergreen and above.	2015-12-07 09:59:02 +10:00
Dave Airlie	a2885d9cf9	r600g: reduce number of ps thread on caicos this allows tess apps to start Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	fe64a0c8bf	r600g: adjust ls/hs thread counts for sumo these stop tess hangs here. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	e7ce9e3bb8	r600/asm: enable nstack check for tess ctrl/eval shaders. This just makes sure they register at least one stack usage frame like vertex shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	bb44c1f036	r600/asm: handle lds read operations. Reads from the queue shouldn't be merged for now, or put in T slots. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	8ec2cb13e5	r600/asm: add LDS ops and barrier to the once per group restriction. LDS ops must be scheduled in X slot, and barrier should be on its own in a group. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	18871ac576	r600: move VGT_VTX_CNT_EN into shader stages atom. This should be enabled for tessellation shaders as well. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	958d617d98	r600: enable tcs/tes dumping for R600_DUMP_SHADERS. Trivial patch just to enable dumping more. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	b8df7d03c8	r600: handle SIMD allocation issue with HS/LS At least one SIMD must be kept away from the HS/LS stages in order to avoid a hw issue on evergreen/cayman. This patch implements this workaround. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	7b5878ee04	r600/shader: increase number of inputs/outputs to 64. Tessellation exceeds these sometimes, so increase them for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Edward O'Callaghan	22058f69fb	r600: handle barrier opcode. This handles the barrier opcode for EG/CM. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	9662a43d23	r600/shader: handle tess related system-values. This adds handling for TESSINNER/TESSOUTER in the TES where they need to be fetched from LDS, and TESSCOORD which comes in via r0. It also handles primitive ID and invocation ID. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	92fbf856f4	r600/shader: allow multi-dimension arrays for tcs/tes inputs/outputs. This just allows multi-dim arrays to be processed. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	30d56d1c00	r600/shader: handle TES exports and streamout when tessellation is enabled the TES shader is responsible for handling streamout and exports. This adds the streamout and export workarounds to TES, and also makes sure TES sets up spi_sid. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	2239f3eaff	r600/shader: emit tessellation factors to GDS at end of TCS. When we are finished the shader, we read back all the tess factors from LDS and write them to special global memory storage using GDS instructions. This also handles adding NOP when GDS or ENDLOOP end the TCS. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	cfc2818e23	r600/shader: handle TCS output writing. TCS outputs whenever they are written in the shader, need to be written to LDS not temporaries, this handles this case. It also fixes up the case where the output is a relative addressed output, so we don't try to apply the relative address at the wrong time. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	892cc65fa3	r600/shader: handle VS shader writing to the LDS outputs. (v1.1) This writes the VS shaders outputs to the LDS memory in the correct places. v1.1: use 24-bit Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	8b2024196f	r600/shader: handle fetching tcs/tes inputs and tcs outputs This handles the logic for doing fetches from LDS for TCS and TES. For TCS we need to fetch both inputs and outputs, for TES only inputs need to be fetched. v2: use 24-bit ops. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	4477be2404	r600/shader: add get_lds_offset0 helper This retrieves the offset into the LDS for a patch or non-patch variable; it takes the RelPatch channel and a temporary register. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	2a9639e41f	r600/shader: add function to get tess constants info This function retrieves the tess input/output info from the tess constant buffer that is bound to the shader. This uses a vfetch to get the values into the shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	0696ebc899	r600/shader: add utility functions to do single slot arithmetic These utilities are to be used to do things like integer adds and multiplies to be used in calculating the LDS offsets etc. It handles CAYMAN MULLO differences as well. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	09d25a9b37	r600/eg: workaround bug with tess shader and dynamic GPRs. When using tessellation on eg/ni chipsets, we must disable dynamic GPRs to workaround a hw bug where the GPU hangs when too many things get queued. This implements something like the r600 code to emit the transition between static and dynamic GPRs, and to statically allocate GPRs when tessellation is enabled. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	d87f54f225	r600/shader: move get_temp and last_instruction helpers up These are required for tess to be used earlier. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	7933ba4d9c	r600: bind geometry shader ring to the correct place When tess/gs are enabled, the geom shader ring needs to bind to the tess eval not the vertex shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	e3ecc28e99	r600: create fixed function tess control shader fallback. If we have no tess control shader, then we have to use a fallback one that just writes the tessellation factors. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	731ff3766f	r600: create LDS info constants buffer and write LDS registers. (v2) This creates a constant buffer with the information about the layout of the LDS memory that is given to the vertex, tess control and tess evaluation shaders. This also programs the LDS size and the LS_HS_CONFIG registers, on evergreen only. v2: calculate lds hs num waves properly (Marek) Emit the state only when something has changed (airlied). Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	38b5ee4796	r600/eg: update shader stage emission/tf param for tess. This updates the setting of the shader stages register when tess is enabled and adds the setting of the VGT_TF_PARAM register from the tess shader properties. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	8874725c84	r600: hook TES/TCS shaders to the selection logic. This hooks the TES/TCS bindings to the HW stages up. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	79d88afd5c	r600: work out bitmask for the used tcs inputs/outputs. This is used later to set up the constants to be given to the tessellation shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	839dae0dc0	r600: port over the get_lds_unique_index from radeonsi On r600 this needs to subtract 9 due to texcoord interactions. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	420afe06d1	r600: add set_tess_state callback. This just stores the values in the context to be used later when emitting the constant buffers. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	7db24b740c	r600/eg: init tess registers to defaults (v1.1) This initialises the tess min/max using fglrx values, and also initialises a number of other registers related to tessellation. v1.1: caicos doesn't have some registers. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	25f96c1120	r600: hook up constants/samplers/sampler view for tessellation This hooks the resources to the correct hw shaders when tess is enabled. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	9f86741863	r600: add create/bind/delete shader hooks for tessellation This hooks up the gallium API for the tessellation shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	797012bb67	r600/sb: add LS/HS hw shader types. This just adds printing for the hw shader types, and hooks it up. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	382e2a2901	r600/blit: add tcs/tes shader saves. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	bdf7dadda8	r600: disable SB for now on tess related shaders. Note we have to disable on vertex shaders when we are operating in tes mode. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	8849867b8a	r600: update correct hw shaders depending on configuration. This updates the tess hw shaders from the sw ones routing things correctly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	b1da110b71	r600: add shader key entries for tcs and tes. with tessellation vs can now run on ls, and tes can run on vs or es, tcs runs on hs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	a131ac73e6	r600: add PATCHES to the pipe conversion. This just converts the value to the hw value. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	0b08a8ade6	r600: add functions to update ls/hs state. This just adds the two functions, these will get hooked up later in the shader code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Glenn Kennard	b2fa64b161	r600g/sb: Support LDS ops in SB bytecode I/O This just adds the LDS ops to the SB bytecode reader/writers. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	816bb30245	r600: add support for LDS instruction encoding. These are used in tessellation shaders to read/write values between VS/TCS/TES. This splits the eg alu assembler out to handle these instructions. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	fe4eb49df9	r600/sb: add support for GDS to the sb decoder/dump. (v1.1) This just adds support to the decoder, not actual SB support. v1.1: fixup GDS relative mode. (Glenn). Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	2b25d9ac7f	r600: add support for GDS clause to the assembler. This just adds enough for the tessellation shaders, which require TF_WRITE to work. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	4f83184eff	r600: use macros for updating the various stages. These macros will make things easier to see when tess is added to the mix. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	85131a5490	r600: add SET_NULL_SHADER macro. This is used to set a hw shader to NULL. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	f395ed8d4c	r600: move clip misc and streamout stream updates to a single place This will be updated in a macro later. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	8a0e21fc5a	r600: move selecting shaders into earlier code. select the ps/gs/vs in that order then process the results. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	3a7232a9a9	r600: use a macro to remove common shader selection code. This function is going to get a lot messier with tessellation so I'm going to use some macros to try and clean some bits of common code up. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	19799a5928	r600: move to using hw stages array for hw stage atoms This moves to using an array of hw stages for the atoms. Note this drops the 23 from the vertex shader, this value is calculated internally when shaders are bound, so not required here. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	bb2b8778cb	r600: make adjust_gprs use hw stages. This changes the r600 specific GPR adjustment code to use the stage defines, and arrays. This is prep work for the tess changes later. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	d1b90839c0	r600: introduce HW shader stage defines Add a list of defines for the HW stages. We will use this for GPR calculations amongst other things. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:58 +10:00
Dave Airlie	bd71f3e4fe	r600: fix masks for two of the unused evergreen regs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:58 +10:00
Edward O'Callaghan	d108b69d2c	gallium: Remove redundant NULL ptr checks Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	13eb5f596b	gallium/drivers: Sanitize NULL checks into canonical form Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they remove the opportunity for accidental assignment, and are clear and succinct. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	150c289f60	gallium/auxiliary: Sanitize NULL checks into canonical form Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they remove the opportunity for accidental assignment, and are clear and succinct. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	147fd00bb3	gallium/auxiliary: Trivial code style cleanup Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:22 +01:00
Edward O'Callaghan	25b3d554c4	gallium/drivers: Trivial code-style cleanup Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:22 +01:00
Edward O'Callaghan	34782eec31	gallium/auxiliary: Fix zero integer literal to pointer comparison Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:02 +01:00
Edward O'Callaghan	3edae10601	winsys/amdgpu: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:54 +01:00
Edward O'Callaghan	82871081fc	svga: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:52 +01:00
Edward O'Callaghan	70d2d3ef7f	llvmpipe: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:47 +01:00
Edward O'Callaghan	be51020f2a	gallium/drivers/nouveau: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:03:17 +01:00
Edward O'Callaghan	7e43a28079	gallium/radeon*: Remove useless casts These are unnecessary and are likely just left overs from prior work. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 11:52:16 +01:00
Ilia Mirkin	0ef5c8ab74	nv50/ir: fold shl + mul with immediates On SM20 this gives: total instructions in shared programs : 6299222 -> 6294240 (-0.08%) total gprs used in shared programs : 944139 -> 944068 (-0.01%) total local used in shared programs : 54116 -> 54116 (0.00%) local gpr inst bytes helped 0 126 2781 2781 hurt 0 55 11 11 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 18:56:43 -05:00
Ilia Mirkin	abd326e81b	nv50/ir: propagate indirect loads into instructions This way $r1 = $r0 + 4; c1[$r1] becomes c1[$r0+4]. On SM35: total instructions in shared programs : 6206257 -> 6185058 (-0.34%) total gprs used in shared programs : 911045 -> 910722 (-0.04%) total local used in shared programs : 39072 -> 39072 (0.00%) local gpr inst bytes helped 0 417 4195 4195 hurt 0 280 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 17:50:23 -05:00
Ilia Mirkin	31fde8faba	nv50/ir: flip shl(add, imm) into add(shl, imm) This works when the add also has an immediate. This often happens in address calculations. These addresses can then be inlined as well. On code targeted to SM35: total instructions in shared programs : 6223346 -> 6206257 (-0.27%) total gprs used in shared programs : 911075 -> 911045 (-0.00%) total local used in shared programs : 39072 -> 39072 (0.00%) local gpr inst bytes helped 0 119 3664 3664 hurt 0 74 15 15 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 17:50:23 -05:00
Eric Anholt	a4eff86f4a	vc4: Fix accidental scissoring when scissor is disabled. Even if the rasterizer has scissor disabled, we'll have whatever vc4->scissor bounds were last set when someone set up a scissor, so we shouldn't clip to them in that case. Fixes piglit fbo-blit-rect, and a lot of MSAA tests once they're enabled.	2015-12-05 13:12:27 -08:00
Eric Anholt	d16d666776	vc4: Disable RCL blitting when scissors are enabled. We could potentially handle scissored blits when they're tile aligned, but it doesn't seem worth it. If you're doing a scissored blit, you're probably a testcase. Fixes piglit's fbo-scissor-blit fbo	2015-12-05 13:12:27 -08:00
Eric Anholt	0afe83078d	vc4: Bring over cleanups from submitting to the kernel.	2015-12-05 13:12:27 -08:00
Samuel Pitoiset	9f6ff76fdc	nvc0: expose a group of performance metrics for SM30 (Kepler) This allows to monitor these performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	0afd8f7bd7	nvc0: re-introduce performance metrics for SM30 (Kepler) This implements more performance metrics than the previous support, but some other metrics still need to be figured out. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	af275b8839	nvc0: remove useless counting operations for MP counters Those bits were related to old performance metrics support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	6667355d4b	nvc0: remove old performance metrics support on Kepler These performance metrics will be re-introduced in an upcoming patch that will follow the same design as Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	662eb434ee	nvc0: remove wrong inst_issued HW SM perf counter on Kepler inst_issued is a performance metric, not a hardware event, on Kepler (SM30). It will be re-introduced in an upcoming patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	342ea31193	nvc0: add missing HW SM perf counters for SM30 (Kepler) SM30 is the compute capability version for GK104/GK106/GK107. This also introduces a new signal group selection called UNK0F. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	7f42688017	nvc0: fix the comment that describes MP counters storage on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Rob Clark	58efff89a2	freedreno/ir3: nir shader prints with 'disasm' debug option Move these to 'disasm' instead of the more verbose 'optmsgs' since, like the tgsi dumps, it is useful without the more verbose compiler logging enabled. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-05 08:48:19 -05:00
Kristian Høgsberg Kristensen	0a5bee1fe6	vk: Don't override and hardcode autoconf CFLAGS To disable optimizations pass CFLAGS="-O0 -g" on the configure command line.	2015-12-04 21:24:15 -08:00
Kristian Høgsberg Kristensen	7337870036	vk: Move isl files to libisl.la helper library These will be in their own library eventually - let's just do that now.	2015-12-04 21:24:15 -08:00
Chad Versace	2f270f0d15	anv/image: Fix choice of isl_surf_usage for depthstencil images Fixes assertion in vkCreateImage when VkFormat is combined depthstencil. Fixed many vulkancts tests that use combined depthstencil. For example, fixes dEQP-VK.pipeline.depth.format.d16_unorm_s8_uint.compare_ops.\ not_equal_less_or_equal_not_equal_greater.	2015-12-04 16:37:05 -08:00
Chad Versace	a09b4c298c	anv: Add func anv_get_isl_format()	2015-12-04 16:37:05 -08:00
Chad Versace	8b9ceda9f1	anv/image: Delete old ifdef'd out code	2015-12-04 16:37:05 -08:00
Jason Ekstrand	4dd5ef9e09	vk: Add needed builddir subdirectories to the include path This fixes out-of-tree builds and closes #1	2015-12-04 15:48:27 -08:00
Kristian Høgsberg Kristensen	f1f78a371e	vk: gem handles are uint32_t No functional difference, but let's be consistent with the kernel API.	2015-12-04 12:53:27 -08:00
Ilia Mirkin	a3f90ef0a6	gallium/util: fix pipe_debug_message macro to allow 0 args Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-12-04 15:24:17 -05:00
Jason Ekstrand	8f722c2fa3	vk: Update the README for 0.210.1	2015-12-04 11:08:45 -08:00
Kristian Høgsberg	dac57750db	vk: Turn on Bay Trail, Cherryview and Broxton support	2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen	bbb6875f35	vk: Map uncached, coherent memory as write-combine This gives us the required characteristics for the memory type.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg Kristensen	c3c61d210f	vk: Expose two memory types for non-LLC GPUs We're required to expose a host-visible, coherent memory type. On big core GPUs that share LLC, we can expose one such memory type that's also cached. However, on non-LLC GPUs we can't both be cached and coherent. Thus, we expose both the required coherent type and the cached but non-coherent combination.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	773592051b	vk: clflush all state for non-LLC GPUs	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	b431cf59a3	vk: Set I915_CACHING_NONE for userptr BOs when !llc Regular objects are created I915_CACHING_CACHED on LLC platforms and I915_CACHING_NONE on non-LLC platforms. However, userptr objects are always created as I915_CACHING_CACHED, which on non-LLC means snooped. That can be useful but comes with a bit of overhead. Since we're explicitly clflushing and don't want the overhead we need to turn it off.	2015-12-04 09:51:47 -08:00
Kristian Høgsberg	e0b5f0308c	vk: Implement vkFlushMappedMemoryRanges() We'll do a runtime switch on device->info.has_llc for now.	2015-12-04 09:51:47 -08:00
Eric Anholt	a69ac4e89c	vc4: Add debug dumping of MSAA surfaces.	2015-12-04 09:24:36 -08:00
Eric Anholt	3c3b1184eb	vc4: Add support for laying out MSAA resources. For MSAA, we store full resolution tile buffer contents, which have their own tiling format. Since they're full resolution buffers, we have to align their size to full tiles.	2015-12-04 09:24:36 -08:00
Eric Anholt	74c4b3b80c	vc4: Add support for storing sample mask. From the API perspective, writing 1 bits can't turn on pixels that were off, so we AND it with the sample mask from the payload.	2015-12-04 09:23:55 -08:00
Eric Anholt	3a508a0d94	vc4: Fix up tile alignment checks for blitting using just an RCL. We were checking that the blit started at 0 and was 1:1, but not that it went to the full width of the surface, or that the width was aligned to a tile. We then told it to blit to the full width/height of the surface, causing contents to be stomped in a bunch of MSAA tests that happen to include half-screen-width blits to 0,0.	2015-12-04 09:10:53 -08:00
Eric Anholt	a664233042	vc4: Add support for loading sample mask.	2015-12-04 09:10:53 -08:00
Rob Clark	4b18d51756	freedreno/ir3: convert scheduler back to recursive algo I've played with a few different approaches to tweak instruction priority according to how much they increase/decrease register pressure, etc. But nothing seems to change the fact that compared to original (pre-multiple-block-support) scheduler, in some edge cases we are generating shaders w/ 5-6x higher register usage. The problem is that the priority queue approach completely loses the dependency between instructions, and ends up scheduling all paths at the same time. Original reason for switching was that recursive approach relied on starting from the shader outputs array. But we can achieve more or less the same thing by starting from the depth-sorted list. shader-db results: total instructions in shared programs: 113350 -> 105183 (-7.21%) total dwords in shared programs: 219328 -> 211168 (-3.72%) total full registers used in shared programs: 7911 -> 7383 (-6.67%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 21294 -> 21294 (0.00%) half full const instr dwords helped 0 322 0 711 215 hurt 0 163 0 38 4 The shaders hurt tend to gain a register or two. While there are also a lot of helped shaders that only lose a register or two, the more complex ones tend to lose significantly more registers used. In some more extreme cases, like glsl-fs-convolution-1.shader_test it is more like 7 vs 34 registers! Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Rob Clark	ad2cc7bddc	freedreno/ir3: don't reuse a0.x across blocks It causes confusion in sched if we need to split_addr() since otherwise we wouldn't easily know which block the new addr instr will be scheduled in. So just side-step the whole situation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Rob Clark	8e52344dc1	freedreno/ir3: rename ir3_block::bd We'll need to add similar for ir3_instruction, but following the pattern to use 'id' seems confusing. Let's just go w/ generic 'data' as the name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Giuseppe Bilotta	d566382a98	util: fix comment typo Undefining the NDEBUG is relevant for release build, as they are the ones that set it. [Emil Velikov: split from previous patch] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	efaac624af	xvmc: force assertion in XvMC tests This follows the src/util/u_atomic_test.c model of undefining NDEBUG unconditionally throughout the XvMC tests, to force asserts regardless of debug mode. The comment on u_atomic_test.c is also fixed (read 'debug' where it should have been 'release'). v2: s/debug/release/ in relevant comments Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> [Emil Velikov: keep the src/util/ hunk as separate patch] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	4839353634	radeon: const correctness Add missing `const` specifier for pointer pointing to a const struct. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	d61802b5e0	radeon: whitespace cleanup Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:38 +00:00
Emil Velikov	1074e38fbb	mesa/tests: add KHR_debug GLES glGetPointervKHR entry points Should have been part of commit `f53f9eb8d4` "glapi: add GetPointervKHR to the ES dispatch". v2: comment out the ES1.1 symbol and use the same description (pattern) as elsewhere (Matt) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235 Fixes: `f53f9eb8d4` "glapi: add GetPointervKHR to the ES dispatch". Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> (v1) Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-04 13:56:43 +00:00
Jason Ekstrand	b715e6d528	i965/vec4: Stop pretending to support indirect output stores Since we're using nir_lower_outputs_to_temporaries to shadow all our outputs, it's impossible to actually get an indirect store. The code we had to "handle" this was pretty bogus as it created a register with a reladdr and then stuffed it in a fixed varying slot without so much as a MOV. Not only does this not do the MOV, it also puts the indirect on the wrong side of the transaction. Let's just delete the broken dead code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Jason Ekstrand	aa35b0c2c7	i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Jason Ekstrand	c6bcc23369	nir/lower_io: Pass the builder and type_size into get_io_offset Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Ilia Mirkin	204f803ce0	nv50/ir: replace zeros in movs as well The original change to put zeroes directly into instructions created conditional mov's with the zero immediate. However that can't be emitted, so make sure to replace the zero with r63. Fixes: `52a800a68` (nv50/ir: allow immediate 0 to be loaded anywhere) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-03 23:46:02 -05:00
Ilia Mirkin	a3722b81f5	nv50/ir: fold fma/mad when all 3 args are immediates This happens pretty rarely, but might as well do it when it does. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-03 23:02:57 -05:00
Ilia Mirkin	2b98914fe0	nv50/ir: avoid looking at uninitialized srcMods entries Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 23:02:57 -05:00
Ilia Mirkin	49692f86a1	nv50/ir: fix DCE to not generate 96-bit loads A situation where there's a 128-bit load where the last component gets DCE'd causes a 96-bit load to be generated, which no GPU can actually emit. Avoid generating such instructions by scaling back to 64-bit on the first load when splitting. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 23:02:57 -05:00
Roland Scheidegger	51140f452a	draw: fix clipping of layer/vp index outputs This was just plain broken. It used always the value from v0 (for vp_index) but would pass the value from the provoking vertex to later stages - but only if there was a corresponding fs input, otherwise the layer/vp index would get lost completely (as it would try to interpolate the (unsigned) values as floats). So, make it obey provoking vertex rules (drivers relying on draw will need to do the same). And make sure that the default interpolation mode (when no corresponding fs input is found) for them is constant. Also, change the code a bit so constant inputs aren't interpolated then copied over later. Fixes the new piglit test gl-layer-render-clipped. v2: more consistent whitespaces fixes for function defs, and more tab killing (overall still not quite right however). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Roland Scheidegger	5ea5b169e9	softpipe: use provoking vertex for layer Same as for llvmpipe, albeit softpipe only really handles multiple layers, not multiple viewports/scissors. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Roland Scheidegger	ddaf8d7b10	llvmpipe: use provoking vertex for layer/viewport d3d10 actually requires using provoking (first) vertex. GL is happy with any vertex (as long as we say it's undefined in the corresponding queries). Up to now we actually used vertex 0 for viewport index, and vertex 1 for layer (for tris), which really didn't make sense (probably a typo). Also, since we reorder vertices of clockwise triangle, that actually meant we used a different vertex depending if the triangle was cw or ccw (still ok by gl). However, it should be consistent with what draw (clip) does, and using provoking vertex seems like the sensible choice (draw clip will be fixed next as it is totally broken there). While here, also use the correct viewport always even when not needed in setup (we pass it down to jit fragment shader it might be needed there for getting correct near/far depth values). No piglit changes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Jason Ekstrand	cb2382882e	nir/spirv: Update to SPIR-V version 1.0	2015-12-03 18:28:10 -08:00
Eric Anholt	83e65ca831	vc4: Add the RCL to CL debug dumping when in simulator mode. We can't dump it in the real driver, since the kernel doesn't give us a handle to it (except after a GPU hang, using a root ioctl). In the simulator we can.	2015-12-03 18:20:39 -08:00
Chad Versace	371fc2bc20	anv/gen9: Fix SURFACE_STATE halign and valign Pre-Skylake, RENDER_SURFACE_STATE.SurfaceVerticalAlignment is in units of surface samples. A surface sample is equivalent to a pixel in all surfaces except interleaved multisample surfaces. In Skylake, it is in units of surface elements. A surface element is equivalent to a surface sample except for compressed formats, in which case the element is a compression block.	2015-12-03 15:33:08 -08:00
Chad Versace	981ef2f02d	anv: Embed isl_surf into anv_surface This reduces struct anv_surface to just two members: an offset and the embedded isl_surf.	2015-12-03 15:31:00 -08:00
Chad Versace	594e673fcc	anv/image: Drop assertions on SURFTYPE extent limits In anv_image_create(), stop asserting that VkImageCreateInfo::extent does not exceed the hardware limits for the given SURFTYPE. The assertions were incorrect because they did not take into account the hardware gen. Anyways, these types of assertions belong in isl, not anvil.	2015-12-03 15:29:52 -08:00
Chad Versace	b369389640	anv/image: Use isl to calculate surface layout Remove the surface layout calculations in anv_image_make_surface(). Let isl_surf_init() do the heavy lifting. Fixes 8 Crucible tests and regresses none. (hw=Broadwell and crucible@33d91ec).	2015-12-03 15:29:08 -08:00
Chad Versace	afdadec77f	isl: Implement isl_surf_init() for gen4-gen9 This is a big code push. The patch is about 3000 lines. Function isl_surf_init() calculates the physical layout of a surface. The implementation is "complete" (but untested) for all 1D, 2D, 3D, and cube surfaces for gen4 through gen9, except: * gen9 1D surfaces * gen9 Ys multisampled surfaces * auxiliary surfaces (such as hiz, mcs, ccs)	2015-12-03 15:26:11 -08:00
Chad Versace	bda43a0f59	isl: Rename legacy Y tiling to ISL_TILING_Y0 Rename legacy Y tiling from ISL_TILING_Y to ISL_TILING_Y0 in order to clearly distinguish it from Yf and Ys. Using ISL_TILING_Y to denote legacy Y tiling would lead to confusion with i965, because i965 uses I915_TILE_Y to denote any Y tiling.	2015-12-03 15:26:11 -08:00
Chad Versace	57941b61ab	anv/image: Vulkan's depthPitch is in bytes, not rows Fix for VkGetImageSubresourceLayout.	2015-12-03 15:26:11 -08:00
Jason Ekstrand	bfeaf67391	anv/device: Give a version of 0.210.1 in apiVersion	2015-12-03 15:23:33 -08:00
Jason Ekstrand	d666487dc6	vk: Add new WSI support and bump the API to 0.210.1	2015-12-03 15:15:29 -08:00
Marek Olšák	dd27825c8c	radeonsi: fix Fiji for LLVM <= 3.7 Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-03 23:55:23 +01:00
Marek Olšák	bfc14796b0	radeonsi: fix occlusion queries on Fiji Tested.	2015-12-03 23:46:37 +01:00
Marek Olšák	0b03f2def0	radeonsi: dump init_config IBs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	3a6de8c86e	radeonsi: print framebuffer info into ddebug logs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	a0bfb2798d	gallium/radeon: print more info about HTILE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	1cca259d99	gallium/radeon: print more info about CMASK Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	84fbb0aff9	gallium/radeon: rename fmask::pitch -> pitch_in_pixels Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	19eaceb6ed	gallium/radeon: print more information about textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	2d712d35c5	gallium/radeon: move printing texture info into a separate function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	c60d49161e	gallium/radeon: remove unused r600_texture::pitch_override Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	75d64698f0	gallium/radeon: remove DBG_TEXMIP we don't need 2 flags for dumping texture info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Edward O'Callaghan	a5055e2f86	gallium/aux/util: Trivial, we already have format use it No need to dereference again, fixup for clarity. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-03 23:41:23 +01:00
Jason Ekstrand	fde60c1684	anv/entrypoints: Run the headers through the preprocessor first This allows us to filter based on preprocessor directives. We could build a partial preprocessor into the generator, but we would likely get it wrong. This allows us to filter out, for instance, windows-specific WSI stuff.	2015-12-03 14:13:55 -08:00
Jose Fonseca	5294debfa4	automake: Fix typo in MSVC2008 compat flags. It should be MSVC2008_COMPAT_CFLAGS and not MSVC2008_COMPAT_CXXFLAGS. This is why the recent util_blitter breakage went unnoticed on autotools builds. Trivial.	2015-12-03 22:00:49 +00:00
Jose Fonseca	071af9a511	ttn: Whitelist from -Werror=declaration-after-statement. nir is the exception among gallium/auxiliary -- we don't need to compile it with MSVC2008 yet. And this enables us to use -Werror=declaration-after-statement in the next commit as we should, without complicated fixes to tgsi_to_nir module. Trivial. Tested with GCC and Clang.	2015-12-03 22:00:49 +00:00
Jason Ekstrand	4c19243562	vk/0.210.0: Advertise version 0.210.0	2015-12-03 13:44:02 -08:00
Jason Ekstrand	888744cabf	vk/0.210.0: Update queries to the new API	2015-12-03 13:44:02 -08:00
Jason Ekstrand	924fbfc9a1	vk/0.210.0: Fix how we handle access flags in barriers The initial implementation in the 0.210.0 API update was misguided as to what the access flags meant. This should be more correct.	2015-12-03 13:44:02 -08:00
Jason Ekstrand	fa2435de3c	vk/0.210.0: Update the VkFormat enum	2015-12-03 13:44:02 -08:00
Jason Ekstrand	4e904a0310	vk/0.210.0: Rework vkQueueSubmit	2015-12-03 13:44:02 -08:00
Jason Ekstrand	5757ad2959	vk/0.210.0: Remove depth clip and add depth clamp	2015-12-03 13:43:59 -08:00
Jason Ekstrand	d689745303	vk/0.210.0: Rework device features and limits	2015-12-03 13:43:54 -08:00
Jason Ekstrand	74c4c4acb6	vk/0.210.0: Rework QueueFamilyProperties	2015-12-03 13:43:54 -08:00
Jason Ekstrand	fed3586f34	vk/0.210.0: Rework result and structure type enums By and large, this is just moving enum values around. However, it also removed VK_UNSUPPORTED which we were returning in a number of places. Those places now return VK_ERROR_INCOMPATIBLE_DRIVER.	2015-12-03 13:43:54 -08:00
Jason Ekstrand	a5f19f64c3	vk/0.210.0: Remove the VkShaderStage enum This made for an unfortunately large amount of work since we were using it fairly heavily internally. However, gl_shader_stage does basically the same things, so it's not too bad.	2015-12-03 13:43:54 -08:00
Jason Ekstrand	e10dc002e9	vk/0.210.0: Remove VkShader	2015-12-03 13:43:54 -08:00
Jason Ekstrand	e6ab06ae7f	vk/0.210.0: Rework memory property flags	2015-12-03 13:43:54 -08:00
Jason Ekstrand	93071482f9	vk/0.210.0: Remove some unused enum values	2015-12-03 13:43:54 -08:00
Jason Ekstrand	b264012fcf	vk/0.210.0: Update VkPipelineStageFlagBits	2015-12-03 13:43:54 -08:00
Jason Ekstrand	1aaf15bf19	vk/0.210.0: Trivial function argument name change	2015-12-03 13:43:53 -08:00
Jason Ekstrand	938a2939c8	vk/0.210.0: We now allocate command buffers; not create them	2015-12-03 13:43:53 -08:00
Jason Ekstrand	5a02441789	vk/0.210.0: Rename a parameter to GetImageSparseMemoryRequirements	2015-12-03 13:43:53 -08:00
Jason Ekstrand	a9fc0ce0e3	vk/0.210.0: Delete three no longer existent entrypoints	2015-12-03 13:43:53 -08:00
Jason Ekstrand	fcfb404a58	vk/0.210.0: Rework allocation to use the new pAllocator's	2015-12-03 13:43:53 -08:00
Jason Ekstrand	d3547e7334	vk/0.210.0: Use VkSampleCountFlagBits for sample counts	2015-12-03 13:43:53 -08:00
Jason Ekstrand	9349625d60	vk/0.210.0: Rework VkInstanceCreateInfo	2015-12-03 13:43:53 -08:00
Jason Ekstrand	c30a021820	vk/0.210.0: More function argument renaming	2015-12-03 13:43:53 -08:00
Jason Ekstrand	b1cd025b88	vk/0.210.0: Replace MemoryInput/OutputFlags with AccessFlags	2015-12-03 13:43:53 -08:00
Jason Ekstrand	43f3e92348	vk/0.210.0: Rework render pass description structures	2015-12-03 13:43:53 -08:00
Jason Ekstrand	299f8f1511	vk/0.210.0: More structure field renaming	2015-12-03 13:43:53 -08:00
Jason Ekstrand	407b8cc5e0	vk/0.210.0: Get rid of VkImageAspect	2015-12-03 13:43:53 -08:00
Jason Ekstrand	3f6abd0161	vk/0.210.0: Rework descriptor sets	2015-12-03 13:43:52 -08:00
Jason Ekstrand	6a6da54ccb	vk/0.210.0: Rename parameters to memory binding/mapping functions	2015-12-03 13:43:52 -08:00
Jason Ekstrand	aadb7dce9b	vk/0.210.0: Update to the new instance/device create structs	2015-12-03 13:43:52 -08:00
Jason Ekstrand	607fe31598	vk/0.210.0: More trivial struct/enum changes	2015-12-03 13:43:52 -08:00
Jason Ekstrand	dde7172a8a	vk/0.210.0: Trivial flag enum updates	2015-12-03 13:43:52 -08:00
Jason Ekstrand	4cf0b57bbf	vk/0.210.0: Rename ChannelFlags to ColorComponentFlags	2015-12-03 13:43:52 -08:00
Jason Ekstrand	7f2284063d	vk/0.210.0: s/raster/rasterization/	2015-12-03 13:43:52 -08:00
Jason Ekstrand	1ab9f843bc	vk/0.210.0: Don't allow chaining of description structs	2015-12-03 13:43:52 -08:00
Jason Ekstrand	17486b8664	vk/0.210.0: More fun with flags fields	2015-12-03 13:43:52 -08:00
Jason Ekstrand	f5ba1f994a	vk/0.210.0: Make pCode a uint32_t pointer	2015-12-03 13:43:52 -08:00
Jason Ekstrand	5f348bd0e5	vk/0.210.0: Rename origin fields of VkViewport	2015-12-03 13:43:52 -08:00
Jason Ekstrand	9fa6e328eb	vk/0.210.0: Move alphaToOne and alphaToCoverage to multisample state	2015-12-03 13:43:52 -08:00
Jason Ekstrand	f97c3b6d58	vk/0.210.0: Add flags fields to various pipeline create structs	2015-12-03 13:43:51 -08:00
Jason Ekstrand	e673d64209	vk/0.210.0: Change field names in vertex input structs	2015-12-03 13:43:51 -08:00
Jason Ekstrand	fd53603e42	vk/0.210.0: Misc. no-op structure changes The only non-trivial change is to sparse resources that we don't handle anyway.	2015-12-03 13:43:51 -08:00
Jason Ekstrand	fe644721aa	vk/0.210.0: Rename property pCount parameters	2015-12-03 13:43:51 -08:00
Jason Ekstrand	e8f2294cd2	vk/0.210.0: Rework sampler filtering and mode enums	2015-12-03 13:43:51 -08:00
Jason Ekstrand	2e10ca5748	vk/0.210.0: Misc. function argument renames	2015-12-03 13:43:51 -08:00
Jason Ekstrand	569f70be56	vk/0.210.0: Rework copy/clear/blit API	2015-12-03 13:43:47 -08:00
Emil Velikov	5a23f6bd8d	mesa: rework the meaning of gl_debug_message::length Currently it stores strlen(buf) whenever the user originally provided a negative value for length. Although I've not seen any explicit text in the spec, CTS requires that the very same length (be that negative value or not) is returned back on Pop. So let's push down the length < 0 checks, tweak the meaning of gl_debug_message::length and fix GetDebugMessageLog to add and count the null terminators, as required by the spec. v2: return correct total length in GetDebugMessageLog v3: rebase (drop _mesa_shader_debug hunk). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:19 +00:00
Emil Velikov	622186fbdf	mesa: errors: validate the length of null terminated string We're about to rework the meaning of gl_debug_message::length to only store the user provided data. Thus we should add an explicit validation for null terminated strings. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:08 +00:00
Emil Velikov	66fea8bd96	mesa: accept TYPE_PUSH/POP_GROUP with glDebugMessageInsert These new (relative to ARB_debug_output) tokens have been explicitly separated from the existing ones in the spec text, with the reference to glDebugMessageInsert dropped. At the same time, further down the spec says: "The value of <type> must be one of the values from Table 5.4" ... and these two are listed in Table 5.4. The GL 4.3 and GLES 3.2 do not give any hints on the former 'definition', plus CTS requires that the tokens are valid values for glDebugMessageInsert. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:08 +00:00
Emil Velikov	53be28107b	mesa: add SEVERITY_NOTIFICATION to default state As per the spec quote: "All messages are initially enabled unless their assigned severity is DEBUG_SEVERITY_LOW" We already had MEDIUM and HIGH set, let's toggle NOTIFICATION as well. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:07 +00:00
Emil Velikov	078dd6a0b4	mesa: return the correct value for GroupStackDepth We already have one group (the default) as specified in the spec. So let's return its size, rather than the index of the current group. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:20:58 +00:00
Emil Velikov	f39954bf7c	mesa: rename GroupStackDepth to CurrentGroup The variable is used as the actual index, rather than the size of the group stack - rename it to reflect that. Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Emil Velikov	1ca735701b	mesa: do not enable KHR_debug for ES 1.0 The extension requires (cough implements) GetPointervKHR (alias of GetPointerv) which in itself is available for ES 1.1 enabled mesa. Anyone willing to fish around and implement it for ES 1.0 is more than welcome to revert this commit. Until then let's restrict things. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93048 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Emil Velikov	f53f9eb8d4	glapi: add GetPointervKHR to the ES dispatch The KHR_debug extension implements this. Strictly speaking it could be used with ES 1.0, although as the original function is available on ES 1.1, I'm inclined to lift the KHR_debug requirement to ES 1.1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93048 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Nanley Chery	808e752796	mesa/version: Update gl_extensions::Version during version override Commit `a16ffb743c`, which introduced gl_extensions::Version, updates the field when the context version is computed and when entering/exiting meta. Update this field when the version is overridden as well. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-03 10:20:34 -08:00
Brian Paul	a0f1bc18e5	mesa: print enum names rather than hexadecimal values in error messages Trivial.	2015-12-03 09:40:43 -07:00
Brian Paul	72a913ceb8	st/wgl: add new stw_ext_rendertexture.c file This should have been included in the previous commit. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-12-03 09:33:55 -07:00
Brian Paul	e832b5b7fa	st/wgl: add support for WGL_ARB_render_texture There are a few legacy OpenGL apps on Windows which need this extension. We basically use glCopyTex[Sub]Image to implement wglBindTexImageARB (see the implementation notes for details). v2: refactor code to use st_copy_framebuffer_to_texture() helper function. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-03 09:12:20 -07:00
Brian Paul	47b9ef872b	st/mesa: add new st_copy_framebuffer_to_texture() function This helper is used by the WGL state tracker to implement the wglBindTexImageARB() function. This is basically a new "meta" function. However, we're not putting it in the src/mesa/drivers/common/ directory because that code is not linked with gallium-based drivers. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-03 08:34:24 -07:00
Juha-Pekka Heikkila	d6d90750f1	glsl: remove useless null checks and make match_explicit_outputs_to_inputs() static match_explicit_outputs_to_inputs() cannot get null inputs, and if it ever did, triggering the first null check would cause a segfault later in the function. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> CC: timothy.arceri@collabora.com Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 10:56:35 +02:00
Tapani Pälli	231db5869c	i965: use _Shader to get fragment program when updating surface state Atomic counters and Images were using ctx::Shader that does not take into account program pipeline changes, ctx::_Shader must be used for SSO to work. Commit `c0347705` already changed ubo's to use this. Fixes failures seen with following Piglit test: arb_separate_shader_object-atomic-counter Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 08:08:07 +02:00
Ilia Mirkin	6c6f28c35e	nv50/ir: fix moves to/from flags Noticed this when looking at a trace that caused flags to spill to/from registers. The flags source/destination wasn't encoded correctly according to both envydis and nvdisasm. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	101e315cc1	nv50/ir: don't forget to mark flagsDef on cvt in txb lowering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	06055121e6	nv50/ir: fix instruction permutation logic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	11fcf46590	nv50/ir: the mad source might not have a defining instruction For example if it's $r63 (aka 0), there won't be a definition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:37 -05:00
Ilia Mirkin	52b68375ae	nv50/ir: make sure entire graph is reachable The algorithm expects the entire CFG to be reachable, so make sure that we hit every node. Otherwise we will end up with uninitialized data, memory corruption, etc. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	adcc547bfb	nv50/ir: deal with loops with no breaks For example if there are only returns, the break bb will not end up part of the CFG. However there will have been a prebreak already emitted for it, and when hitting the RET that comes after, we will try to insert the current (i.e. break) BB into the graph even though it will be unreachable. This makes the SSA code sad. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	ff61ac4838	nvc0/ir: fold postfactor into immediate SM20-SM50 can't emit a post-factor in the presence of a long immediate. Make sure to fold it in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	52a800a687	nv50/ir: allow immediate 0 to be loaded anywhere There's a post-RA fixup to replace 0's with $r63 (or $r127 if too many regs are used), so just as nvc0, let an immediate 0 be loaded anywhere. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 18:51:15 -05:00
Kenneth Graunke	043d427538	i965: Add INTEL_DEBUG=perf information for GS recompiles. Surprisingly, this didn't exist at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-02 15:23:01 -08:00
Kenneth Graunke	b6d4f051a5	i965: De-duplicate key_debug() function. This appeared in brw_vs.c and brw_wm.c, should have appeared in brw_gs.c, and was soon going to have to be in brw_tcs.c and brw_tes.c as well. So, instead, move it to a central location (which has to know about both struct brw_context and perf_debug()). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-02 15:22:58 -08:00
Samuel Pitoiset	8482763d35	nv50/ir/gk110: add memory barriers support for GK110 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 22:44:53 +01:00
Samuel Pitoiset	c672bf3b04	nv50/ir: do not call textureMask() for surface ops That texture mask thing doesn't seem to be needed for surface ops, so just as nve4+, let's do that only for texture ops. This fixes a segfault with 'test_surface_st' from gallium/tests/trivial/compute.c on Fermi because this test uses sustp. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 22:10:44 +01:00
Jose Fonseca	9e6af56666	appveyor: Initial integration. AppVeyor doesn't require an appveyor.yml in the repos (in fact it has some limitations as noted in comments below), but doing so has two great advantages over the web UI: - appveyor.yml can be revisioned together with the code, so instructions should always be in synch with the code - appveyor.yml can be reused for people's private repositories (be on fdo or GitHub, etc.) Acked-by: Roland Scheidegger <sroland@vmware.com>	2015-12-02 19:40:53 +00:00
Jose Fonseca	4a3d388834	util/blitter: Fix "ISO C90 forbids mixed declarations and code". Trivial.	2015-12-02 17:49:20 +00:00
Brian Paul	d31065cbf6	mesa: print enum string in compressed_subtexture_error_check() error msg Trivial.	2015-12-02 10:28:15 -07:00
Edward O'Callaghan	772f429f0a	gallium/util: Fix util_blitter_clear_depth_stencil() for num_layers>1 Previously util_blitter_clear_depth_stencil() could not clear more than the first layer. We need to generalise this as we did for util_blitter_clear_render_target(). Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-02 18:23:43 +01:00
Edward O'Callaghan	8f2c5e281d	gallium/util: Fix util_blitter_clear_render_target() for num_layers>1 Previously util_blitter_clear_render_target() could not clear more than the first layer. We need to generalise this so that ARB_clear_texture can pass the 3d piglit test. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-02 18:23:43 +01:00
Roland Scheidegger	09f74e6ef4	mesa: fix VIEWPORT_INDEX_PROVOKING_VERTEX and LAYER_PROVOKING_VERTEX queries These are implementation-dependent queries, but so far we just returned the value of whatever the current provoking vertex convention was set to, which was clearly wrong. Just make this a variable in the context constants like for other things which are implementation dependent (I assume all drivers will want to set this to the same value for both queries), and set it to GL_UNDEFINED_VERTEX which is correct for everybody (and drivers can override it). Reviewed-by: Brian Paul <brianp@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2015-12-02 18:20:57 +01:00
Jose Fonseca	56aff6bb4e	Remove Sun CC specific code. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2015-12-02 07:51:04 +00:00
Jose Fonseca	51564f04b7	configure.ac: Refuse to build with Sun C compiler. https://bugs.freedesktop.org/show_bug.cgi?id=93189 Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2015-12-02 07:51:04 +00:00
Eric Anholt	18f8da7865	travis: Add a test build with scons. Since I just broke the scons build, I figured I'd make Travis test that I don't break it again in the future. The script runs the builds in parallel across VMs, so it still takes just 5 minutes to turn around results. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-01 15:09:56 -08:00
Kenneth Graunke	975b1299dd	i965: Increase BRW_MAX_UBO to 14. The NVIDIA binary driver and Intel's closed source driver both expose 14 here, rather than the GL minimum of 12. Let's follow suit. Without this, Shadow of Mordor fails to render correctly and triggers OpenGL errors: Mesa: User error: GL_INVALID_VALUE in glBindBufferBase(index=68) Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block binding 68 >= 60) There are 5 stages (VS, TCS, TES, GS, FS), and 12 * 5 = 60 is too small. 14 * 5 = 70 will work just fine. Tapani believes this will also help Alien Isolation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-12-01 14:55:33 -08:00
Matt Turner	7e6a6f3e61	i965: Do dead-code elimination in a single pass. The first pass marked dead instructions as opcode = NOP, and a second pass deleted those instructions so that the live ranges used in the first pass wouldn't change. But since we're walking the instructions in reverse order, we can just do everything in one pass. The only thing we have to do is walk the blocks in reverse as well. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-01 14:48:55 -08:00
Matt Turner	5a6f0bf5b8	glsl: Rename safe_reverse -> reverse_safe. To match existing foreach_in_list_reverse_safe. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-01 14:48:55 -08:00
Matt Turner	48b4e88d3d	i965: Don't mark dead instructions' sources live. Removes dead code from glsl-mat-from-int-ctor-03.shader_test. Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-01 14:48:55 -08:00
Dave Airlie	0e49151dcf	r600: set mega fetch count to 16 for gs copy shader Seems like MFC should be set for this shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	4ebcf5194d	r600: increment ring index after emit vertex not before. The docs say we should send the emit after the ring writes, so let's do that and not have an ALU in between. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	13b134a443	r600: add alu + cf nop to copy shader on r600 SB suggests we do this for r600, so let's do it, for the copy shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	af4013d26b	r600: SMX returns CONTEXT_DONE early workaround streamout, gs rings bug on certain r600s, requires a wait idle before each surface sync. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:00 +10:00
Dave Airlie	b63944e8b9	r600: do SQ flush ES ring rolling workaround Need to insert a SQ_NON_EVENT whenever geometry shaders are enabled. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:24:32 +10:00
Samuel Pitoiset	ea33920f7e	nv50,nvc0: allow to create resources other than buffers For the compute support, we might stick buffers as surfaces. This fixes an assertion when executing src/gallium/tests/trivial/compute. To avoid using these "restricted" surfaces as render targets, these assertions have been moved. Note that it's already handled for the framebuffer thing on nvc0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-01 22:55:14 +01:00
Brian Paul	f391b95105	glapi: work-around MSVC 65K string length limitation for enums.c String literals cannot exceed 65535 characters for MSVC. Instead of emitting a string, emit an array of characters. v2: fix indentation and add comment in the gl_enums.py file about this ugliness. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 14:28:45 -07:00
Eric Anholt	148c2f5b17	mapi: Fix enums.c build with other build systems. Tested with scons (by both myself and Mark Janes), Android is just copy and paste.	2015-12-01 12:19:02 -08:00
Eric Anholt	1c0ac1976a	travis: Initial import of travis instructions. This just builds/installs our dependencies, and runs "make check". I'm interested in integrating more tests into it, but this seems like a pretty easy first start. If your personal branches of Mesa are on github, you can enable it on your account and the repository (see https://docs.travis-ci.com/user/for-beginners), then any pushes you do will get their HEAD commit tested, and any pull requests to your tree will get their merge commits tested.	2015-12-01 11:08:57 -08:00
Eric Anholt	4922a3ae52	mesa: Drop the blacklisting of new GL enums. Now when people need new extensions, they can skip the entire enum-definition process, and we can stop reviewing new extension XML for its enum content. This also brings in a new enum that I wanted to use in enum_strings.cpp for testing the code generator. v2: Drop comment about disabled GL_1PASS_EXT test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:42 -08:00
Eric Anholt	b65e44f55d	mesa: Use a 32-bit offset for the enums.c string offset table. With GLES 3.1, GL 4.5, and many new vendor extensions about to get their enums added, we jump up to 85k of table. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:41 -08:00
Eric Anholt	c75cfe1c8a	mesa: Prefer newer names to older ones among names present in core. Sometimes GL likes to rename an old enum when it grows a more general purpose, and we should prefer the new name. Changes from this: GL_POINT/LINE_SIZE_* (1.1) -> GL_SMOOTH_POINT/LINE_SIZE_* (1.2) GL_FOG_COORDINATE_* (1.4) -> GL_FOG_COORD_* (1.5) GL_SOURCE[012]_RGB/ALPHA (1.3) -> GL_SRC0_RGB (1.5) GL_COPY_READ/WRITE_BUFFER (3.1) -> GL_COPY_READ_BUFFER_BINDING (4.2) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:38 -08:00
Eric Anholt	710762b64a	mesa: Drop bitfield "enums" from the enum-to-string table. Asking the table for bitfield names doesn't make any sense. For 0x10, do you want GL_GLYPH_HORIZONTAL_BEARING_ADVANCE_BIT_NV or GL_COLOR_BUFFER_BIT4_QCOM or GL_POLYGON_STIPPLE_BIT or GL_SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV? Giving a useful answer would depend on a whole lot of context. This also fixes a bad enum table entry, where we chose GL_HINT_BIT instead of GL_ABGR_EXT for 0x8000, so we can now fix its entry in the enum_strings test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:36 -08:00
Eric Anholt	cbabf5f9dc	mesa: Switch to using the Khronos registry for generating enums. I've used a bunch of python code to cut out new enums so that the two generated files can be diffed. I'll remove all that hardcoding in the following commits. All remaining differences between the generated code: - GL_TEXTURE_BUFFER_FORMAT didn't appear in GL3 when TBOs got merged to core, so it now gets an _ARB suffix instead. - Blacklisting can't keep EXT_sso's GL_ACTIVE_PROGRAM_EXT from becoming GL_ACTIVE_PROGRAM -- in our hash table, GL_ACTIVE_PROGRAM_EXT points at the GLES2 enum's value (aka GL_CURRENT_PROGRAM). By not blacklisting the core name, we get both enums translated. - GL_DRAW_FRAMEBUFFER_BINDING and GL_FRAMEBUFFER_BINDING both appeared in GL3 as synonyms, and the new code happens to choose GL_FRAMEBUFFER_BINDING instead. - GL_TEXTURE_COMPONENTS and GL_TEXTURE_INTERNAL_FORMAT both appear in 1.1, and the new code chooses GL_TEXTURE_INTERNAL_FORMAT instead (which seems better, to me) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:34 -08:00
Eric Anholt	f72923aaea	mesa: Remove the python mode bits from gl_enums.py. emacs whines at me every time I open the file about these unsafe variables, and the file was reformatted from 8 space to 4 space long ago. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:31 -08:00
Eric Anholt	741f642a6f	mesa: Drop apparently typoed GL_ALL_CLIENT_ATTRIB_BITS. GL_ALL_ATTRIB_BITS is a thing, and GL_CLIENT_ALL_ATTRIB_BITS, but I don't see GL_ALL_CLIENT_ATTRIB_BITS in my grepping of khronos XML, GL extension specs, GL 1.1, GL 2.2, and GL 4.4. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-01 10:24:22 -08:00
Eric Anholt	5cb9dc45c7	mesa: Drop enums that had been removed in later revs of specs. Mesa hasn't been using these enums and the finalized specs don't reference them, so losing them from our generated enum-to-string code should be fine. Reduces diffs to generating from Khronos XML, which has these enums noted defined but commented out from any consumers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:18 -08:00
Eric Anholt	5a7e5d8bb6	mesa: Fix a typo in AMD_performance_monitor enum. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:16 -08:00
Eric Anholt	bfc64b9688	mesa: Fix enum definition of CULL_VERTEX_EYE/OBJECT_POSITION In converting to using the Khronos XML, I found that our XML had these two swapped, and the text spec agreed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:15 -08:00
Eric Anholt	76ec0b9038	mesa: Add a copy of the Khronos gl.xml (SVN #31705 ). The intention here is to keep a pristine copy of the upstream gl.xml that can be updated at any time with a new version, and use that to generate Mesa code from instead of our private XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:13 -08:00
Eric Anholt	edc8850436	mesa: Cut enum_strings.cpp test down to a few hand-chosen enums. The previous contents appeared to be the output of some form of code generation for all enums, with a few entries hand-edited to deal with oddness. The downside to this was that when an enum gets promoted from vendor to _EXT or _EXT to _ARB or _ARB to core, make check starts failing even when the committer has done nothing wrong. Instead of black-box testing the code generation, pick a few enums that intentionally poke the interesting cases of code generation. People editing the code generator should be diffing the generated code anyway. This should catch when they fail to do so, without throwing false negatives when people update the GL XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:02 -08:00
Tom Stellard	9adbb9e713	clover: Handle NULL devices returned by pipe_loader_probe() v2 When probing for devices, clover will call pipe_loader_probe() twice. The first time to retrieve the number of devices, and then second time to retrieve the device structures. We currently assume that the return value of both calls will be the same, but this will not be the case if a device happens to disappear between the two calls. When a device disappears, the pipe_loader_probe() will add a NULL device to the device list, so we need to handle this. v2: - Keep range for loop Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Emil Velikov <emil.l.velikov@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2015-12-01 16:00:54 +00:00
Jonathan Gray	99cd600835	automake: fix some occurrences of hardcoded -ldl and -lpthread Correct some occurrences of -ldl and -lpthread to use $(DLOPEN_LIBS) and $(PTHREAD_LIBS) respectively. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-01 16:53:40 +00:00
Iago Toral Quiroga	241f15ac80	glsl/lower_ubo_reference: split struct copies into element copies Improves register pressure, since otherwise we end up emitting loads for all the elements in the RHS and then emitting stores for all elements in the LHS. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-01 13:30:42 +01:00
Iago Toral Quiroga	867c436ca8	glsl/lower_ubo_reference: split array copies into element copies Improves register pressure, since otherwise we end up emitting loads for all the elements in the RHS and then emitting stores for all elements in the LHS. v2: - Mark progress properly. This also fixes some instances where the added nodes with individual element copies were not being lowered, which is expected behavior as explained in the documentation for visit_list_elements. - Only need to do this if the RHS is a buffer-backed variable. - We can also have arrays inside structs. A later patch will make it so we also split struct copies and end up with multiple ir_dereference_record assignments, so make sure that if any of these is an array copy, we also split it. Fixes the following piglit tests: tests/spec/arb_shader_storage_buffer_object/execution/large-field-copy.shader_test tests/spec/arb_shader_storage_buffer_object/linker/copy-large-array.shader_test Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-01 13:29:57 +01:00
Julien Isorce	e483cba9f5	st/va: also retrieve reference frames info for h264 Hardware other than AMD requires parsing: VAPictureParameterBufferH264.ReferenceFrames[16] Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-01 08:21:37 +00:00
Julien Isorce	b4fb6d7616	st/va: delay decoder creation until max_references is known In general max_references cannot be based on num_render_targets. This patch allows allocating buffers with an accurate size, i.e. no more than necessary. For other codecs it is a fixed value 2. This is similar behaviour to vaapi/vdpau-driver. For now the HEVC case defaults to num_render_targets as before. But it could also benefit from this change by setting a more accurate max_references number in handlePictureParameterBuffer. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-01 08:21:20 +00:00
Iago Toral Quiroga	750393ff7d	glsl/dead_builtin_varyings: Fix gl_FragData array lowering The current implementation looks for array dereferences on gl_FragData and immediately proceeds to lower them, however this is not enough because we can have array access on vector variables too, like in this code: out vec4 color; void main() { int i; for (i = 0; i < 4; i++) color[i] = 1.0; } Fix it by making sure that the actual variable being dereferenced is an array. Fixes a crash in: spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 08:30:52 +01:00
Dave Airlie	4f34722575	r600: workaround empty geom shader. We need to emit at least one cut/emit in every geometry shader, the easiest workaround is to stick a single CUT at the top of each geom shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:58:43 +10:00
Dave Airlie	04efcc6c7a	r600: rv670 use at least 16es/gs threads This is specified in the docs for rv670 to work properly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:58:34 +10:00
Dave Airlie	8168dfdd4e	r600: geometry shader gsvs itemsize workaround On some chips the GSVS itemsize needs to be aligned to a cacheline size. This only applies to some of the r600 family chips. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:57:55 +10:00
Gregory Hainaut	2ab9cd0c4d	glsl: don't sort varying in separate shader mode This fixes an issue where the addition of the FLAT qualifier in varying_matches::record() can break the expected varying order. It also avoids a future issue with the relaxing of interpolation qualifier matching constraints in GLSL 4.50. V2: (by Timothy Arceri) * reworked comment slightly Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:46:37 +11:00
Gregory Hainaut	8117f46f49	glsl: don't dead code remove SSO varyings marked as active GL_ARB_separate_shader_objects allows matching by variable name or block interface. Input varyings can't be removed because it will impact the location assignment. This fixes the bug 79783 and likely any application that uses GL_ARB_separate_shader_objects extension. V2 (by Timothy Arceri): * simplify now that builtins are not set as always active Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> https://bugs.freedesktop.org/show_bug.cgi?id=79783	2015-12-01 12:46:32 +11:00
Gregory Hainaut	618612f867	glsl: add always_active_io attribute to ir_variable The value will be set in separate-shader program when an input/output must remain active. e.g. when deadcode removal isn't allowed because it will create interface location/name-matching mismatch. v3: * Rename the attribute * Use ir_variable directly instead of ir_variable_refcount_visitor * Move the foreach IR code in the linker file v4: * Fix variable name in assert v5 (by Timothy Arceri): * Rename functions and reword comments * Don't set always active on builtins Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:46:26 +11:00
Timothy Arceri	76c09c1792	glsl: copy how_declared when lowering interface blocks Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:45:07 +11:00
Timothy Arceri	12ba6cfba7	glsl: optimise inputs/outputs with explicit locations This change allows user defined inputs/outputs with explicit locations to be removed if they are detected to not be used between shaders at link time. To enable this we change the is_unmatched_generic_inout field to be flagged when we have a user defined varying. Previously explicit_location was assumed to be set only in builtins however SSO allows the user to set an explicit location. We then add a function to match explicit locations between shaders. V2: call match_explicit_outputs_to_inputs() after is_unmatched_generic_inout has been initialised. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:45:03 +11:00
Jason Ekstrand	4ab9391fbb	vk/0.210.0: Rework dynamic states	2015-11-30 14:19:41 -08:00
Dave Airlie	4d64459a92	r600/shader: split address get out to a function. This will be used in the tess shaders. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 08:10:21 +10:00
Jason Ekstrand	73ef7d47d2	vk/0.210.0: Rework color blending enums	2015-11-30 13:49:28 -08:00
Jason Ekstrand	2c77b0cd01	gen7/8/cmd_buffer: Inline vk_to_gen_swizzle It's currently unused on IVB so we get compiler warnings.	2015-11-30 13:29:51 -08:00
Jason Ekstrand	9b1cb8fdbc	vk/0.210.0: Rework a few raster/input enums	2015-11-30 13:28:17 -08:00
Jason Ekstrand	a53f23d93f	vk/0.210.0: Rework texture view component mapping	2015-11-30 13:06:12 -08:00
Jason Ekstrand	f1a7c7841f	vk/0.210.0: Switch to the new VKAPI function decorations While we're at it, we do a bunch of the VkResult -> void updates	2015-11-30 12:46:30 -08:00
Jason Ekstrand	a89a485e79	vk/0.210.0: Rename CmdBuffer to CommandBuffer	2015-11-30 11:48:08 -08:00
Jason Ekstrand	6a8a542610	vk/0.210.0: A pile of minor enum updates	2015-11-30 11:12:44 -08:00
Jason Ekstrand	3db43e8f3e	vk/0.210.0: Switch to the new-style handle declarations	2015-11-30 10:58:02 -08:00
Jason Ekstrand	5cb57806b2	vk: Add canonical 0.170.2 and 0.210.0 headers This is in preparation for the API update	2015-11-30 10:24:35 -08:00
Marta Lofstedt	44944a66ce	doc: Set GL_OES_geometry_shader as started Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-11-30 10:47:21 +01:00
Marta Lofstedt	1d5b88e33b	gles2: Update gl2ext.h to revision: 32120 This is needed to be able to implement the accepted OES extensions. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-30 10:46:15 +01:00
Julien Isorce	10c14919c8	vl/buffers: fixes vl_video_buffer_formats for RGBX Fixes: `42a5e143a8` "vl/buffers: add RGBX and BGRX to the supported formats" Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-30 09:02:29 +00:00
Samuel Iglesias Gonsálvez	a348fe89af	i965/fs: remove unused fs_reg offset Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-11-30 10:00:40 +01:00
Kenneth Graunke	83dedb6354	i965: Add src/dst interference for certain instructions with hazards. When working on tessellation shaders, I created some vec4 virtual opcodes for creating message headers through a sequence like: mov(8) g7<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; mov(1) g7.5<1>UD 0x00000100UD { align1 WE_all }; mov(1) g7<1>UD g0<0,1,0>UD { align1 WE_all compacted }; mov(1) g7.3<1>UD g8<0,1,0>UD { align1 WE_all }; This is done in the generator since the vec4 backend can't handle align1 regioning. From the visitor's point of view, this is a single opcode: hs_set_output_urb_offsets vgrf7.0:UD, 1U, vgrf8.xxxx:UD Normally, there's no hazard between sources and destinations - an instruction (naturally) reads its sources, then writes the result to the destination. However, when the virtual instruction generates multiple hardware instructions, we can get into trouble. In the above example, if the register allocator assigned vgrf7 and vgrf8 to the same hardware register, then we'd clobber the source with 0 in the first instruction, and read back the wrong value in the last one. It occurred to me that this is exactly the same problem we have with SIMD16 instructions that use W/UW or B/UB types with 0 stride. The hardware implicitly decodes them as two SIMD8 instructions, and with the overlapping regions, the first would clobber the second. Previously, we handled that by incrementing the live range end IP by 1, which works, but is excessive: the next instruction doesn't actually care about that. It might also be the end of control flow. This might keep values alive too long. What we really want is to say "my source and destinations interfere". This patch creates new infrastructure for doing just that, and teaches the register allocator to add interference when there's a hazard. For my vec4 case, we can determine this by switching on opcodes. For the SIMD16 case, we just move the existing code there.
I audited our existing virtual opcodes that generate multiple instructions; I believe FS_OPCODE_PACK_HALF_2x16_SPLIT needs this treatment as well, but no others. v2: Rebased by mattst88. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-30 00:34:07 -08:00
Kenneth Graunke	1ac1581f38	i965: Fix JIP to properly skip over unrelated control flow. We've apparently always been botching JIP for sequences such as: do cmp.f0.0 ... (+f0.0) break ... if ... else ... endif ... while Normally, UIP is supposed to point to the final destination of the jump, while in nested control flow, JIP is supposed to point to the end of the current nesting level. It essentially bounces out of the current nested control flow, to an instruction that has a JIP which bounces out another level, and so on. In the above example, when setting JIP for the BREAK, we call brw_find_next_block_end(), which begins a search after the BREAK for the next ENDIF, ELSE, WHILE, or HALT. It ignores the IF and finds the ELSE, setting JIP there. This makes no sense at all. The break is supposed to skip over the whole if/else/endif block entirely. They have a sibling relationship, not a nesting relationship. This patch fixes brw_find_next_block_end() to track depth as it does its search, and ignore anything not at depth 0. So when it sees the IF, it ignores everything until after the ENDIF. That way, it finds the end of the right block. I noticed this while reading some assembly code. We believe jumping earlier is harmless, but makes the EU walk through a bunch of disabled instructions for no reason. I noticed that GLBenchmark Manhattan had a shader that contained a BREAK with a bogus JIP, but didn't measure any performance improvement (it's likely minuscule, if there is any). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-30 00:27:16 -08:00
Dave Airlie	d72299c531	r600: move per-type settings into a switch statement This will allow adding tess stuff much cleaner later. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:08:00 +10:00
Dave Airlie	58e0122d86	r600: split out common alu_writes pattern. This just splits out a common pattern into an inline function to make things cleaner to read. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:07:18 +10:00
Dave Airlie	26332ef797	r600/llvm: fix r600/llvm build Reported on irc by gryffus Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:05:42 +10:00
Dave Airlie	9eff9f6134	r600: fixes for register definitions. Forgot to add these. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:35:37 +10:00
Dave Airlie	c2e701c7ca	r600: add missing register to initial state We really should initialise HS/LS_2 and SQ_LDS_ALLOC exists on all evergreen not just cayman, so we should initialise it as well. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Dave Airlie	bcdc748fe2	r600: define registers required for tessellation This adds the defines for a bunch of registers and shader values that are required to implement tessellation. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Dave Airlie	b502bae610	r600: consolidate clip state updates Move some common code into one place, tess will also need to use this function. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Samuel Pitoiset	b8c524ff88	nv50/ir: always display the opcode number for unknown instructions This helps in debugging unknown instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-29 16:40:12 +01:00
Emil Velikov	d37ebed470	mesa: remove len argument from _mesa_shader_debug() There was only a single user which was using strlen(buf). As this function is not user facing (i.e. we don't need to feed back original length via a callback), we can simplify things. Suggested-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-29 14:41:40 +00:00
Emil Velikov	e714c971ae	drivers/x11: scons: partially revert `b9b40ef9b7` As glsl_types.{cpp,h} were moved out of the sconscript (commit `b23a4859f4` "scons: Build nir/glsl_types.cpp once.") remove the dangling includes. Cc: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	31ed3fc57d	nir: remove recursive inclusion in builtin_type_macros.h The header is already included by glsl_types.{cpp,h}. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	fc16942cf7	nir: remove unneeded include Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	b92ecdcc79	mesa/program: remove dead function declarations Dead since `5e9aa9926b` (2011) - _mesa_ir_compile_shader `69e07bdeb4` (2009) - _mesa_get_program_register Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	5d294d9fa3	auxiliary/vl/dri: fd management cleanups Analogous to previous commit, minus the extra dup. We are the one opening the device thus we can directly use the fd. Spotted by Coverity (CID 1339867, 1339877) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:41:00 +00:00
Emil Velikov	151290c154	auxiliary/vl/drm: fd management cleanups Analogous to previous commit. Spotted by Coverity (CID 1339868) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:40:26 +00:00
Emil Velikov	fe71059388	st/xa: fd management cleanups Analogous to previous commit. Spotted by Coverity (CID 1339866) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:39:51 +00:00
Emil Velikov	d90ba57c08	st/dri: fd management cleanups Add some checks if the original/dup'd fd is valid and ensure that we don't leak it on error. The former is implicitly handled within the pipe_loader, although let's make things explicit and check beforehand. Spotted by Coverity (CID 1339865) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:39:03 +00:00
Emil Velikov	5f92906b87	pipe-loader: check if winsys.name is non-null prior to strcmp In theory this wouldn't be an issue, as we'll find the correct name and break out of the loop before we hit the sentinel. Let's fix this and avoid issues in the future. Spotted by Coverity (CID 1339869, 1339870, 1339871) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:38:22 +00:00
Emil Velikov	866a1f7fdd	st/va: add missing break statement Earlier commit factored out the mpeg4 IQ matrix handling into separate function, although it forgot to add a break in its case statement. Thus the data ended up partially overwritten as the mpeg4 and h265 structs are members of the desc union. Spotted by Coverity (CID 1341052) Fixes: `64761a841d` "st/va: move MPEG4 functions into separate file" Cc: Julien Isorce <j.isorce@samsung.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-29 14:31:14 +00:00
Ilia Mirkin	0396eaaf80	mesa: support GL_RED/GL_RG in ES2 contexts when driver support exists Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93126 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-28 17:24:34 -05:00
Nicolai Hähnle	9e5e702cfb	radeon: only suspend queries on flush if they haven't been suspended yet Non-timer queries are suspended during blits. When the blits end, the queries are resumed, but this resume operation itself might run out of CS space and trigger a flush. When this happens, we must prevent a duplicate suspend during preflush suspend, and we must also prevent a duplicate resume when the CS flush returns back to the original resume operation. This fixes a regression that was introduced by: commit `8a125afa6e` Author: Nicolai Hähnle <nhaehnle@gmail.com> Date: Wed Nov 18 18:40:22 2015 +0100 radeon: ensure that timing/profiling queries are suspended on flush The queries_suspended_for_flush flag is redundant because suspended queries are not removed from their respective linked list. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Axel Davy <axel.davy@ens.fr> Cc: "11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-28 11:08:49 +01:00
Jose Fonseca	ea3f394e4a	scons: Use LD version script for libgl-xlib. Trivial.	2015-11-27 14:14:25 +00:00
Jose Fonseca	a11955b9f9	svga: Don't return value from void function. Addresses MSVC warning C4098: 'svga_destroy_query' : 'void' function returning a value. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-27 14:14:25 +00:00
Jose Fonseca	c127e6a3ea	gallium: Make pipe_query_result::batch array length non-zero. Zero length arrays are non standard: warning C4200: nonstandard extension used : zero-sized array in struct/union Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array And all code does `N * sizeof query_result->batch[0]`, so it should work exactly the same. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-27 14:14:25 +00:00
Neil Roberts	bc2470d5d3	util: Tiny optimisation for the linear→srgb conversion When converting 0.0 it would be nice if it didn't do any arithmetic. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-11-27 10:55:22 +01:00
Eduardo Lima Mitev	27a88a947c	docs: Update GL3.txt to add ARB_internalformat_query2 Added to OpenGL 4.3 section, tagged as 'in progress (elima)'. See https://bugs.freedesktop.org/show_bug.cgi?id=92687. Thanks to Thomas H.P. Andersen for reminding me about this. v1: - Update the already existing entry in section 4.3 instead (Ilia Mirkin). - Added my BZ nickname as contact person (Felix Schwarz). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-26 23:53:16 +01:00
Timothy Arceri	c3ec12ec3c	glsl: don't generate extra errors in ValidateProgramPipeline From Section 11.1.3.11 (Validation) of the GLES 3.1 spec: "An INVALID_OPERATION error is generated by any command that transfers vertices to the GL or launches compute work if the current set of active program objects cannot be executed, for reasons including:" It then goes on to list the rules we validate in the _mesa_validate_program_pipeline() function. For ValidateProgramPipeline the only mention of generating an error is: "An INVALID_OPERATION error is generated if pipeline is not a name returned from a previous call to GenProgramPipelines or if such a name has since been deleted by DeleteProgramPipelines," Which we handle separately. This fixes: ES31-CTS.sepshaderobjs.PipelineApi No regressions on the dEQP 3.1 tests. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-27 08:44:37 +11:00
Kristian Høgsberg Kristensen	d6d82f1ab3	vk: Fix 3DSTATE_WM_DEPTH_STENCIL for gen8 This packet is a different size on gen8 and we hit an assertion when we try to merge a gen9 size dword array from the pipeline with the gen8 sized array we create from dynamic state. Use a static assert in the merge macro and fix this issue by using different wm_depth_stencil arrays on gen8 and gen9.	2015-11-26 10:11:52 -08:00
Rob Clark	57fc0dd8d5	freedreno/ir3: assign varying locations later Rather than assigning inloc up front, when we don't yet know if it will be unused, assign it last thing before the legalize pass. Also, realize when inputs are unused (since for frag shaders we can't rely on them being removed from ir->inputs[]). This doesn't make sense if we don't also dynamically assign the inloc's, since we could end up telling the hw the wrong # of varyings (since we currently assume that the # of varyings and max-inloc are related..) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	2181f2cd58	freedreno/ir3: use instr flag to mark unused instructions Rather than magic depth value, which won't be available in later stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	2fbe4e7d2f	freedreno/a4xx: rework vinterp/vpsrepl Same as previous commit, for a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	5adf4a5cda	freedreno/a3xx: rework vinterp/vpsrepl Make the interpolation / point-sprite replacement mode setup deal with varying packing. In a later commit, we switch to packing just the varying components that are actually used by the frag shader, so we won't be able to assume everything is vec4's aligned to vec4. Which would highly confuse the previous vinterp/vpsrepl logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Serge Martin	b7c958b7b7	clover: fix tgsi compiler crash with invalid src Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-26 15:30:25 +02:00
Francisco Jerez	55ffa64daf	i965/gen9+: Switch thread scratch space to non-coherent stateless access. The thread scratch space is thread-local so using the full IA-coherent stateless surface index (255 since Gen8) is unnecessary and potentially expensive. On Gen8 and early steppings of Gen9 this is not a functional change because the kernel already sets bit 4 of HDC_CHICKEN0 which overrides all HDC memory access to be non-coherent in order to work around a hardware bug. This happens to fix a full system hang when running any spilling code on a pre-production SKL GT4e machine I have on my desk (forcing all HDC access to non-coherent from the kernel up to stepping F0 might be a good idea though regardless of this patch), and improves performance of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by 33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC IA-coherency is apparently functional so it wouldn't make sense to disable globally). Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Francisco Jerez	bc8182808a	i965/fs: Don't use Gen7-style scratch block reads on Gen9+. Unfortunately Gen7 scratch block reads and writes seem to be hardwired to BTI 255 even on Gen9+ where that index causes the dataport to do an IA-coherent read or write. This change is required for the next patch to be correct, since otherwise we would be writing to the scratch space using non-coherent access and then reading it back using IA-coherent reads, which wouldn't be guaranteed to return the value previously written to the same location without introducing an additional HDC flush in between. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Francisco Jerez	3e6d0d2ca4	i965: Add symbolic defines for some magic dataport surface indices. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Nicolai Hähnle	6b5268d202	radeon: use PIPE_DRIVER_QUERY_FLAG_DONT_LIST for perfcounters Since the query names are not very enlightening, and there are thousands of them, GALLIUM_HUD=help should only show the first and last query name for each hardware block. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:44 +01:00
Nicolai Hähnle	f36d9857cd	gallium: add PIPE_DRIVER_QUERY_FLAG_DONT_LIST This allows the driver to give a hint to the HUD so that GALLIUM_HUD=help is less spammy. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:43 +01:00
Nicolai Hähnle	80a16dece6	radeon: delay the generation of driver query names until first use This shaves a bit more time off the startup of programs that don't actually use performance counters. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:43 +01:00
Julien Isorce	ca976e6900	st/va: add missing profiles in PipeToProfile's switch. Otherwise assert is raised from vlVaQueryConfigProfiles's for loop. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-26 08:21:45 +00:00
Marta Lofstedt	63b49e1711	mesa: remove ARB_geometry_shader4 No drivers currently implement ARB_geometry_shader4, nor are there any plans to implement it. We only support the version of geometry shaders that was incorporated into OpenGL 3.2 / GLSL 1.50. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-26 08:40:46 +01:00
Kristian Høgsberg Kristensen	cd4721c062	vk: Add SKL support Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-11-25 22:34:10 -08:00
Tapani Pälli	c2e146f487	mesa: error out in indirect draw when vertex bindings mismatch Patch adds additional mask for tracking which vertex arrays have associated vertex buffer binding set. This array can be directly compared to which vertex arrays are enabled and should match when drawing. Fixes following CTS tests: ES31-CTS.draw_indirect.negative-noVBO-arrays ES31-CTS.draw_indirect.negative-noVBO-elements v2: update mask in vertex_array_attrib_binding v3: rename mask and make it track _BoundArrays which matches what was actually originally wanted (Fredrik Höglund) v4: code cleanup, check for GLES 3.1 (Fredrik Höglund) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-11-26 08:01:31 +02:00
Kristian Høgsberg Kristensen	c445fa2f77	vk: Make entrypoint generator output gen9 entry points Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-11-25 20:58:25 -08:00
Kristian Høgsberg Kristensen	0e02a88ad4	vk: Add GEN9 pack header	2015-11-25 20:56:41 -08:00
Michel Dänzer	22d2dda03b	targets/xvmc: use the non-inline sw helpers This was missed in commit `59cfb21d` ("targets: use the non-inline sw helpers"). Fixes build failure: CXXLD libXvMCgallium.la ../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader_sw.o):(.data.rel.ro+0x0): undefined reference to `sw_screen_create' collect2: error: ld returned 1 exit status Makefile:756: recipe for target 'libXvMCgallium.la' failed make[3]: *** [libXvMCgallium.la] Error 1 Trivial.	2015-11-26 12:14:28 +09:00
Kristian Høgsberg Kristensen	0c59cb42b5	vk: Move all gen8 files to gen8 lib	2015-11-25 14:13:53 -08:00
Emil Velikov	72c33f0dd5	targets/nine: remove freedreno target Analogous to previous commit. As we no longer have anyone who uses NIR we can drop the link. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com>	2015-11-25 20:29:44 +00:00
Emil Velikov	aa335bb01b	targets/nine: remove vc4 target There are no users for it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-11-25 20:28:38 +00:00
Emil Velikov	b78259c4b5	gallium: remove unused function declarations Unused as of commit `23fb11455b` "{st,targets}/dri: use static/dynamic pipe-loader" Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-25 20:26:52 +00:00
Emil Velikov	59cfb21d46	targets: use the non-inline sw helpers Previously (with the inline ones) things were embedded into the pipe-loader, which means that we cannot control/select what we want in each target. That also meant that at runtime we ended up with the empty sw_screen_create() as the GALLIUM_SOFTPIPE/LLVMPIPE were not set. v2: Cover all the targets, not just dri. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2015-11-25 20:25:29 +00:00
Emil Velikov	fbc6447c3d	target-helpers: add non inline sw helpers Feeling rather dirty copying the inline ones, yet we need the inline ones for swrast only targets like libgl-xlib, osmesa. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2015-11-25 20:25:14 +00:00
Emil Velikov	f623517188	pipe-loader: fix off-by-one error With earlier commit we've dropped the manual iteration over the fixed size array and preemptively set the variable storing the size, that is to be returned. Yet we forgot to adjust the comparison, as before we were comparing the index, now we're comparing the size. Fixes: `ff9cd8a67c` "pipe-loader: directly use pipe_loader_sw_probe_null() at probe time" Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93091 Reported-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2015-11-25 20:22:35 +00:00
Emil Velikov	0572e5fea5	nir: include what we want/need Swap core.h with macros.h, as the latter provides the required MAX2 macro. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-25 20:19:47 +00:00
Kenneth Graunke	3810c15614	i965: Fix scalar vertex shader struct outputs. While we correctly set output[] for composite varyings, we set completely bogus values for output_components[], making emit_urb_writes() output zeros instead of the actual values. Unfortunately, our simple approach goes out the window, and we need to recurse into structs to get the proper value of vector_elements for each field. Together with the previous patch, this fixes rendering in an upcoming game from Feral Interactive. v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-25 11:47:47 -08:00
Kenneth Graunke	3e9003e9cf	i965: Fix fragment shader struct inputs. Apparently we have literally no support for FS varying struct inputs. This is somewhat surprising, given that we've had tests for that very feature that have been passing for a long time. Normally, varying packing splits up structures for us, so we don't see them in the backend. However, with SSO, varying packing isn't around to save us, and we get actual structs that we have to handle. This patch changes fs_visitor::emit_general_interpolation() to work recursively, properly handling nested structs/arrays/and so on. (It's easier to read with diff -b, as indentation changes.) When using the vec4 VS backend, this fixes rendering in an upcoming game from Feral Interactive. (The scalar VS backend requires additional bug fixes in the next patch.) v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-25 11:47:47 -08:00
Tom Stellard	89851a2965	radeonsi/compute: Use the compiler's COMPUTE_PGM_RSRC* register values The compiler has more information and is able to optimize the bits it sets in these registers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: <mesa-stable@lists.freedesktop.org>	2015-11-25 11:03:05 -05:00
Tom Stellard	95e0510916	radeonsi: Rename si_shader::ls_rsrc{1,2} to si_shader::rsrc{1,2} In the future, these will be used by other shaders types. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 11:03:05 -05:00
Samuel Iglesias Gonsálvez	98ceb60177	docs: minimum required python mako version is 0.3.4 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-25 16:50:53 +01:00
Nicolai Hähnle	07bddff460	docs: update relnotes with AMD_performance_monitor for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:52:09 +01:00
Nicolai Hähnle	ad22006892	radeonsi: implement AMD_performance_monitor for CIK+ Expose most of the performance counter groups that are exposed by Catalyst. Ideally, the driver will work with GPUPerfStudio at some point, but we are not quite there yet. In any case, this is the reason for grouping multiple instances of hardware blocks in the way it is implemented. The counters can also be shown using the Gallium HUD. If one is interested to see how work is distributed across multiple shader engines, one can set the environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance counter groups. Part of the implementation is in radeon because an implementation for older hardware would largely follow along the same lines, but exposing a different set of blocks which are programmed slightly differently. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:52:09 +01:00
Nicolai Hähnle	b9fc01aee7	radeon: scale query buffer size to result size Performance monitor queries can become very big, especially considering that instances of a block in different shader engines are queried separately. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:09 +01:00
Nicolai Hähnle	592928065c	radeonsi/sid: add performance counter registers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:06 +01:00
Nicolai Hähnle	9823048e0b	radeonsi/sid: add hardware constants for COPY_DATA packet Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:03 +01:00
Nicolai Hähnle	1aa3b48c12	radeon: extend CIK_UCONFIG_REG_END for performance counters Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:00 +01:00
Nicolai Hähnle	b589e18a98	radeon: add perfcounter-related EVENT_TYPEs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:27:56 +01:00
Nicolai Hähnle	30462b1826	radeon: additional constants for WAIT_REG_MEM and EVENT_WRITE_EOP Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:27:34 +01:00
Nicolai Hähnle	bfddd005ea	st/mesa: remove outdated comment The enable of AMD_performance_monitor is no longer related to whether queries are run by the GPU since the commit mentioned below. Suggested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> commit `ddf27a3dd0` Author: Nicolai Hähnle <nhaehnle@gmail.com> Date: Tue Nov 10 13:35:01 2015 +0100 gallium: remove pipe_driver_query_group_info field type	2015-11-25 15:27:34 +01:00
Nicolai Hähnle	babf655ab2	st/mesa: delay initialization of performance counters Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-25 15:27:33 +01:00
Nicolai Hähnle	27a06e0bbe	mesa/main: allow delayed initialization of performance monitors Most applications never use performance counters, so allow drivers to skip potentially expensive initialization steps. A driver that wants to use this must enable the appropriate extension(s) at context initialization and set the InitPerfMonitorGroups driver function which will be called the first time information about the performance monitor groups is actually used. The init_groups helper is called for API functions that can be called before a monitor object exists. Functions that require an existing monitor object can rely on init_groups having been called before. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-25 15:27:33 +01:00
Tapani Pälli	315c4c315e	glsl: handle case where index is array deref in optimize_split_arrays Previously pass did not traverse to those array dereferences which were used as indices to arrays. This fixes Synmark2 Gl42CSCloth application issues. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 11:25:57 +02:00
Julien Isorce	63c344d179	nouveau: move interlaced assert down in nouveau_vp3_video_buffer_create templat->interlaced is 0 if not NV12 which is the case currently when using VPP. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-25 08:17:39 +00:00
Iago Toral Quiroga	2bba2152e4	i965: remove trailing spaces in various files Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-25 08:12:08 +01:00
Iago Toral Quiroga	1af0d9d939	glsl: remove trailing spaces in various files Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-25 08:09:17 +01:00
Matt Turner	f1b7fefd4e	i965: Pass brw_context pointer, not gl_context pointer. Fixes a warning introduced by commit `dcadd855`.	2015-11-24 21:27:57 -08:00
Timothy Arceri	7436d7c33b	glsl: only call dead code pass when new inputs/outputs demoted This will help avoid eliminating inputs/outputs needed by SSOs. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 09:50:13 +11:00
Timothy Arceri	404ac4bf9e	glsl: move and reuse code to find first and last shaders Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 09:49:48 +11:00
Matt Turner	0ce370a84b	mesa: Use unreachable() instead of a default case. (And add an unreachable() in one place that didn't have a default case)	2015-11-24 13:27:20 -08:00
Ian Romanick	47b3a0d235	meta: Don't save or restore the active client texture This setting is only used by glTexCoordPointer and related glEnable calls. Since the preceding commits removed all of those, it is not necessary to save, reset to default, or restore this state. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	c63f9c735d	meta: Don't save or restore the VBO binding Nothing left in meta does anything with the VBO binding, so we don't need to save or restore it. The VAO binding is still modified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	58aa56d40b	meta/TexSubImage: Don't pollute the buffer object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	76cfe2bc44	meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	a222d4cbc3	meta: Use internal functions for buffer object and VAO access in _mesa_meta_DrawTex Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	b8a7369fb7	meta: Track VBO using gl_buffer_object instead of GL API object handle in _mesa_meta_DrawTex Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	d5225ee5d9	meta: Partially convert _mesa_meta_DrawTex to DSA Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	37d11b13ce	meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	b1b73a42c8	meta: Use internal functions for buffer object and VAO access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	52921f8e08	meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objects The fixed-function attribute paths don't get the DSA treatment because there are no DSA entry-points for fixed-function attributes. These could have been added, but this is a temporary patch intended to make later patches easier to review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	1035e00a81	meta: Track VBO using gl_buffer_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	3b5a7d450d	meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objects Meta currently does this, but future changes will make this impossible. Explicitly do it as a step in the patch series now to catch any possible kinks. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	ed0bd6573b	i965: Use _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	7f2f300071	meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	89a61afdd7	meta: Use DSA functions for PBO in create_texture_for_pbo Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	4e6b9c11fc	i965: Don't pollute the buffer object namespace in brw_meta_fast_clear tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	e62799bd4e	i965: Use internal functions for buffer object access Instead of going through the GL API implementation functions, use the lower-level functions. This means that we have to keep track of a pointer to the gl_buffer_object and the gl_vertex_array_object. This has two advantages. First, it avoids a bunch of CPU overhead in looking up objects and validating API parameters. Second, and much more importantly, it will allow us to stop calling _mesa_GenBuffers / _mesa_CreateBuffers and polluting the buffer namespace (next patch). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	1c5423d3a0	i965: Use DSA functions for VBOs in brw_meta_fast_clear Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	dcadd855f1	i965: Pass brw_context instead of gl_context to brw_draw_rectlist Future patches will use the brw_context instead. Keeping this non-functional change separate should make the function changes easier to review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	4a644f1caa	mesa: Refactor enable_vertex_array_attrib to make _mesa_enable_vertex_array_attrib Pulls the parts of enable_vertex_array_attrib that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). _mesa_enable_vertex_array_attrib can also be used to enable fixed-function arrays. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	a336fcd36a	mesa: Refactor update_array_format to make _mesa_update_array_format_public Pulls the parts of update_array_format that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:28 -08:00
Ian Romanick	8fae494df2	mesa: Make bind_vertex_buffer available outside varray.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:28 -08:00
Kenneth Graunke	03d6949630	Revert "i965: Combine assembly annotations if possible." This reverts commit `a280e83d71`. It breaks INTEL_DEBUG=fs output. For example, glsl-fs-discard-01.shader_test has 11 instructions but only prints 5. Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-24 10:21:37 -08:00
Matt Turner	5369efe311	glsl: Pass ast_type_qualifier by const reference. Coverity noticed that we were passing this by value, and it's 152 bytes. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-24 10:05:33 -08:00
Matt Turner	f36993b469	i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	1eb11e64b3	i965: Move brw_new_shader and brw_link_shader prototypes from brw_wm.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6ba700c3c3	i965: Compile brw_cs_fill_local_id_payload() as C. It's only called from C, it compiles as C, so just compile it as C. Notice the missing extern "C" on the definition of the function, which would screw things up if the prototype wasn't parsed before the definition. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6b525d9f2b	i965: Move MRF macros from brw_inst.h to brw_eu.h. brw_inst.h is only for the brw_inst/brw_compact_inst functions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	76732932ec	i965: Drop #include of main/glheader.h. It's never used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	ecac1aab53	i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	64cc7572c1	i965: Mark functions called from C as extern "C". These functions' prototypes are marked with extern "C", which apparently overrides a lack of extern "C" at the definition site if the prototype has been seen first. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	fb86f0e75a	i965: Push down inclusion of vbo/vbo.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6fe9ea78fa	i965: Remove duplicate #includes. Added in commits `36fd65381` and `337dad8ce` even though the existing include was in view. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	c06f3d5d54	i965: Remove unneeded forward declarations. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	e768c498bf	i965: Mark count_trailing_one_bits() static. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	836aaa4394	i965: Remove useless gen6_blorp.h/gen7_blorp.h headers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	d956335a0b	util: Include assert.h in macros.h.	2015-11-24 10:05:32 -08:00
Matt Turner	fafbf994cf	util: Include <stdbool.h> in debug.h.	2015-11-24 10:05:32 -08:00
Matt Turner	2d8c529903	i965: Prevent implicit upcasts to brw_reg. Now that backend_reg inherits from brw_reg, we have to be careful to avoid the object slicing problem. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Matt Turner	799f924073	i965: Use scope operator to ensure brw_reg is interpreted as a type. In the next patch, I make backend_reg's inheritance from brw_reg private, which confuses clang when it sees the type "struct brw_reg" in the derived class constructors, thinking it is referring to the privately inherited brw_reg: brw_fs.cpp:366:23: error: 'brw_reg' is a private member of 'brw_reg' fs_reg::fs_reg(struct brw_reg reg) : ^ brw_shader.h:39:22: note: constrained by private inheritance here struct backend_reg : private brw_reg ^~~~~~~~~~~~~~~ brw_reg.h:232:8: note: member is declared here struct brw_reg { ^ Avoid this by marking brw_reg with the scope resolution operator.	2015-11-24 09:58:33 -08:00
Matt Turner	f093c842e6	i965: Use implicit backend_reg copy-constructor. In order to do this, we have to change the signature of the backend_reg(brw_reg) constructor to take a reference to a brw_reg in order to avoid unresolvable ambiguity about which constructor is actually being called in the other modifications in this patch. As far as I understand it, the rule in C++ is that if multiple constructors are available for parent classes, the one closest to you in the class hierarchy is chosen, but if one of them didn't take a reference, that screws things up. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Matt Turner	309a44d63c	i965: Add and use backend_reg::equals(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Roland Scheidegger	6c6a439e98	softpipe/llvmpipe: don't advertize support for ASTC `3333977556` added support for ASTC textures to gallium. They don't have any helpers hooked up for software decoding, however, so cannot support them in drivers relying on util code for decoding. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-24 18:26:11 +01:00
Roland Scheidegger	97eed8dcb9	llvmpipe: don't test for unsupported formats in lp_test_format Removing the fake format helpers (`1c7d0a6aa4`) caused this to fail. These formats were never supported, but previously they would have asserted in the generated jit functions (which, due to lack of test cases for these formats, were never called) whereas we now assert when trying to build the jit function. So, skip them completely. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=93092 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-24 18:26:11 +01:00
Ian Romanick	9b41489cb5	docs: add missed i965 feature to relnotes Trivial. GL_ARB_fragment_layer_viewport support was added in `8c902a58` by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 09:03:39 -08:00
Rob Clark	d278e31459	util: move brw_env_var_as_boolean() to util Kind of a handy function. And I'll want it available outside of i965 for common nir-pass helpers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>	2015-11-24 10:02:55 -05:00
Christian König	d3e2c48dfa	st/va: fix indentation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:48 +01:00
Christian König	64761a841d	st/va: move MPEG4 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:45 +01:00
Christian König	9fe7924328	st/va: move VC-1 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:41 +01:00
Christian König	da173344a6	st/va: move H264 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:38 +01:00
Christian König	c9cb22392b	st/va: move MPEG12 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:35 +01:00
Christian König	ec6ef1cbfe	st/va: move post processing function into own file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:31 +01:00
Christian König	3d6386fdc5	st/va: fix post process dirty area handling The dirty area in this call isn't related to the screen at all. v2: set clear dirty area to false as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:11 +01:00
Timothy Arceri	2571a768d6	glsl: implement recent spec update to SSO validation Enables 200+ dEQP SSO tests to proceed past validation, and fixes a ES31-CTS.sepshaderobjs.PipelineApi subtest. V2: split out change that reverts a previous patch into its own commit, move variable declaration to top of function, and fix some formatting all suggested by Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 20:59:48 +11:00
Timothy Arceri	3c4aa7aff2	Revert "mesa: return initial value for VALIDATE_STATUS if pipe not bound" This reverts commit `ba02f7a3b6`. The commit checked whether the pipeline was currently bound instead of checking whether it had ever been bound. The previous setting of Validated during object creation makes this unnecessary. The real problem was that Validated was not properly set to false elsewhere in the code. This is fixed by a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 20:59:44 +11:00
Michel Dänzer	d094631936	radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now llvm.exp2.f32 doesn't work in some cases yet. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-24 18:07:48 +09:00
Boyuan Zhang	f55f134a03	radeon/uvd: uv pitch separation for stoney v2: set the behaviour default for future ASICs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-23 17:34:43 -05:00
Jason Ekstrand	179fc4aae8	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in nir cloning and some much-needed upstream refactors.	2015-11-23 14:03:47 -08:00
Dave Airlie	237bcdbab5	texgetimage: consolidate 1D array handling code. This should fix the getteximage-depth test that currently asserts. I was hitting a problem with virgl as well in this area. This moves the 1D array handling code to a single place. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-24 06:43:21 +10:00
Jason Ekstrand	d9b8fde963	i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:07:32 -08:00
Jason Ekstrand	8537b4ab76	nir/lower_tex: Add support for lowering texture swizzle Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	6921b17107	nir: Add a tex_instr_is_query helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	7e83fd85aa	nir: Add a ssa_def_rewrite_uses_after helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	384396a69b	nir: Use instr/if_rewrite in nir_ssa_def_rewrite_uses nir_ssa_def_rewrite_uses is one of the older helpers in NIR and predated both of those. Now it can be substantially simplified. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	03c9ad900e	nir/validate: Validated dests after sources Previously, if someone accidentally made an instruction that refers to its own SSA destination, the validator wouldn't catch it. The reason for this is that it validated the destination too early and, by the time it got to the source, the destination SSA value was already added to the set of seen SSA values so it would assume that it came from some previous instruction. By moving destination validation to be after source validation, the SSA value is not in the list of seen values and the validator will catch self-referential instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	6c8ba59cff	i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	d065a93a3f	i965/fs: Stomp the texture return type to UINT32 for resinfo messages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	042fa75e48	nir/lower_tex: Set the dest_type for txs instructions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	1417f6a216	nir/lower_tex: Report progress Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	ce767bbdff	i965: Move postprocess_nir to codegen time This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	9cf108193b	i965/nir: Split shader optimization and lowering into three stages At the moment, brw_create_nir just calls the three stages in sequence so there's not much difference. Soon, however, we will want to start doing variants in NIR at which point the postprocessing step will have to move from shader create time to codegen time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	9d703de85a	i965: Use ull immediates in brw_inst_bits This fixes a regression introduced in `b1a83b5d1` that caused basically all shaders to fail to compile on 32-bit platforms. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 10:55:38 -08:00
Ilia Mirkin	e4c1221d36	docs: add missed freedreno features to relnotes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-23 12:32:54 -05:00
Ilia Mirkin	33dc9aac07	docs: update relnotes with new freedreno/a4xx support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 12:32:54 -05:00
Jose Fonseca	c9651f0264	svga: Add ASTC formats to format table. Fixes build. Otherwise untested. Trivial.	2015-11-23 16:45:28 +00:00
Ilia Mirkin	754b26e76d	freedreno/ir3: add support for a few gs5 ops Tested on a4xx. This is part of the builtins added by ARB_gpu_shader5 and GLSL ES 3.10. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:16 -05:00
Ilia Mirkin	cca8dd4e93	ttn: fix UMSB conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:16 -05:00
Ilia Mirkin	190acb34ca	freedreno/a4xx: add ARB_texture_query_lod support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f0e670bdd7	ttn: add LODQ support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	9761d5146f	freedreno/a4xx: re-emit program on dirty framebuffer The program emit depends on certain fb details. Make sure those get updated when the fb changes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	81b16350fa	freedreno/a4xx: use a factor of 32767 for snorm8 blending It appears that the hardware wants the integer to be scaled the same way that the hardware representation is. snorm16 uses one of the float factors, so this is only relevant for snorm8. This fixes a number of subcases of bin/fbo-blending-formats GL_EXT_texture_snorm Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-23 11:17:15 -05:00
Ilia Mirkin	6f17f19b17	freedreno/a4xx: only compute texture offset once for the view Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f10bb0ac9e	freedreno/a4xx: add ARB_texture_view support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	1b9992b803	freedreno/a4xx: add formats for ARB_texture_buffer_object_rgb32 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f9549d0a0f	freedreno/a4xx: add ARB_texture_rgb10_a2ui support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	93905a8df1	freedreno/a4xx: add astc formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	6b21d3c92e	st/mesa: add astc support This doesn't account for the ldr/hdr distinction... that will probably have to be exposed via a separate cap. When relevant hardware appears, this can be worked out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	3333977556	gallium: add ASTC formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	1c7d0a6aa4	gallium/util: remove the fake format helpers for bptc and etc2 This was a silly hack that kept growing and growing. Instead, just write NULLs for those functions. No need to have helpers that just assert(0) when you call them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	c65bc2e805	freedreno/a4xx: support 16384 texels in buffer texture Looks like the width field's bitmask was off-by-one. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	99f12a3f1a	freedreno/a4xx: add ARB_texture_buffer_range support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	d4c40f99ab	freedreno/a4xx: add polygon mode support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Emil Velikov	b89d1b2ccf	configure.ac: default to disabled dri3 when --disable-dri is set Not too long ago, the dri3 code was living in src/glx, which in itself was guarded by HAVE_DRI_GLX. As the name suggests we didn't dive into the folder when dri was disabled, thus we missed that dri3 does not consider/honour --enable-dri. Cc: mesa-stable@lists.freedesktop.org Fixes: `6bd9ba7d07` "loader: Add dri3 helper" Cc: Pali Rohár <pali.rohar@gmail.com> Reported-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-23 12:08:04 +00:00
Emil Velikov	b9b0a1f58e	loader: unconditionally add AM_CPPFLAGS to libloader_la_CPPFLAGS It seems that due to the conditional autotools is getting confused and forgetting to add AM_CPPFLAGS when building libloader (when HAVE_DRICOMMON is not set). Cc: mesa-stable@lists.freedesktop.org Fixes: `5a79e0a8e3` "automake: loader: rework the CPPFLAGS" Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 12:07:50 +00:00
Emil Velikov	8a6d476588	pipe-loader: link against libloader regardless of libdrm presence Whether or not the loader has libdrm support is up to it. Anyone using the loader should just include it whenever they depend on it. Cc: mesa-stable@lists.freedesktop.org Fixes: `0f39f9cb7a` "pipe-loader: add a dummy 'static' pipe-loader" Reported-by: Jon TURNEY <jon.turney@dronecode.org.uk> Tested-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-23 12:07:09 +00:00
Neil Roberts	2010de4015	i965: Handle lum, intensity and missing components in the fast clear It looks like the sampler hardware doesn't take into account the surface format when sampling a cleared color after a fast clear has been done. So for example if you clear a GL_RED surface to 1,1,1,1 then the sampling instructions will return 1,1,1,1 instead of 1,0,0,1. This patch makes it override the color that is programmed in the surface state in order to swizzle for luminance and intensity as well as overriding the missing components. Fixes the ext_framebuffer_multisample-fast-clear Piglit test. v2: Handle luminance and intensity formats Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-11-23 10:44:01 +01:00
Jason Ekstrand	f58813842b	nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:36:12 +01:00
Connor Abbott	fb93dd7aa8	nir/builder: only read meaningful channels in nir_swizzle() This way the caller doesn't have to initialize all 4 channels when they aren't using them. v2: Fix signed/unsigned comparison warning (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:36:12 +01:00
Connor Abbott	d982922b18	i965/fs: add stride restrictions for copy propagation There are various restrictions on what the hstride can be that depend on the Gen, and now that we're using hstride == 2 for packing/unpacking doubles, we're going to run into these restrictions a lot more often. Pull them out into a separate function, and move the one restriction we checked previously into it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Connor Abbott	95ac3b1dae	i965/fs: don't propagate cmod when the exec sizes differ This can happen when the source of the compare was split by the SIMD lowering pass. Potentially, we could allow the case where the exec size of scan_inst is larger, and scan_inst has the right quarter selected, but doing that seems a little more risky. v2: Merge the bail condition into the previous if/break block (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Connor Abbott	70171a9c89	i965/fs: respect force_sechalf/force_writemask_all in CSE Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 08:30:30 +01:00
Connor Abbott	b1a83b5d1b	i965: fix 64-bit immediates in brw_inst(_set)_bits If we tried to get/set something that was exactly 64 bits, we would try to do (1 << 64) - 1 to calculate the mask which doesn't give us all 1's like we want. v2 (Iago) - Replace ~0 by ~0ull - Removed unnecessary parenthesis v3 (Kristian) - Avoid the conditional Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-23 08:30:30 +01:00
Connor Abbott	718b9f52dd	i965/fs: print non-1 strides when dumping instructions v2: - Simplify code (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Ilia Mirkin	4deb118d06	nv50/ir: fix (un)spilling of 3-wide results There are no 96-bit load/store operations, so we have to split it up into 32-bit parts, with a split/merge around it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90348 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 23:27:22 -05:00
Timothy Arceri	6463d36394	glsl: fix max binding validation for uniform blocks Regression as of `64710db664` We can't use the type returned by get_interface_type() as the interface type has arrays removed. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-23 13:47:19 +11:00
Ilia Mirkin	ad5f6b03e7	nv50,nvc0: properly handle buffer storage invalidation on dsa buffer In case that the buffer has no bind at all, assume it can be a regular buffer. This can happen on buffers created through the ARB_dsa interfaces. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 21:08:16 -05:00
Ilia Mirkin	079f713754	nouveau: use the buffer usage to determine placement when no binding With ARB_direct_state_access, buffers can be created without any binding hints at all. We still need to allocate these buffers to VRAM or GART, as we don't have logic down the line to place them into GPU-mappable space. Ideally we'd be able to shift these things around based on usage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92438 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 20:58:56 -05:00
Eric Anholt	1b62a4e885	vc4: Take precedence over ilo when in simulator mode. They're exclusive at build time, but the ilo entry is always present, so we'd try to use it and fail out. v2: Add comment in the code, from Emil. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 13:15:43 -08:00
Eric Anholt	a39eac80fd	vc4: Just put USE_VC4_SIMULATOR in DEFINES. In the pipe-loader reworks, it was missed in one of the new directories where it was used. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 13:15:40 -08:00
Nanley Chery	d1212abf50	mesa/teximage: Fix S3TC regression due to ASTC interaction A prior, literal reading of the ASTC spec led to the prohibition of some compressed formats being used against the targets: TEXTURE_CUBE_MAP_ARRAY and TEXTURE_3D. Since the spec does not specify interactions with other extensions for specific compressed textures, remove such interactions. Fixes the following Piglit tests on Gen9: piglit.spec.arb_direct_state_access.getcompressedtextureimage piglit.spec.arb_get_texture_sub_image.arb_get_texture_sub_image-getcompressed piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array s3tc_dxt1 piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array s3tc v2. Don't interact with other specific compressed formats (Ian). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91927 Suggested-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-22 12:29:09 -08:00
Nanley Chery	21d43fe51a	mesa/extensions: Enable overriding permanently enabled extensions Provide the ability to prevent any permanently enabled extension from appearing in the string returned by glGetString[i](). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-11-22 12:19:45 -08:00
Igor Gnatenko	05eed0eca7	virgl: pipe_virgl_create_screen is not static Cc: mesa-stable@lists.freedesktop.org Fixes: `17d3a5f857` "target-helpers: add a non-inline drm_helper.h" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93063 Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 11:17:17 +00:00
Kenneth Graunke	86fc97da06	i965: Fix num_uniforms count for scalar GS. I noticed that brw_vs.c does this. I believe the point is that nir->num_uniforms is either counted in scalar components (in scalar mode), or vec4 slots (in vector mode). But we want param_count to be in scalar components regardless, so we have to scale up in vector mode. We don't have to scale up in scalar mode, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-22 00:03:21 -08:00
Eric Anholt	4cff16bc3a	vc4: Use nir_channel() to simplify all of our nir_swizzle() cases.	2015-11-21 18:55:31 -08:00
Eric Anholt	81544f231a	vc4: Fix point size lookup. I think I may have regressed this in the NIR conversion. TGSI-to-NIR is putting the PSIZ in the .x channel, not .w, so we were grabbing some garbage for point size, which ended up meaning just not drawing points. Fixes glean pointAtten and pointsprite.	2015-11-21 18:55:31 -08:00
Jose Fonseca	4befd82a64	pipe-loader: Fix PATH_MAX define on MSVC.	2015-11-21 23:03:20 +00:00
Jose Fonseca	02afbd2476	scons: Conditionally use DRM module on pipe-loader. Fixes non Linux builds. Trivial.	2015-11-21 21:20:12 +00:00
Jason Ekstrand	e14b2c76b4	anv/meta_clear: Don't trash state if no clears are needed	2015-11-21 11:39:12 -08:00
Ilia Mirkin	22aeb0c568	freedreno/a4xx: disable blending and alphatest for integer rt0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	4c170d9e1d	freedreno/a4xx: fix independent blend This fixes the ext_draw_buffers2 and arb_draw_buffers_blend tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	801b55c2ee	freedreno/a4xx: enable ARB_base_instance support We already pass in start_instance in fd4_draw. Expose the extension. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	f54c89f13e	freedreno/a4xx: set fetchsize in mem2gmem texture restore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	7426d9581a	freedreno/a4xx: add 11_11_10_float vertex type support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	740eb63aa7	freedreno/a4xx: fix 3d texture setup Same fix as on a3xx - set the second (tiny) layer size bitfield to the smallest level's size so that the hw knows not to minify beyond that. This fixes texelFetch sampler3D piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	ecb0dcd34c	freedreno/a4xx: only align slices in non-layer_first textures When layer is the container, slices are tightly packed inside of each layer. We don't need any additional alignment. On a3xx, each slice contains all the layers, so having alignment makes sense. This fixes a whole slew of array-related piglits, including texelFetch and tex-miplevel-selection varieties. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Emil Velikov	428146522b	docs: add 11.2.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 14:10:08 +00:00
Jason Ekstrand	83c305f8ef	anv/meta_clear: Don't try to clear depth-stencil without LOAD_OP_CLEAR	2015-11-21 00:05:18 -08:00
Jason Ekstrand	438eaa3ae7	anv/meta: Add initial support for multi-slice array and 3-D copies We still need to fix up a few bits once we have real CPP values, but this should get us a long ways.	2015-11-20 18:25:06 -08:00
Jason Ekstrand	d6a7c659c7	anv/meta: Use array textures for 2D This is a total of 1 extra instruction in the shader and gives us a lot more flexibility in how we do blits.	2015-11-20 16:00:34 -08:00
Jason Ekstrand	e3ec964e44	anv/meta: Keep z coordinate flat while blitting	2015-11-20 15:48:03 -08:00
Jason Ekstrand	1157b0360d	nir/spirv: Rework decoration iteration The old code didn't work correctly if you had member decorations after non-member decorations. Since glslang never gave us any of those, it wasn't properly tested.	2015-11-20 15:15:40 -08:00
Jason Ekstrand	cff74d6fb8	nir/spirv: Handle OpNop	2015-11-20 15:02:45 -08:00
Jason Ekstrand	1d42f773d3	gen8_state: Clamp sampler values to HW limitations	2015-11-20 14:45:44 -08:00
Jason Ekstrand	48228c114e	nir/spirv: Add support for runtime arrays	2015-11-20 12:49:20 -08:00
Jason Ekstrand	55d16c090e	gen8/pipeline: Properly handle MIN/MAX blend ops	2015-11-20 11:53:10 -08:00
Jason Ekstrand	b43ce6768d	gen8/pipeline: Set IndependentAlphaBlendEnable properly	2015-11-20 11:52:54 -08:00
Jason Ekstrand	e69db9159b	gen8/pipeline: Minor blending fixes This makes various fields match upstream mesa	2015-11-20 11:52:30 -08:00
Jason Ekstrand	fa8db0dfcc	anv: Put all of the descriptor set stuff together in one file The stuff to take descriptor sets and turn them into binding tables and sampler tables is still in anv_cmd_buffer.c. We may want to consider putting it in anv_descriptor_set.c eventually.	2015-11-18 14:58:43 -08:00
Jason Ekstrand	828b1a6eb6	anv/device: Update the right sampler in UpdateDescriptorSets	2015-11-18 14:48:28 -08:00
Jason Ekstrand	6f613abc2b	anv/cmd_buffer: Add a new genX_cmd_buffer file for shared code This file contains code that can be shared across gens modulo recompiling. In particular, we can share STATE_BASE_ADDRESS setup and handling of the vkPipelineBarrier call. Not sharing STATE_BASE_ADDRESS setup has already been a source of bugs and the gen7 and gen8 implementations of PipelineBarrier were line-for-line identical. Incidentally, this should fix MOCS settings for dynamic and surface state on Haswell.	2015-11-18 12:26:57 -08:00
Jason Ekstrand	fb8b2f5f9e	anv/gen7: A bunch of depth-stencil fixes There are various bits which move around between Haswell and Ivy Bridge that we weren't taking into account. This also makes us actually set the StencilWriteEnable in a sane way.	2015-11-18 11:43:52 -08:00
Jason Ekstrand	e9d634f4ad	gen7/pipeline: Re-arrange stencil parameters to match gen8	2015-11-17 19:10:31 -08:00
Jason Ekstrand	9e39bdabad	anv/gen7: Implement CmdPipelineBarrier	2015-11-17 17:09:27 -08:00
Jason Ekstrand	b707e90b6e	anv/gen7: Don't use the upper bound on dynamic state base address It doesn't do much for us and, if we have to resize the dynamic state block pool for any reason, it becomes out-of-date.	2015-11-17 17:08:44 -08:00
Jason Ekstrand	f0390bcad6	anv: Add initial Haswell support	2015-11-17 12:14:24 -08:00
Jason Ekstrand	45320f677b	anv: Add macros for doing per-gen compilation	2015-11-17 08:27:51 -08:00
Jason Ekstrand	92d164b1c3	anv/entrypoints: Add dispatch support for haswell	2015-11-17 08:27:51 -08:00
Jason Ekstrand	aa3002bd42	anv/entrypoints: Use devinfo instead of a gen number	2015-11-17 08:27:51 -08:00
Jason Ekstrand	0508046dc8	anv/cmd_buffer: Pack the 3DSTATE_VF packet on-demand	2015-11-17 08:27:51 -08:00
Jason Ekstrand	34d55d69cf	anv/formats: Don't advertise stencil texture/blit prior to Broadwell	2015-11-17 08:23:29 -08:00
Jason Ekstrand	de54b4b18f	anv: Only include the pack headers where needed Previously, we were including gen7_pack.h, gen75_pack.h, and gen8_pack.h in anv_private.h. As we add more gens, this is going to become untenable. This commit moves things around so that we only use the pack headers when and if we need them.	2015-11-16 12:29:09 -08:00
Jason Ekstrand	cb9e2305f8	anv/cmd_buffer: Move gen-specific stuff into the appropriate files	2015-11-16 12:10:11 -08:00
Jason Ekstrand	22d024e031	nir/spirv: Add support for separate samplers and textures This gets tricky in a few places because we have to pass vtn_sampled_image values through OpAccessChain, but it works ok. At some point, it probably needs to be cleaned up but it doesn't occur to me exactly how to do that at the moment. We'll see how this approach goes.	2015-11-14 22:32:54 -08:00
Jason Ekstrand	002db3ee15	anv/cmd_buffer: Add a default descriptor type case This silences a bunch of compiler warnings.	2015-11-14 09:16:55 -08:00
Jason Ekstrand	e9dba80430	anv/apply_pipeline_layout: Handle separate samplers and textures	2015-11-14 09:00:35 -08:00
Jason Ekstrand	b5d4027c35	Merge branch 'wip/i965-separate-sampler-tex' into vulkan	2015-11-14 08:23:27 -08:00
Jason Ekstrand	c7d504ad93	i965/vec4: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:05:31 -08:00
Jason Ekstrand	3dd84822df	i965/vec4: Separate the sampler from the surface in generate_tex	2015-11-14 08:05:31 -08:00
Jason Ekstrand	c09e140b65	i965/fs: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:04:47 -08:00
Jason Ekstrand	c2a373ec85	i965/fs: Separate the sampler from the surface in generate_tex	2015-11-14 08:01:50 -08:00
Jason Ekstrand	b169bb902a	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the sampler and leaves the texture alone as it did before and nir_lower_samplers assumes this. However, backends can, if they wish, assume that they are separate because nir_lower_samplers sets both texture and sampler index (they are the same in this case).	2015-11-14 07:57:31 -08:00
Jason Ekstrand	1469ccb746	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in Matt's big compiler refactor.	2015-11-14 07:56:10 -08:00
Jason Ekstrand	e8f51fe4de	anv/gen8: Subtract 1 from num_elements when setting up buffer surface state	2015-11-13 22:50:54 -08:00
Jason Ekstrand	91bc4e7cec	anv/pipeline: Don't free blend states that don't exist Compute pipelines don't need a blend state so we shouldn't be unconditionally freeing it.	2015-11-13 21:49:41 -08:00
Jason Ekstrand	c1733886a6	nir/spirv: Add support for SSBO stores This only handles vector stores, not component-of-a-vector stores.	2015-11-13 21:41:52 -08:00
Jason Ekstrand	c68e28d766	nir/spirv: Refactor vtn_block_load We pull the offset calculations out into their own function so we can re-use it for stores.	2015-11-13 21:32:00 -08:00
Jason Ekstrand	99494b96f0	nir/spirv: Add support for image_load_store	2015-11-13 17:54:43 -08:00
Jason Ekstrand	164b3ca164	nir/builder: Add a nir_ssa_undef helper	2015-11-13 17:54:43 -08:00
Jason Ekstrand	ffbc31d13b	nir/spirv: Add support for creating image variables	2015-11-13 17:54:43 -08:00
Jason Ekstrand	453239f6a5	nir/spirv: Add support for image types	2015-11-13 17:54:43 -08:00
Jason Ekstrand	0572444a0e	nir/types: Add image type helpers	2015-11-13 17:54:43 -08:00
Jason Ekstrand	d5ba7a26d9	glsl/types: Add a get_image_instance helper	2015-11-13 17:54:43 -08:00
Chad Versace	738eaa8acf	isl: Embed brw_device_info in isl_device Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 11:14:03 -08:00
Chad Versace	ba467467f4	anv: Use enum isl_tiling everywhere In anv_surface and anv_image_create_info, replace member 'uint8_t tile_mode' with 'enum isl_tiling'. As a nice side-effect, this patch also reduces bug potential because the hardware enum values for tile modes are unstable across hardware generations.	2015-11-13 10:44:09 -08:00
Chad Versace	af392916ff	anv/device: Embed isl_device Embed struct isl_device into anv_physical_device and anv_device. It will later be used for surface layout calculations.	2015-11-13 10:44:09 -08:00
Chad Versace	a4a2ea3f79	isl: Add enum isl_tiling and a query func The query func is isl_tiling_get_extent.	2015-11-13 10:44:07 -08:00
Chad Versace	652727b029	isl: Add structs isl_extent2d, isl_extent3d They are nowhere used yet.	2015-11-13 10:31:49 -08:00
Chad Versace	b1bb270590	isl: Add struct isl_device The struct is incomplete (it contains only the gen). And it's nowhere used yet. It will be used later for surface layout calculations.	2015-11-13 10:31:37 -08:00
Chad Versace	477383e9ac	anv: Strip trailing whitespace from anv_device.c	2015-11-13 10:27:40 -08:00
Chad Versace	c6493dff79	anv: Strip trailing space in anv_private.h	2015-11-12 12:24:01 -08:00
Chad Versace	addc2a9d02	anv: Remove redundant fields anv_format::bs,bw,bh,bd Instead, use the equivalent fields in anv_format::isl_layout.	2015-11-12 12:23:49 -08:00
Chad Versace	cbc31f453d	anv/formats: Re-indent the fmt() macro Use one line per struct member.	2015-11-12 12:21:46 -08:00
Chad Versace	1bea1669c5	anv: Use enum isl_format in anv_format This patch begins using isl.h in Anvil. More refactors will follow. Change type of anv_format::surface_format from uint16_t -> enum isl_format.	2015-11-12 12:21:46 -08:00
Chad Versace	bfb022a235	isl: Generate isl_format_layout.c Generate an array of struct isl_format_layout, using isl_format_layout.csv as input. Each entry follows the pattern: [ISL_FORMAT_R32G32B32A32_FLOAT] = { ISL_FORMAT_R32G32B32A32_FLOAT, .bs = 16, .bpb = 128, .bw = 1, .bh = 1, .bd = 1, .channels = { .r = { ISL_SFLOAT, 32 }, .g = { ISL_SFLOAT, 32 }, .b = { ISL_SFLOAT, 32 }, .a = { ISL_SFLOAT, 32 }, .l = {}, .i = {}, .p = {}, }, .colorspace = ISL_COLORSPACE_LINEAR, .txc = ISL_TXC_NONE, },	2015-11-12 12:21:46 -08:00
Chad Versace	7986efc644	isl: Add CSV of format layouts Add file isl_format_layout.csv, which describes the block layout, channel layout, and colorspace of all hardware surface formats.	2015-11-12 11:56:16 -08:00
Chad Versace	67362698a9	isl: Add enum isl_format	2015-11-12 11:34:45 -08:00
Jason Ekstrand	3a3d79b38e	anv/gen7: Implement the VS state depth-stall workaround	2015-11-10 16:42:34 -08:00
Jason Ekstrand	750b8f9e98	anv/gen7: Properly handle a GS with zero invocations	2015-11-10 16:41:23 -08:00
Jason Ekstrand	9d18555c8d	anv/gen7: Add push constant support	2015-11-10 15:14:11 -08:00
Jason Ekstrand	427978d933	anv/device: Use an actual int64_t in WaitForFences	2015-11-10 15:02:52 -08:00
Jason Ekstrand	d9079648d0	anv/meta: Create a sampler in meta_emit_blit	2015-11-10 14:43:18 -08:00
Jason Ekstrand	b461744c52	anv/gen7: Properly handle VS with VertexID but no vertices	2015-11-10 11:31:31 -08:00
Jason Ekstrand	aafc87402d	anv/device: Work around the i915 kernel driver timeout bug There is a bug in some versions of the i915 kernel driver where it will return immediately if the timeout is negative (it's supposed to wait indefinitely). We've worked around this in mesa for a few months but never implemented the work-around in the Vulkan driver. I rediscovered this bug again while working on Ivy Bridge because the drive in my Ivy Bridge currently has Fedora 21 installed which has one of the offending kernels.	2015-11-10 11:24:11 -08:00
Jason Ekstrand	06f466a770	anv/nir: Fix codegen in lower_push_constants	2015-11-09 16:29:05 -08:00
Jason Ekstrand	abede04314	anv/gen7: Fix the length of 3DSTATE_SF	2015-11-09 16:04:07 -08:00
Jason Ekstrand	e8c2a52a70	anv/gen7: Properly handle missing color-blend state	2015-11-09 16:04:06 -08:00
Jason Ekstrand	862da6a891	anv/device: Add a newline to the end of a comment	2015-11-09 16:04:06 -08:00
Nanley Chery	9c2b37a9c3	anv/formats: Define ETC2 formats Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	41cf35d1d8	anv/image: Determine the alignment units for compressed formats Alignment units, i and j, match the compressed format block width and height respectively. v2: Don't assert against HALIGN* and VALIGN* enums (Chad) Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	381f602c6b	anv/image: Handle compressed format qpitch and padding Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	300f7c2be3	anv/image: Handle compressed format stride and size These formulas did not take compressed formats into account. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	7b4244dea0	anv/formats: Add fields for block dimensions A non-compressed texture is a 1x1x1 block. Compressed textures could have values which vary in different dimensions WxHxD. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	a6c7d1e016	anv/formats: Add surface_format initializer v2: Rename __brw_fmt to __hw_fmt (Chad) Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Nanley Chery	3ee923f1c2	anv: Rename cpp variable to "bs" cpp (chars-per-pixel) is an integer that fails to give useful data about most compressed formats. Instead, rename it to "bs" which stands for block size (in bytes). v2: Rename vk_format_for_bs to vk_format_for_size (Chad) Use "block size" instead of "bs" in error message (Chad) Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-09 15:41:41 -08:00
Jason Ekstrand	17fa3d3572	nir/spirv: Give both block and buffer_block types an interface type	2015-11-07 08:03:25 -08:00
Jason Ekstrand	a10d59c09a	nir/spirv: Increment num_ubos/ssbos when creating variables	2015-11-06 16:53:27 -08:00
Jason Ekstrand	046563167c	anv/apply_dynamic_offsets: Use the right sized immediate zero	2015-11-06 16:49:24 -08:00
Jason Ekstrand	104525c33b	anv/pipeline: Set the right SSBO binding table start index for FS	2015-11-06 15:57:51 -08:00
Jason Ekstrand	399d5314f6	anv/cmd_buffer: Rework the way we emit UBO surface state The new mechanism should be able to handle SSBOs as well as properly handle emitting surface state on gen7 where we need different strides depending on shader stage.	2015-11-06 15:14:12 -08:00
Jason Ekstrand	1b5c7e7ecd	anv/pipeline: Expose is_scalar_shader_stage	2015-11-06 15:12:33 -08:00
Jason Ekstrand	5ba281e794	nir/spirv: Add a helper for determining if a block is externally visible	2015-11-06 15:09:57 -08:00
Jason Ekstrand	220261a0c9	anv: Use VkDescriptorType instead of anv_descriptor_type	2015-11-06 14:09:52 -08:00
Jason Ekstrand	612e35b2c6	anv: Do range-checking in the shader for dynamic buffers	2015-11-06 13:32:52 -08:00
Jason Ekstrand	f8052351ac	anv/device: Increase the block size for instructions	2015-11-06 13:29:47 -08:00
Jason Ekstrand	d7cc9929bb	anv: Remove all support for BufferViews We never actually supported them, we just used them for binding UBOs. Now that we have BufferInfo and we aren't supporting texture buffers yet, we should get rid of them until we can do them properly.	2015-11-06 13:16:18 -08:00
Jason Ekstrand	0360c3608b	anv/device: Only support binding UBOs through BufferInfo	2015-11-06 12:52:12 -08:00
Jason Ekstrand	3aa2fc82dd	anv: Rework UpdateDescriptorSets Previously, UpdateDescriptorSets was wrong because it assumed that the binding was the offset into the descriptor set.	2015-11-06 12:28:03 -08:00
Jason Ekstrand	45b1bbe801	anv: Add a descriptor_index to anv_descriptor_set_binding_layout	2015-11-06 12:16:54 -08:00
Jason Ekstrand	f029e0ce13	anv: Add a layout to anv_descriptor_set	2015-11-06 12:16:54 -08:00
Chad Versace	16119ad884	anv/meta: Finish load clears for stencil attachments Tested by Crucible "func.depthstencil.stencil_triangles.*" in commit c194292d5eadb84e9d7489fc01ce0b653cdd4ca5 (HEAD -> master) Author: Chad Versace <chad.versace@intel.com> Date: Wed Nov 4 16:19:24 2015 -0800 Subject: func.depthstencil: Remove stencil clear workaround for Mesa	2015-11-05 15:45:43 -08:00
Jason Ekstrand	a40f682c71	anv/cmd_buffer: Fix SURFACE_STATE for non-view buffer bindings We were treating it as if it's a BufferView and weren't taking the offset into account properly.	2015-11-04 19:56:18 -08:00
Jason Ekstrand	1b68120760	anv/cmd_buffer: Don't use an anv_state pointer in emit_binding_table The anv_state is supposed to be a flyweight so we're not really saving anything by using a pointer. Also, we were creating one, setting a pointer to it, and then having it go out-of-scope which is bad.	2015-11-04 19:56:16 -08:00
Chad Versace	d259af3fbb	anv: Remove unused anv_render_pass members Remove members num_color_clear_attachments has_depth_clear_attachment has_stencil_clear_attachment The new clear code in anv_meta_clear.c does not use them.	2015-11-04 15:54:38 -08:00
Chad Versace	a9a3071fc4	anv/meta: Rewrite clear code Fixes Crucible test "func.clear.load-clear.attachments-8". The old clear code, when clearing attachments for VK_ATTACHMENT_LOAD_OP_CLEAR, suffered from some fundamental bugs. The bugs were not fixable with the old code's approach. - It assumed that a VkRenderPass contained at most one depthstencil attachment. - It tried to clear all attachments (color and the sole depthstencil) with a single instanced draw call, using the VUE header's RenderTargetArrayIndex to specify the instance's target color attachment. But the RenderTargetArrayIndex does not select entries in the binding table; it only selects an array index of a single layered surface. - If at least one attachment of VkRenderPass had VK_ATTACHMENT_LOAD_OP_CLEAR, then the old code cleared all attachments. This was a consequence of using a single draw call and single pipeline for the clear. The new clear code fixes those bugs by making a separate draw call for each attachment, and using one pipeline when clearing color attachments and a different pipeline for depth attachments. The new code, like the old code, does not clear stencil attachments. It is left as a FINISHME.	2015-11-04 15:20:52 -08:00
Chad Versace	49c96a14c5	anv/meta: Clear color attribute is always flat No behavioral change. This patch just removes an unneeded function parameter.	2015-11-04 15:15:19 -08:00
Chad Versace	7f82cc718f	anv/meta: Use consistent naming for dynamic state mask Consistently rename bitmasks of Vulkan dynamic state to 'dynamic_mask'. anv_meta_saved_state::dynamic_flags -> dynamic_mask anv_meta_save(dynamic_state) -> dynamic_mask	2015-11-04 15:15:19 -08:00
Chad Versace	2bdb9e2ed9	anv/meta: Rename anv_cmd_buffer_save/restore As the functions are now exposed in anv_meta.h, let's rename them to clarify that they are meta functions. anv_cmd_buffer_save -> anv_meta_save anv_cmd_buffer_restore -> anv_meta_restore	2015-11-04 15:15:19 -08:00
Chad Versace	16b2a489db	anv: Move meta clear code to new file anv_meta_clear.c anv_meta.c currently handles blits, copies, clears, and resolves. The clear code is about to grow, and anv_meta.c is already busting at the seams.	2015-11-04 15:15:19 -08:00
Chad Versace	c56727037a	anv: Move struct anv_vue_header to anv_private.h Move it from anv_meta.c to the common header anv_private.h. This allows us to split the meta blit and meta clear code into separate files.	2015-11-04 15:15:19 -08:00
Jason Ekstrand	b00e3f221b	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-11-03 15:45:04 -08:00
Jason Ekstrand	a1e7b8701a	nir: remove sampler_set from nir_tex_instr Now that descriptor sets are handled in a lowering pass, this is no longer needed.	2015-11-03 14:58:20 -08:00
Chad Versace	4d1c76485b	anv: Drop stale comment in anv_cmd_buffer_emit_binding_table() When emitting the binding table for the fragment shader stage, we no longer "walk all of the attachments, [inserting only] the color attachments into the binding table". Instead, we iterate only over the subpass's color attachments, which is the minimal possible iteration. While killing the comment, also rename the variable 'attachments' to 'color_count', as it's no longer a count of all framebuffer attachments but only the subpass's color attachment count.	2015-11-03 13:46:40 -08:00
Jason Ekstrand	584f9d4442	anv: Report 0 physical devices when not on Broadwell or Ivy Bridge Right now, Broadwell and Ivy Bridge are the only supported platforms. Hopefully, this reduces the chances that someone will try the driver on unsupported hardware and be confused that it doesn't work.	2015-11-02 12:14:37 -08:00
Jason Ekstrand	3883728730	anv: Add better push constant support What we had before was kind of a hack where we made certain untrue assumptions about the incoming data. This new support, while it still doesn't support indirects properly (that will come), at least pulls the offsets and strides from SPIR-V like it's supposed to.	2015-10-29 22:26:36 -07:00
Jason Ekstrand	1f2624e6dd	nir/spirv: Add support for push constants	2015-10-29 22:26:00 -07:00
Jason Ekstrand	a2283508b0	nir/intrinsics: Add a load_push_constant intrinsic	2015-10-29 22:26:00 -07:00
Jason Ekstrand	f2a8c9db24	nir/spirv: Rework the way we handle interface types	2015-10-29 22:26:00 -07:00
Chad Versace	4073219cf1	anv/pass: Remove redundant assert Trivial fix.	2015-10-29 11:47:39 -07:00
Chad Versace	1e98177439	anv/pass: Move VkRenderPass code to new file Move it from anv_device.c to new file anv_pass.c. Because it will soon grow bigger.	2015-10-29 11:10:03 -07:00
Chad Versace	c284c39b13	anv: Fix parsing of load ops in VkAttachmentDescription My original understanding of VkAttachmentDescription::loadOp, stencilLoadOp was incorrect. Below are all possible combinations: VkFormat \| loadOp=clear stencilLoadOp=clear ---------------+--------------------------- color \| clear-color ignored depth-only \| clear-depth ignored stencil-only \| ignored clear-stencil depth-stencil \| clear-depth clear-stencil	2015-10-29 10:59:55 -07:00
Jason Ekstrand	8bcba083db	anv: Update the README Adds a note that we support SPIR-V revision 32. Also, we now support geometry shaders.	2015-10-28 12:30:34 -07:00
Jason Ekstrand	12feda0c09	Revert "nir/intrinsic: Allow up to four indices" This reverts commit `5eccd0b4b9`. This was only needed for the store_ssbo_vk_indirect intrinsic	2015-10-27 13:44:14 -07:00
Jason Ekstrand	423e7a55cc	Revert "nir/intrinsics: Add new Vulkan load/store intrinsics" This reverts commit `24bcc89c8f`. Now that we have the new vulkan_resource_index intrinsic, these variants of the classic UBO/SSBO intrinsics aren't needed.	2015-10-27 13:43:25 -07:00
Jason Ekstrand	a6be53223e	anv/nir: Work with the new vulkan_resource_index intrinsic	2015-10-27 13:42:51 -07:00
Jason Ekstrand	3d44b3aaa6	nir/spirv: Use the new vulkan_resource_index intrinsic This is instead of using the _vk versions of UBO/SSBO load/store intrinsics	2015-10-27 13:41:59 -07:00
Jason Ekstrand	800a9706f0	nir: Add a vulkan_resource_index intrinsic	2015-10-27 13:41:08 -07:00
Jason Ekstrand	37b6afb3d9	Add a todo comment about input_slots_valid in the FS shader key	2015-10-26 16:25:02 -07:00
Jason Ekstrand	ab6ed2e1ac	anv/gen8_pipeline: Emit a real 3DSTATE_SBE_SWIZ packet	2015-10-26 16:25:02 -07:00
Jason Ekstrand	9006e555ce	anv/pipeline: Bump the size of the pipeline batch to accommodate GS The 1k batch size wasn't big enough for a full pipeline setup including geometry shaders. Some day we should make it dynamic.	2015-10-23 16:50:31 -07:00
Jason Ekstrand	4c59ee808f	anv/gen8_pipeline: Various 3DSTATE_GS fixes	2015-10-23 16:49:26 -07:00
Jason Ekstrand	8aba8cf513	anv/pipeline: Use separate-shader	2015-10-23 10:53:00 -07:00
Jason Ekstrand	760c4b894d	anv/pipeline: Pull separate_shader from NIR for vue map setup	2015-10-23 10:48:52 -07:00
Jason Ekstrand	ee8c67abe8	nir/spirv: Add support for builtins in arrays	2015-10-22 17:58:20 -07:00
Jason Ekstrand	9fe907ec79	nir/spirv: Make the builtins array distinguish between in and out	2015-10-22 17:54:24 -07:00
Jason Ekstrand	d11ea76168	nir/spirv: Make vtn_get_builtin_location smarter Instead of just stomping on the mode, it now asserts that the previously set mode is correct and only changes it if needed. We need to do this because, in geometry shaders, there are some builtins that can be either an input or an output depending on context. We can get that information from the SPIR-V source but we can't throw it away.	2015-10-22 17:45:41 -07:00
Jason Ekstrand	9abef3e817	nir/spirv: Make get_builtin_variable take a nir_variable_mode We'll want this in a moment for validation but, for now, it just gets stomped by get_builtin_variable.	2015-10-22 17:28:25 -07:00
Jason Ekstrand	2ce6636c75	nir/spirv: Remove the vtn_type argument from _vtn_variable_load/store Now that builtins are handled in deref chains, we don't really need this anymore.	2015-10-22 16:56:42 -07:00
Jason Ekstrand	f23d951083	nir/validate: Add better validation of load/store types	2015-10-22 16:53:01 -07:00
Jason Ekstrand	82c579e314	anv/gen8: Set the correct maximum number of GS threads This equation was pulled from mesa gen8_gs_state.c	2015-10-21 21:51:18 -07:00
Jason Ekstrand	d0e8c78407	anv/pipeline: set the gs_vertex_count in compile_gs This was missed in the initial enabling commit.	2015-10-21 21:50:47 -07:00
Jason Ekstrand	8af2a09956	anv/pipeline: Make the has_push_constants computation more accurate The computation used to only look for uniforms that weren't samplers. Now it also filters out arrays of samplers.	2015-10-21 21:50:16 -07:00
Jason Ekstrand	0329a252bd	nir/spirv: Add defaults for GS input/output primitive types These are supposed to be specified in the SPIR-V source as SpvExecutionMode enums but glslang isn't giving them to us. A bug has been filed: https://github.com/KhronosGroup/glslang/issues/84	2015-10-21 21:46:22 -07:00
Jason Ekstrand	4032549885	i965/vec4: Handle returns at the end of functions	2015-10-21 20:42:23 -07:00
Jason Ekstrand	5f29dacda2	i965: Move get_hw_prim_for_gl_prim to brw_util.c	2015-10-21 20:40:28 -07:00
Jason Ekstrand	ea23cb3543	nir/spirv: Add capabilities and decorations for basic geometry shaders	2015-10-21 20:36:25 -07:00
Jason Ekstrand	d538fe849d	anv/pipeline: Add back basic geometry shader support Now that we've done the refactoring upstream, it's much easier to get hooked up. We haven't tested things well enough to know that we're setting up the GPU state correctly for them yet but at least we can compile them now.	2015-10-21 18:45:48 -07:00
Jason Ekstrand	164abff0c0	nir/spirv: Add support for more CS system values	2015-10-21 18:39:06 -07:00
Jason Ekstrand	5790ee2bbb	nir/spirv: Add support for various barrier type instructions	2015-10-21 18:17:11 -07:00
Jason Ekstrand	3d35e4361f	Fix a couple of dereferences	2015-10-21 18:16:50 -07:00
Jason Ekstrand	55a7ee730c	spirv/nir: Add more stage asserts	2015-10-21 18:00:05 -07:00
Jason Ekstrand	27393c8630	nir/spirv: Add support for GS metadata	2015-10-21 17:58:34 -07:00
Jason Ekstrand	a8ffd6e72c	nir/gather_info: Add more info for geometry shaders	2015-10-21 17:42:47 -07:00
Jason Ekstrand	fed60e3c73	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-21 17:40:13 -07:00
Chad Versace	0ab926dfbf	anv: Don't teardown uninitialized anv_physical_device If the user called vkDestroyDevice but never called vkEnumeratePhysicalDevices, then the driver tried to ralloc_free() an uninitialized anv_physical_device. Fixes test 'dEQP-VK.api.device_init.create_instance_name_version'.	2015-10-21 11:55:37 -07:00
Jason Ekstrand	c8572d0f9c	anv/pipeline: Remove a redundant line We set compute_sample_id based on multisample state two lines below.	2015-10-20 16:02:03 -07:00
Jason Ekstrand	72d99f8a40	anv/pipeline: Update a comment	2015-10-20 16:00:55 -07:00
Jason Ekstrand	27d868500a	anv/pipeline: Set key->render_to_fbo to false for fragment shaders Vulkan uses the upper-left convention. This is the same as DX one and what our hardware does. We had it flipped around.	2015-10-20 15:37:16 -07:00
Jason Ekstrand	59bae36ffb	nir/spirv: Fix a typo	2015-10-20 15:35:13 -07:00
Jason Ekstrand	44b22ca441	nir/spirv: Handle SpvExecutionMode	2015-10-20 15:23:56 -07:00
Jason Ekstrand	a71e614d33	anv: Completely rework shader compilation Now that we have a decent interface in upstream mesa, we can get rid of all our hacks. As of this commit, we no longer use any fake GL state objects and all of shader compilation is moved into anv_pipeline.c. This should make way for actually implementing a shader cache one of these days. As a nice side-benefit, this commit also gains us an extra 300 passing CTS tests because we're actually filling out the texture swizzle information for vertex shaders.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	2d9e899e35	nir: Add a pass to gather info from the shader This pass fills out a bunch of the fields in nir_shader_info by inspecting the shader.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	6fb4469588	anv: Move the brw_compiler from anv_compiler to physical_device	2015-10-20 13:02:03 -07:00
Jason Ekstrand	9e3615cc7d	i965: Move brw_compiler_create to brw_compiler.h	2015-10-20 13:02:03 -07:00
Jason Ekstrand	bf6407079b	i965: Split process_nir into two halves; pre- and post-	2015-10-20 13:02:03 -07:00
Jason Ekstrand	611ace6861	anv/compiler: Remove more pre-SNB shader key setup	2015-10-20 13:02:03 -07:00
Jason Ekstrand	b3a344db30	anv/compiler: Get rid of GS support. The geometry shader support is currently completely untested. As I go through and re-factor the compiler, I'd rather not refactor dead code that I don't have a way to know if I broke. Let's just remove it for now. We can put it back in easily enough later and then we'll do it properly.	2015-10-20 13:02:03 -07:00
Jason Ekstrand	5f5224f256	anv/meta: Use the actual render pass for creating blit pipelines	2015-10-20 13:02:02 -07:00
Chad Versace	4d4e559b6a	vk: Use consistent names for anv_cmd_state dirty bits Prefix all anv_cmd_state dirty bit tokens with ANV_CMD_DIRTY. For example: old -> new ANV_DYNAMIC_VIEWPORT_DIRTY -> ANV_CMD_DIRTY_DYNAMIC_VIEWPORT ANV_CMD_BUFFER_PIPELINE_DIRTY -> ANV_CMD_DIRTY_PIPELINE Change type of anv_cmd_state::dirty and ::compute_dirty from uint32_t to the self-documenting type anv_cmd_dirty_mask_t.	2015-10-20 11:40:24 -07:00
Chad Versace	2484d1a01f	anv/pipeline: Fix requirement for depthstencil state The Vulkan spec allows VkGraphicsPipelineCreateInfo::pDepthStencilState to be NULL when the pipeline's subpass contains no depthstencil attachment (see spec quote below). anv_pipeline_init_dynamic_state() required it unconditionally. This patch fixes anv_pipeline_init_dynamic_state() to access pDepthStencilState only when there is a depthstencil attachment. From the Vulkan spec (20 Oct 2015, git-aa308cb) pDepthStencilState [...] may only be NULL if renderPass and subpass specify a subpass that has no depth/stencil attachment.	2015-10-20 11:29:16 -07:00
Chad Versace	b51468b519	anv/pipeline: Validate VkGraphicsPipelineCreateInfo The Vulkan spec (20 Oct 2015, git-aa308cb) states that some fields of VkGraphicsPipelineCreateInfo are required under certain conditions. Add a new function, anv_pipeline_validate_create_info() that asserts the requirements hold. The assertions helped me discover bugs in Crucible and anv_meta.c.	2015-10-20 10:55:54 -07:00
Chad Versace	855180b3d9	anv: Define anv_validate macro If a block of code is annotated with anv_validate, then the block runs only in debug builds.	2015-10-20 10:55:54 -07:00
Chad Versace	81f8b82fc8	vk/meta: Add required renderpass to pipeline The Vulkan spec (20 Oct 2015, git-aa308cb) requires that VkGraphicsPipelineCreateInfo::renderPass be a valid handle. To satisfy that, define a static dummy render pass used for all meta operations.	2015-10-20 10:48:26 -07:00
Chad Versace	0d84a0d58b	vk/meta: Add required multisample state to pipeline The Vulkan spec (20 Oct 2015, git-aa308cb) requires that VkGraphicsPipelineCreateInfo::pMultisampleState not be NULL.	2015-10-20 10:48:09 -07:00
Jason Ekstrand	60e8439237	anv/compiler: Remove irrelevant wm key setup Most of this applies to Iron Lake and prior only. While we're at it, we get rid of the legacy GL shading model code.	2015-10-19 17:00:26 -07:00
Jason Ekstrand	27ca9ca4e1	anv/compiler: Get rid of legacy shader key setup Most of the shader key setup we did was for pre-Sandybridge and the stuff for SNB+ wasn't in the key setup. That stuff still isn't there but at least we've left ourselves notes for now.	2015-10-19 16:45:11 -07:00
Jason Ekstrand	661d0db077	anv/compiler: Delete legacy clipping code This is a Vulkan driver. We don't need legacy clipping stuff and, even if we did, we don't plan on supporting pre-Sandybridge anyway.	2015-10-19 16:26:16 -07:00
Jason Ekstrand	fba55b711e	anv/compiler: Remove unneeded wm prog data setup As of upstream mesa changes, brw_compile_fs does this for us so there's no need to have the code in the Vulkan driver anymore.	2015-10-19 16:17:41 -07:00
Jason Ekstrand	12c30c9498	nir/spirv: Use the new nir_variable helpers	2015-10-19 16:08:23 -07:00
Jason Ekstrand	7e6959402d	nir/spirv: Handle builtins in OpAccessChain Previously, we were trying to handle them later when loading. However, at that point, you've already lost information and it's harder to handle certain corner-cases. In particular, if you have a shader that does gl_PerVertex.gl_Position.x = foo we have trouble because we see the .x and we don't know that we're in gl_Position. If we, instead, handle it in OpAccessChain, we have all the information we need and we can silently re-direct it to the appropriate variable. This also lets us delete some code which is a nice side-effect.	2015-10-19 15:50:45 -07:00
Jason Ekstrand	958fc04dc5	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-19 14:14:21 -07:00
Jason Ekstrand	995d9c4ac7	anv/pipeline: Remove the ViewportState finishme We should be doing everything we need to with the viewport state	2015-10-17 10:35:29 -07:00
Jason Ekstrand	3e47e34036	anv: Add support for immutable descriptors	2015-10-17 08:17:00 -07:00
Jason Ekstrand	7010fe61c8	anv: Add facilities for dumping an image to a file The ability to dump an arbitrary miplevel or array slice of an anv_image to a file is very useful for debugging. Nothing inside of the driver calls this right now, but it's very useful to call from GDB.	2015-10-16 20:03:06 -07:00
Jason Ekstrand	368e703a01	anv/pipeline: Rework dynamic state handling Apparently, we had the dynamic state array in the pipeline backwards. Instead of enabling the bits in the pipeline, it disables them and marks them as "dynamic".	2015-10-16 16:30:02 -07:00
Jason Ekstrand	8ed23654c9	nir/spirv: Fix handling of vector component selects via OpAccessChain When we get to the end of the _vtn_load/store_varaible recursion, we may have one link left in the deref chain if there is a vector component select on the end. In this case, we need to truncate the deref chain early so that, when we make the copy for the load, we don't get the extra deref. The final deref will be handled by the vector extract/insert that comes later.	2015-10-15 21:18:57 -07:00
Jason Ekstrand	2552df41a1	anv/cmd_buffer: Reset the command buffer in BeginCommandBuffer	2015-10-15 18:28:00 -07:00
Jason Ekstrand	298d031642	anv/batch_chain: Add some sanity-check asserts for relocations	2015-10-15 17:24:32 -07:00
Jason Ekstrand	3130851add	anv/x11: Only advertise VK_FORMAT_B8R8G8A8_UNORM The others don't work at the moment so we shouldn't be advertising them.	2015-10-15 16:16:17 -07:00
Jason Ekstrand	f5eec407ea	anv/x11: Treat the pPlatformWindow as a xcb_window_t* instead of xcb_window_t	2015-10-15 15:38:20 -07:00
Jason Ekstrand	03952b1513	anv/device: Add support for combined image and sampler descriptors	2015-10-15 15:17:27 -07:00
Jason Ekstrand	b459b3d82c	anv/device: Remove some unneeded anv_finishmes	2015-10-15 15:17:07 -07:00
Jason Ekstrand	ba20569626	anv/device: Make the CreateSemaphore stub return success	2015-10-15 14:34:07 -07:00
Jason Ekstrand	bed7d1e03c	anv: Add support for BufferInfo in descriptor sets	2015-10-15 13:45:53 -07:00
Jason Ekstrand	6dc4cad994	anv/cmd_buffer: Add an alloc_surface_state helper	2015-10-15 13:45:07 -07:00
Jason Ekstrand	896c1c65d6	anv: Get rid of the descriptor_set_binding struct We no longer need it as we have a better way to deal with dynamic offsets.	2015-10-14 19:02:29 -07:00
Jason Ekstrand	42683e3757	anv: Get rid of backend compiler hacks for descriptor sets Now that we have anv_nir_apply_pipeline_layout, we can hand the backend compiler intrinsics and texture instructions that use a flat buffer index just like it wants. There's no longer any reason for any of these hacks.	2015-10-14 18:38:33 -07:00
Jason Ekstrand	da994f4b7e	anv/nir: Rewrite apply_dynamic_offsets to handle the new vk intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	9c9b7d79c8	anv/nir: Add a pass for applying a pipeline layout to a shader This new pass lowers the _vk intrinsics which take a (set, binding, index) triple to the single-index non-vk intrinsics based on the pipeline layout.	2015-10-14 18:38:33 -07:00
Jason Ekstrand	de608153fb	nir/spirv: Use the Vulkan ubo intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	24bcc89c8f	nir/intrinsics: Add new Vulkan load/store intrinsics	2015-10-14 18:38:33 -07:00
Jason Ekstrand	5eccd0b4b9	nir/intrinsic: Allow up to four indices	2015-10-14 18:38:33 -07:00
Jason Ekstrand	b37c38c1ca	anv: Completely rework descriptor set layouts This patch reworks a bunch of stuff in the way we do descriptor set layouts. Our previous approach had a couple of problems. First, it was based on a misunderstanding of arrays in descriptor sets. Second, it didn't properly handle descriptor sets where some bindings were missing stages. The new approach should be correct and also makes some operations, particularly those on the hot-path, a bit easier. We use the descriptor set layout for four things: 1) To determine the map from bindings to the actual flattened descriptor set in vkUpdateDescriptorSets(). 2) To determine the descriptor <-> binding table entry mapping to use in anv_cmd_buffer_flush_descriptor_sets(). 3) To determine the mappings of dynamic indices. 4) To determine the (set, binding, array index) -> binding table entry mapping inside of shaders. The new approach is directly tailored towards these operations.	2015-10-14 18:38:33 -07:00
Chad Versace	7965fe7da6	vk: Add README Requested by developers outside Intel. During the driver's pre-release development, let's make the README easy to find for external experimenters. Keep it at the top of the source tree.	2015-10-14 13:58:29 -07:00
Jason Ekstrand	d2d8945eb8	nir/spirv: Fix a bug in indirect OpAccessChain handling	2015-10-13 20:00:18 -07:00
Jason Ekstrand	db5a5fcd18	anv/image: Add a basic implementation of GetImageSubresourceLayout	2015-10-13 20:00:17 -07:00
Jason Ekstrand	28ed02588a	anv/formats: Use the surface_format_info struct from brw_surface_formats.h The surface_format_info struct changed in mesa but the copied-and-pasted version didn't get updated on the last mesa master merge. This both fixes the bug and should prevent this in the future.	2015-10-13 15:23:24 -07:00
Jason Ekstrand	accbf178eb	i965/surface_formats: Pull the surface_format_info struct into a header	2015-10-13 15:23:24 -07:00
Jason Ekstrand	fd2ec1c8ad	anv/x11: Do something sensible if get_geometry fails in GetSurfaceProperties	2015-10-13 15:10:40 -07:00
Jason Ekstrand	c31f926726	anv/wsi: Add the GetSurfacePresentModesKHR stub Support has existed in the X11 and Wayland backends for a while but, somehow, the entrypoint got missed in the API shuffle.	2015-10-13 11:47:03 -07:00
Jason Ekstrand	e21ecb841c	anv: Declare/validate the correct API version	2015-10-12 18:25:19 -07:00
Jason Ekstrand	0689a0f0f3	anv/device: Return VK_SUCCESS after setting pCount in QueueFamilyProperties	2015-10-10 15:25:08 -07:00
Kristian Høgsberg Kristensen	fc2a66cfcd	Merge ../mesa into vulkan	2015-10-08 17:20:24 -07:00
Jason Ekstrand	48a87f4ba0	anv/queue: Get rid of the serial This was a remnant of the object tagging implementation we had at one point. We haven't used it for a long time so there's no good reason to keep it around.	2015-10-08 12:16:00 -07:00
Jason Ekstrand	8984559892	vk/0.170.2: Update to the new VK_EXT_KHR_swapchain extensions	2015-10-08 12:11:18 -07:00
Chad Versace	2228ec0112	Merge branch 'vulkan-0.170.2' into vulkan This updates the API from 0.138.2 to 0.170.2, and updates SPIR-V to v32.	2015-10-07 11:49:07 -07:00
Chad Versace	7fa98ab182	vk: Remove temporary vulkan headers Remove vulkan-0.138.2.h and vulkan-0.170.2.h. Their purpose was to aid the header update to 0.170.2.	2015-10-07 11:45:48 -07:00
Chad Versace	2f1ca71360	vk/0.170.2: Bump header version The header is now fully updated.	2015-10-07 11:44:44 -07:00
Chad Versace	c2f94e3a0d	vk/0.170.2: Update C++ errata and typedefs	2015-10-07 11:44:33 -07:00
Chad Versace	0ca3c8480d	vk/0.170.2: Update remaining enums	2015-10-07 11:39:49 -07:00
Chad Versace	f9c948ed00	vk/0.170.2: Update VkResult Version 0.170.2 removes most of the error enums. In many cases, I had to replace an error with a less accurate (or even incorrect) one. In other cases, the error path is replaced with an assertion.	2015-10-07 11:36:51 -07:00
Chad Versace	8dee32e71f	vk/0.170: Update VkDescriptorInfo Ignore the new bufferInfo field with an anv_finishme.	2015-10-07 10:58:55 -07:00
Chad Versace	92e7bd3610	vk/0.170.2: Update vkCreateDescriptorPool Nothing to do. In Mesa the pool is a stub.	2015-10-07 10:47:55 -07:00
Chad Versace	a3bc07c23b	vk/0.170.2: Update VkAttachmentDescription	2015-10-07 10:44:40 -07:00
Chad Versace	82259f88dd	vk/0.170.2: Update VkImageViewCreateInfo	2015-10-07 10:43:44 -07:00
Chad Versace	f4295b3cca	vk/0.170.2: Update VkImageCreateInfo	2015-10-07 10:43:17 -07:00
Chad Versace	d48e71ce55	vk/0.170.2: Update VkPhysicalDeviceProperties	2015-10-07 10:36:46 -07:00
Chad Versace	81e1dcc42c	vk/0.170.2: Update VkImageFormatProperties	2015-10-07 10:28:30 -07:00
Chad Versace	98c2bb6917	vk/0.170.2: Update VkFormatProperties	2015-10-07 10:15:59 -07:00
Chad Versace	545f5cc6e1	vk/0.170.2: Update VkPhysicalDeviceFeatures	2015-10-07 10:09:39 -07:00
Chad Versace	033a37f591	vk/0.170.2: Update VkPhysicalDeviceLimits	2015-10-07 10:09:31 -07:00
Jason Ekstrand	982466aeff	anv/device: Remove some #ifdef'd out code This was a left-over from the dynamic state update.	2015-10-07 09:45:49 -07:00
Jason Ekstrand	010c6efd65	vk/0.170.2: Make vkUpdateDescriptorSets return void	2015-10-07 09:44:53 -07:00
Jason Ekstrand	1a52bc3039	anv/pipeline: Add support for dynamic state in pipelines	2015-10-07 09:40:49 -07:00
Jason Ekstrand	daf68a9465	vk/0.170.2: Switch to the new dynamic state model	2015-10-07 09:40:49 -07:00
Jason Ekstrand	55fcca306b	anv: Add a dynamic state data structure and basic helpers	2015-10-07 09:36:27 -07:00
Jason Ekstrand	941a105954	anv/private: Add a typed_memcpy macro This is amazingly helpful when copying arrays of things around.	2015-10-07 09:36:27 -07:00
Chad Versace	b1c024a932	vk/meta: Fix -Wstrict-prototypes In C, functions with no arguments require a void argument. build_nir_clear_fragment_shader() lacked that. Fixes: anv_meta.c:70:1: warning: function declaration isn't a prototype [-Wstrict-prototypes]	2015-10-07 09:10:25 -07:00
Chad Versace	6dea1a9ba1	vk/0.170.2: Merge VkAttachmentView into VkImageView	2015-10-07 09:10:25 -07:00
Chad Versace	03dd72279f	vk/image: Fix retrieval of anv_surface for depthstencil aspect If anv_image_get_surface_for_aspect_mask() is given a combined depthstencil aspect mask, and the image has a stencil surface but no depth surface, then return the stencil surface. Hacks on hacks.	2015-10-07 09:10:25 -07:00
Chad Versace	85ff3cfde3	vk: Drop -Wextra Eliminates lots of warnings due to anv_meta.c's inclusion of nir.h. I like the extra warnings, and they should probably get fixed. However, git-grep reveals that no other Mesa directory uses -Wextra. Building Vulkan produces a lot of compiler warnings from core Mesa headers that no other Mesa developer sees, and hence no other Mesa developer will fix.	2015-10-07 07:28:46 -07:00
Chad Versace	24de3d49ea	vk: Embed two surface states in anv_image_view This prepares for merging VkAttachmentView into VkImageView. The two surface states are: anv_image_view::color_rt_surface_state: RENDER_SURFACE_STATE when using image as a color render target. anv_image_view::nonrt_surface_state: RENDER_SURFACE_STATE when using image as a non render target. No Crucible regressions.	2015-10-06 21:22:18 -07:00
Chad Versace	37bf120930	vk/pipeline: Emit MSAA finishme only if samples > 1 If samples == 1, then there's nothing for Mesa to do, and the finishme message is only noise.	2015-10-06 21:22:18 -07:00
Chad Versace	3fc2b1f325	vk: Remove stale finishme for stencil image views They don't work completely. But they work well enough to satisfy Crucible.	2015-10-06 21:22:18 -07:00
Chad Versace	44143a1f46	vk: Add anv_image::usage It's a copy of VkImageCreateInfo::usage. Will be used for the VkAttachmentView/VkImageView merge.	2015-10-06 21:22:18 -07:00
Chad Versace	cf603714cb	vk/meta: Fix usage flags for image-wrapped-buffers In make_image_for_buffer(), use VK_IMAGE_USAGE_SAMPLED_BIT when transferring from the buffer and use VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT when transferring to the buffer.	2015-10-06 21:22:18 -07:00
Chad Versace	d00718104f	vk/image: Remove stale anv_asserts for depthstencil attachments We don't fully handle mipmapped, array depthstencil attachments. But we handle them well enough for Crucible's miptree tests.	2015-10-06 21:22:18 -07:00
Kristian Høgsberg Kristensen	1d7ef82f4b	i965: Delete brw_cs.cpp which was deleted in master	2015-10-06 15:20:19 -07:00
Jason Ekstrand	c272bb58f5	nir/spirv: Better texture handling	2015-10-06 15:10:45 -07:00
Jason Ekstrand	ccea9cc332	nir/spirv: Update to SPIR-V Rev. 32	2015-10-06 14:52:35 -07:00
Jason Ekstrand	89eebd889c	vk/0.170.2: Fairly trivial enum shuffling	2015-10-06 14:08:08 -07:00
Jason Ekstrand	1e4263b7d2	vk/0.170.2: s/baseArraySlice/baseArrayLayer/	2015-10-06 14:08:08 -07:00
Chad Versace	d4446a7e58	vk: Merge anv_attachment_view into anv_image_view This prepares for merging VkAttachmentView into VkImageView.	2015-10-06 12:13:03 -07:00
Chad Versace	6b5ce5daf5	vk: Update comments for anv_image_view - Document the extent member. It's the extent of the view's base level. - s/VkAttachmentView/VkImageView/	2015-10-06 12:12:52 -07:00
Jason Ekstrand	19018c9f13	vk/0.170.2: Add a stage field to ShaderCreateInfo	2015-10-06 10:20:10 -07:00
Jason Ekstrand	cc389b1482	vk/0.170.2: Rename cs to stage in ComputePipelineCreateInfo	2015-10-06 10:11:50 -07:00
Jason Ekstrand	588d40e97a	vk/0.170.2: Use ImageSubresourceCopy in ImageResolve	2015-10-06 10:09:47 -07:00
Jason Ekstrand	bd4cde708a	vk/0.170.2: Rename fields in VkClearColorValue	2015-10-06 10:07:47 -07:00
Jason Ekstrand	81c7fa8772	vk/0.170.2: Rework blits to use ImageSubresourceCopy	2015-10-06 10:04:04 -07:00
Jason Ekstrand	ba2254aa79	vulkan.h: Move stuff around This has no functional change but substantially decreases the diff with the 0.170.2 header.	2015-10-06 09:50:04 -07:00
Jason Ekstrand	d1908d2c33	vk/0.170.2: Rework parameters to CmdClearDepthStencil functions	2015-10-06 09:40:39 -07:00
Jason Ekstrand	02a9be31d6	vk/0.170.2: Add the flags parameter to GetPhysicalDeviceImageFormatProperties	2015-10-06 09:37:21 -07:00
Jason Ekstrand	a145acd812	vk/0.170.2: Remove the pCount parameter from AllocDescriptorSets	2015-10-06 09:32:01 -07:00
Jason Ekstrand	8ba684cbad	vk/0.170.2: Rename extension and layer query functions	2015-10-06 09:25:03 -07:00
Jason Ekstrand	a6eba403e2	vk/0.170.2: Update to the new queue family properties query	2015-10-05 21:17:12 -07:00
Jason Ekstrand	65964cd49b	vk/0.170.2: Re-arrange parameters of vkCmdDraw[Indexed]	2015-10-05 21:10:20 -07:00
Jason Ekstrand	05a26a60c8	vk/0.170.2: Make destructors return void	2015-10-05 20:50:51 -07:00
Jason Ekstrand	460676122f	vk/0.170.2: Rename VkClearValue.ds to depthStencil	2015-10-05 20:35:08 -07:00
Jason Ekstrand	8e1ef639b6	vk/0.170.2: Add the subpass field to VkCmdBufferBeginInfo	2015-10-05 20:30:53 -07:00
Jason Ekstrand	757166592e	vk/0.170.2: Rename pointer parameters of VkSubpassDescription	2015-10-05 20:26:21 -07:00
Jason Ekstrand	57f500324b	vk/0.170.2: Add unnormalizedCoordinates to VkSamplerCreateInfo	2015-10-05 20:17:24 -07:00
Jason Ekstrand	f7c3519aaf	vk/0.170.2: Rename VkTexAddress to VkTexAddressMode	2015-10-05 20:15:06 -07:00
Jason Ekstrand	39a19e88a3	vulkan.h: Various cosmetic changes These don't affect the driver in any way.	2015-10-05 20:06:30 -07:00
Chad Versace	9357062348	vk: Merge anv_*_attachment_view into anv_attachment_view Remove anv_color_attachment_view and anv_depth_stencil_view, merging them into anv_attachment_view. This prepares for merging VkAttachmentView into VkImageView.	2015-10-05 17:46:04 -07:00
Chad Versace	ae30535602	vk: Drop anv_attachment_view::extent It's duplicated by anv_attachment_view::image_view::extent.	2015-10-05 17:46:04 -07:00
Chad Versace	f0f4dfa9cc	vk: Drop anv_surface_view Push the members of struct anv_surface_view into anv_image_view and anv_buffer_view, then remove struct anv_surface_view. Observe that anv_surface_view::range is not needed for anv_image_view, and so was dropped there. This prepares for the merge of VkAttachmentView into VkImageView. Removing the common parent of anv_buffer_view and anv_image_view (that is, anv_surface_view) will make the merge easier.	2015-10-05 17:46:04 -07:00
Chad Versace	74193a880f	vk: Use consistent names for anv__view variables Rename all anv__view variables to follow this convention: - sview -> anv_surface_view - bview -> anv_buffer_view - iview -> anv_image_view - aview -> anv_attachment_view - cview -> anv_color_attachment_view - ds_view -> anv_depth_stencil_attachment_view This clarifies existing code. And it will reduce noise in the upcoming commits that merge VkAttachmentView into VkImageView.	2015-10-05 17:46:04 -07:00
Chad Versace	ffd051830d	vk: Unionize anv_descriptor For a given struct anv_descriptor, all members are NULL (in which case the descriptor is empty) or exactly one member is non-NULL. To make struct anv_descriptor better reflect its set of valid states, convert the struct into a tagged union.	2015-10-05 17:46:04 -07:00
Chad Versace	63439953d7	vk: Drop dependency on no longer extant header anv_meta no longer uses GLSL shaders, and the build system no longer converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from Makefile.am. (cherry picked from commit `2fc8122f66`)	2015-10-05 17:06:19 -07:00
Chad Versace	2fc8122f66	vk: Drop dependency on no longer extant header anv_meta no longer uses GLSL shaders, and the build system no longer converts them to SPIR-V. So remove anv_meta_spirv_autogen.h from Makefile.am.	2015-10-05 17:04:18 -07:00
Chad Versace	8bf021cf3d	vk: Return anv_image_view_info by value The struct is only 2 bytes. Returning it on the stack is better than returning a reference into the ELF .data segment.	2015-10-05 13:22:44 -07:00
Chad Versace	4ffb4549e0	vk/image: Document a Vulkan spec requirement for depthstencil The Vulkan spec (git a511ba2) requires support for some combined depth stencil formats.	2015-10-05 13:18:44 -07:00
Chad Versace	3530224063	vk: Annotate anv_cmd_state::gen7::index_type It's the value of 3DSTATE_INDEX_BUFFER.IndexFormat.	2015-10-05 08:58:35 -07:00
Chad Versace	9c93aa9141	vk: Better types for VkShaderStage, VkShaderStageFlags vars In most places, the variable type was the uninformative uint32_t.	2015-10-05 08:55:09 -07:00
Chad Versace	6317c3144d	vk/0.170.2: Drop VK_BUFFER_USAGE_GENERAL	2015-10-05 08:12:59 -07:00
Chad Versace	4744f60e79	vk/0.170.2: Drop enum VkBufferViewType	2015-10-05 08:12:58 -07:00
Chad Versace	7a089bd1a6	vk/0.170.2: Update VkImageSubresourceRange Replace 'aspect' with 'aspectMask'.	2015-10-05 08:10:57 -07:00
Chad Versace	568654d606	vk/0.170.2: Drop VK_IMAGE_USAGE_GENERAL	2015-10-05 08:09:33 -07:00
Chad Versace	6a40af1b08	vk/0.170.2: Update VkPipelineMultisampleStateCreateInfo	2015-10-04 10:00:25 -07:00
Chad Versace	dd04be491d	vk/0.170.2: Update VkPipelineDepthStencilStateCreateInfo Rename member depthBoundsEnable -> depthBoundsTestEnable.	2015-10-04 09:41:46 -07:00
Chad Versace	8cb2e27c62	vk/0.170.2: Update VkRenderPassBeginInfo Rename members: attachmentCount -> clearValueCount pAttachmentClearValues -> pClearValues	2015-10-04 09:26:25 -07:00
Chad Versace	3694518be5	vk/0.170.2: Drop VkBufferViewCreateInfo::viewType	2015-10-04 09:14:57 -07:00
Chad Versace	216d9f248d	vk: Copy current header to vulkan-0.138.2.h While upgrading Mesa to the new 0.170.2 API, it's convenient to have all three headers available in the tree: - vulkan-0.138.2.h, the old one - vulkan-0.170.2.h, the new one - vulkan.h, the one in transition	2015-10-04 09:09:35 -07:00
Chad Versace	7f18ed4b9f	vk: Import 0.170.2 header from LunarG SDK From the LunarG SDK at tag sdk-0.9.1, import vulkan.h as vulkan-0.170.2.h. This header is the first provisional header with the addition of minor fixes.	2015-10-04 09:09:31 -07:00
Jason Ekstrand	09ba0a7c05	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-03 11:32:29 -07:00
Jason Ekstrand	ef56cf7738	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-10-02 16:52:47 -07:00
Jason Ekstrand	10f97718c3	anv/allocator: Add a sanity assertion in state stream finish. We assert that the block offset we got while walking the list of blocks is actually a multiple of the block size. If something goes wrong and the GPU decides to stomp on the surface state buffer we can end up getting corruptions in our list of blocks. This assertion makes such corruptions a crash with a meaningful message rather than an infinite loop.	2015-10-02 16:24:42 -07:00
Jason Ekstrand	002e7b0cc3	anv: Remove the GLSL -> SPIR-V scraper/converter This was very useful to get us up-and-going. However, now that we can use NIR directly for meta shaders, we don't need this anymore and we might as well drop the glslc dependency.	2015-10-02 16:20:04 -07:00
Jason Ekstrand	f5ffb0e0cb	anv/meta: Use NIR directly for blit shaders	2015-10-02 16:18:44 -07:00
Jason Ekstrand	7851a4392a	anv/meta: Use NIR directly for clear shaders	2015-10-02 16:18:32 -07:00
Jason Ekstrand	add99c4beb	anv: Add a back-door for passing NIR shaders directly into the pipeline This will allow us to use NIR directly for meta operations rather than having to go through SPIR-V.	2015-10-02 16:16:57 -07:00
Jason Ekstrand	b68805f83c	anv: Add some NIR builder helpers These should all eventually be up-streamed. However, since they currently have no upstream users, they would just bitrot there. We'll keep them local for the time being.	2015-10-02 16:15:53 -07:00
Jason Ekstrand	c1553653a2	vk/wsi/x11: Send OUT_OF_DATE if the X drawable goes away	2015-10-02 13:44:53 -07:00
Kristian Høgsberg Kristensen	005c8e0106	Merge branch 'master' of ../mesa into vulkan	2015-10-01 14:24:29 -07:00
Jason Ekstrand	337caee910	anv/wsi_x11: Properly report BadDrawable errors to the client	2015-09-28 20:18:41 -07:00
Jason Ekstrand	f06bc45b0c	anv/batch_chain: Use the surface state pool for binding tables	2015-09-28 16:01:14 -07:00
Jason Ekstrand	d93f6385a7	anv/batch_chain: Add helpers for fixing up block_pool relocations	2015-09-28 16:01:14 -07:00
Jason Ekstrand	8c00f9ab56	anv/gen8: Do a render cache flush prior to changing state base address	2015-09-28 16:01:14 -07:00
Jason Ekstrand	0e94446b25	anv/device: Use a 4K block size for surface state blocks We want to start using the surface state block pool for binding tables. In order to do this, we need to be able to set surface state base address to the address of a block and surface state base address has a 4K alignment requirement.	2015-09-28 16:01:01 -07:00
Jason Ekstrand	737e89bc8d	anv/meta: Use the dynamic state stream for temporary buffers	2015-09-28 16:01:01 -07:00
Jason Ekstrand	219a1929f7	anv/util: Add helpers for getting the first and last elements of a vector	2015-09-28 16:01:01 -07:00
Jason Ekstrand	95487668df	anv/batch_chain: Add a _alloc_binding_table function	2015-09-28 16:01:01 -07:00
Jason Ekstrand	d517de6126	anv: Make anv_state.offset an int32_t Binding tables will have a negative offset and we need a way to express that. Besides, the chances of a state offset being larger than 2 GB is so remote it's not worth thinking about.	2015-09-28 16:01:01 -07:00
Jason Ekstrand	9ac3dde3a0	anv/wsi_wayland: Fix FIFO mode Previously, there were a number of things we were doing wrong: 1) We weren't flushing the wl_display so dead-looping clients weren't guaranteed to work. 2) We were sending the frame event after calling wl_surface.commit() so it wasn't getting assigned to the correct frame 3) We weren't actually setting fifo_ready to false. Unfortunately, we never noticed because (3) was hiding the other two. This commit fixes all three and clients that use FIFO mode are now properly refresh-rate limited.	2015-09-28 15:58:34 -07:00
Chad Versace	ddcedb979a	vk: Implement vkGetPhysicalDeviceImageFormatProperties() The implementation is incomplete because we lie about VkImageFormatProperties::maxResourceSize, hardcoding it to UINT32_MAX for all supported cases.	2015-09-28 11:53:39 -07:00
Chad Versace	9f3122db0e	vk: Refactor anv_GetPhysicalDeviceFormatProperties() Move the bulk of the function body to a new function anv_physical_device_get_format_properties(). This allows us to reuse the function when implementing anv_GetPhysicalDeviceImageFormatProperties() without calling into the public entry point.	2015-09-28 11:53:39 -07:00
Chad Versace	c15ce5c834	vk: Advertise that depthstencil formats support sampling Let vkGetPhysicalDeviceFormatProperties() set VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT for tiled depthstencil images.	2015-09-28 11:53:39 -07:00
Jason Ekstrand	4e48f94469	anv/device: Wrap a couple valgrind calls in the VG macro This fixes the build for systems that don't have valgrind devel packages installed.	2015-09-28 11:18:52 -07:00
Chad Versace	97636345da	vk: Fix vkGetPhysicalDeviceSparseImageFormatProperties() The driver does not yet support sparse images, so return zero properties for all formats.	2015-09-28 10:17:48 -07:00
Kristian Høgsberg Kristensen	164f08c255	vk: Add anv_icd.json to .gitignore	2015-09-25 15:16:56 -07:00
Kristian Høgsberg Kristensen	850cfcad3e	vk: Also define vk_errorf in non-debug builds	2015-09-25 15:15:37 -07:00
Kristian Høgsberg Kristensen	cf24211d55	vk: Roll back GLSL parser support for vulkan In the interest of reducing our delta to mesa master, let's undo these changes now that we only support SPIR-V.	2015-09-25 10:42:07 -07:00
Jason Ekstrand	e9dff5bb99	vk: Add an ICD declaration file	2015-09-24 14:45:58 -07:00
Jason Ekstrand	39cd3783a4	anv: Add support for the ICD loader	2015-09-24 14:45:58 -07:00
Jason Ekstrand	a95f51c1d7	anv: Add a global dispatch table for use in meta operations	2015-09-24 14:45:58 -07:00
Jason Ekstrand	00d18a661f	anv/entrypoints: Expose the anv_resolve_entrypoint function	2015-09-24 14:45:58 -07:00
Jason Ekstrand	f5e72695e0	anv/entrypoints: Rename anv_layer to anv_dispatch_table	2015-09-24 14:45:58 -07:00
Jason Ekstrand	913a9b76f7	anv/batch_chain: Remove the current_surface_bo helper It's no longer used outside anv_batch_chain so we certainly don't need to be exporting. Inside anv_batch_chain, it's only used twice and it can be replaced by a single line so there's really no point.	2015-09-24 08:46:41 -07:00
Jason Ekstrand	bc17f9c9d7	anv/cmd_buffer: Add a helper for getting the surface state base address	2015-09-24 08:42:38 -07:00
Jason Ekstrand	e1a7c721d3	anv/allocator: Don't ever call mremap This has always been a bit sketchy and neither Kristian nor I have ever really liked it.	2015-09-24 08:42:14 -07:00
Jason Ekstrand	99e62f5ce8	anv/allocator: Delete the unused center_fd_offset from anv_block_pool	2015-09-24 08:41:56 -07:00
Jason Ekstrand	429665823d	anv/allocator: Do a better job of centering bi-directional block pools	2015-09-24 08:41:47 -07:00
Jason Ekstrand	76be58efce	anv/batch_chain: Clean up the reloc list swapping code	2015-09-24 08:41:38 -07:00
Jason Ekstrand	041f5ea089	anv/meta: Add location specifiers to meta shaders	2015-09-21 16:21:56 -07:00
Jason Ekstrand	f406b708a5	Merge branch 'nir-spirv' into vulkan	2015-09-17 20:03:40 -07:00
Jason Ekstrand	616db92b01	nir/spirv: Add better location handling Previously, our location handling was focussed on either no location (usually implicit 0) or a builtin. Unfortunately, if you gave it a location, it would blow it away and just not care. This worked fine with crucible and our meta shaders but didn't work with the CTS. The new code uses the "data.explicit_location" field to denote that it has a "final" location (usually from a builtin) and, otherwise, the location is considered to be relative to the base for that shader stage.	2015-09-17 20:02:46 -07:00
Jason Ekstrand	a788e7c659	anv/device: Move mutex initialization to before block pools	2015-09-17 18:23:21 -07:00
Jason Ekstrand	595e6cacf1	meta: Initial support for packing parameters Probably incomplete but it should do for now	2015-09-17 18:21:05 -07:00
Jason Ekstrand	d616493953	anv/meta: Pass the depth through the clear vertex shader It shouldn't matter since we shut off the VS but it's at least clearer.	2015-09-17 18:09:21 -07:00
Jason Ekstrand	3b8aa26b8e	anv/formats: Properly report depth-stencil formats	2015-09-17 17:44:20 -07:00
Jason Ekstrand	b5f6889648	vk/device: Don't allow device or instance creation with invalid extensions	2015-09-17 17:44:20 -07:00
Jason Ekstrand	dcf424c98c	anv/tests: Add some asserts for data integrity in block_pool_no_free	2015-09-17 17:44:20 -07:00
Jason Ekstrand	5f57ff7e18	anv/allocator: Make the block pool double-ended This allows us to allocate from either side of the block pool in a consistent way. If you use the previous block_pool_alloc function, you will get offsets from the start of the pool as normal. If you use the new block_pool_alloc_back function, you will get a negative index that corresponds to something in the "back" of the pool.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	15624fcf55	anv/tests: Refactor the block_pool_no_free test This simply breaks the monotonicity check out into its own function	2015-09-17 17:44:20 -07:00
Jason Ekstrand	55daed947d	vk/allocator: Split block_pool_alloc into two functions	2015-09-17 17:44:20 -07:00
Jason Ekstrand	c55fa89251	anv/allocator: Use a signed 32-bit offset for the free list This has the unfortunate side-effect of making it so that we can't have a block pool bigger than 1GB. However, that's unlikely to happen and, for the sake of bi-directional block pools, we need negative offsets.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	8c6bc1e85d	anv/allocator: Create 2GB memfd up-front for the block pool	2015-09-17 17:44:20 -07:00
Jason Ekstrand	74bf7aa07c	anv/allocator: Take the device mutex when growing a block pool We don't have any locking issues yet because we use the pool size itself as a mutex in block_pool_alloc to guarantee that only one thread is resizing at a time. However, we are about to add support for growing the block pool at both ends. This introduces two potential races: 1) You could have two block_pool_alloc() calls that both try to grow the block pool, one from each end. 2) The relocation handling code will now have to think about not only the bo that we use for the block pool but also the offset from the start of that bo to the center of the block pool. It's possible that the block pool growing code could race with the relocation handling code and get a bo and offset out of sync. Grabbing the device mutex solves both of these problems. Thanks to (2), we can't really do anything more granular.	2015-09-17 17:44:20 -07:00
Jason Ekstrand	222ddac810	anv: Document the index and offset parameters of anv_bo	2015-09-17 17:44:20 -07:00
Chad Versace	85520aa070	vk/image: Remove stale FINISHME for non-2D image views gen8_image_view_init() now supports 1D, 2D, and 3D image views.	2015-09-14 15:16:57 -07:00
Chad Versace	622a317e4c	vk/image: Teach vkCreateImage about layout of 1D surfaces Calling vkCreateImage() with VK_IMAGE_TYPE_1D now succeeds and computes the surface layout correctly.	2015-09-14 15:15:12 -07:00
Chad Versace	6221593ff8	vk/meta: Partially implement vkCmdCopy, vkCmdBlit for 3D images Partially implement the below functions for 3D images: vkCmdCopyBufferToImage vkCmdCopyImageToBuffer vkCmdCopyImage vkCmdBlitImage Not all features work, and there is much room for performance improvement. Beware that vkCmdCopyImage and vkCmdBlitImage are untested. Crucible proves that vkCmdCopyBufferToImage and vkCmdCopyImageToBuffer work, though. Supported: - copy regions with z offset Unsupported: - copy regions with extent.depth > 1 Crucible test results on master@d452d2b are: pass: func.miptree.r8g8b8a8-unorm..view-3d. pass: func.miptree.d32-sfloat..view-3d. fail: func.miptree.s8-uint..view-3d.	2015-09-14 14:27:34 -07:00
Chad Versace	0ecafe0285	vk/meta: Rename meta_emit_blit() params Rename src -> src_view and dest -> dest_view. This reduces noise in the next patch's diff, which adds new params to the function.	2015-09-14 12:29:51 -07:00
Chad Versace	b659a066e9	vk/gen8: Set RENDER_SURFACE_STATE::RenderTargetViewExtent	2015-09-14 12:29:49 -07:00
Chad Versace	ffa61e1572	vk/gen8: Refactor setting of SURFACE_STATE::Depth The field's meaning depends on SURFACE_STATE::SurfaceType. Make that correlation explicit by switching on VkImageType. For good measure, add some PRM quotes too.	2015-09-14 12:27:05 -07:00
Chad Versace	eed74e3a02	vk: Teach vkCreateImage about layout of 3D surfaces Calling vkCreateImage() with VK_IMAGE_TYPE_3D now succeeds and computes the surface layout correctly. However, 3D images do not yet work for many other Vulkan entrypoints.	2015-09-14 11:04:08 -07:00
Chad Versace	e01d5a0471	vk: Refactor anv_image_make_surface() Move the code that calculates the layout of 2D surfaces into a switch case.	2015-09-14 11:00:18 -07:00
Jason Ekstrand	8c8ad6dddf	vk: Use push constants for dynamic buffers	2015-09-11 15:56:19 -07:00
Jason Ekstrand	2b4a2eb592	vk/compiler: Rework create_params_array	2015-09-11 15:55:54 -07:00
Jason Ekstrand	c3086c54a8	vk/compiler: Add a NIR pass for pushing dynamic buffer offset This commit just adds the NIR pass but does none of the uniform setup	2015-09-11 15:53:56 -07:00
Jason Ekstrand	7487371056	vk/pipeline_layout: Add dynamic_offset_start and has_dynamic_offsets fields	2015-09-11 15:52:43 -07:00
Jason Ekstrand	de5220c7ce	vk/pipeline_layout: Move surface/sampler start from SoA to AoS This makes more sense to me and it's more consistent with anv_descriptor_set_layout.	2015-09-11 10:43:55 -07:00
Jason Ekstrand	b908c67816	vk: Rework the push constants data structure Previously, we simply had a big blob of stuff for "driver constants". Now, we have a very specific data structure that contains the driver constants that we care about.	2015-09-11 10:25:23 -07:00
Jason Ekstrand	fd21f0681a	Add the wayland protocol files to .gitignore	2015-09-11 09:29:40 -07:00
Jason Ekstrand	8040dc4ca5	vk/error: Handle ERROR_OUT_OF_DATE_WSI	2015-09-08 12:13:07 -07:00
Jason Ekstrand	060720f0c9	vk/wsi/x11: Actually block on X so we don't re-use busy buffers	2015-09-08 11:51:47 -07:00
Jason Ekstrand	1bee19e023	vk: Add the WSI header files	2015-09-08 10:33:46 -07:00
Jason Ekstrand	2f3de6260d	Merge branch 'nir-spirv' into vulkan	2015-09-05 14:12:59 -07:00
Jason Ekstrand	4d73ca3c58	nir/spirv.h: Remove some cruft missed while merging There were merge conflicts in spirv.h that got missed because they were in a comment and so it still compiled. This gets rid of them and we should be on-par with upstream spirv->nir.	2015-09-05 14:11:40 -07:00
Jason Ekstrand	612b13aeae	nir/spirv: Add support for most of the rest of texturing Assuming this all works, about the only thing left should be some corner-cases for tg4	2015-09-05 14:10:05 -07:00
Jason Ekstrand	fe786ff67d	Merge branch 'nir-spirv' into vulkan	2015-09-05 13:17:53 -07:00
Jason Ekstrand	35fcd37fcf	nir/spirv: Handle decorations after assigning variable locations	2015-09-05 13:17:21 -07:00
Jason Ekstrand	87d02f515b	Merge branch 'nir-spirv' into vulkan	2015-09-05 09:48:33 -07:00
Jason Ekstrand	9be43ef99c	nir/spirv: Handle the MatrixStride member decoration	2015-09-05 09:47:45 -07:00
Jason Ekstrand	01924a03d4	vk: Actually link in wayland libraries Turns out this was why I had accidentally broken the universe. Oops...	2015-09-04 20:02:38 -07:00
Jason Ekstrand	2c4ae00db6	vk: Conditionally compile Wayland support Pulling in libwayland causes undefined symbols in applications that are linked against vulkan alone. Ideally, we would like to dlopen a platform support library or something like that. For now, this works and should get crucible running again.	2015-09-04 19:18:52 -07:00
Jason Ekstrand	b3c037f329	vk: Fix size return value handling in a couple places	2015-09-04 19:05:51 -07:00
Jason Ekstrand	9a95d08ed6	Merge branch 'nir-spirv' into vulkan	2015-09-04 18:54:15 -07:00
Jason Ekstrand	6d5dafd779	nir/spirv/glsl450: Use the correct write mask	2015-09-04 18:50:14 -07:00
Jason Ekstrand	7174d155e9	nir: Add a lower_fdiv option and use it in i965	2015-09-04 18:50:14 -07:00
Jason Ekstrand	f32d16a9f0	nir/spirv: Use the actual GLSL 450 extension header from Khronos	2015-09-04 18:50:14 -07:00
Jason Ekstrand	9e2c13350e	nir/spirv: Add support for SpvDecorationColMajor	2015-09-04 18:50:14 -07:00
Jason Ekstrand	f3bdb93a8e	nir/types: Allow single-column matrices This can sometimes be a convenient way to build vectors.	2015-09-04 18:50:14 -07:00
Jason Ekstrand	48e87c0163	vk/wsi: Add Wayland WSI support	2015-09-04 17:55:42 -07:00
Jason Ekstrand	348cb29a20	vk/wsi: Move to a callback system for the entire WSI implementation We do this for two reasons: First, because it allows us to simplify WSI and compiling in/out support for a particular platform is as simple as calling or not calling the platform-specific init function. Second, the implementation gives us a place for a given chunk of the WSI to stash stuff in the instance.	2015-09-04 17:55:42 -07:00
Jason Ekstrand	06d8fd5881	vk/instance: Expose anv_instance_alloc/free	2015-09-04 17:55:42 -07:00
Jason Ekstrand	c0b97577e8	vk/WSI: Use a callback mechanism instead of explicit switching	2015-09-04 17:55:42 -07:00
Jason Ekstrand	ca3cfbf6f1	vk: Add an initial implementation of the actual Khronos WSI extension Unfortunately, this is a very large commit and removes the old LunarG WSI extension. This is because there are a couple of entrypoints that have the same name between the two extensions so implementing them both is impractical. Support is still incomplete, but this is enough to get vkcube up and going again.	2015-09-04 17:55:42 -07:00
Jason Ekstrand	3d9fbb6575	vk: Add initial support for VK_WSI_swapchain	2015-09-04 17:55:42 -07:00
Jason Ekstrand	beb466ff5b	vk: Move anv_x11.c to anv_wsi_x11.c	2015-09-04 17:55:42 -07:00
Jason Ekstrand	9a7600c9b5	vk/device: Use an array for device extensions	2015-09-04 17:55:42 -07:00
Kristian Høgsberg Kristensen	8af3624651	vk: Further reduce diff to master Now that we don't compile GLSL, we can roll back a few more hacks and unexport some things from the backend compiler. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-04 16:17:01 -07:00
Kristian Høgsberg Kristensen	7c1d20dc48	vk: Drop GLSL code from anv_compiler.cpp Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 14:02:11 -07:00
Kristian Høgsberg Kristensen	316c8ac53b	vk: Assert that the SPIR-V module has the magic number Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 12:27:28 -07:00
Kristian Høgsberg Kristensen	6e35a1f166	vk: Remove various hacks/scaffolding code Since we switched away from calling brwCreateContext() there's a bit of hacky support we can now delete. This reduces our diff to upstream master. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 12:17:13 -07:00
Kristian Høgsberg Kristensen	1d787781ff	vk: Fall back to previous gens in entry point resolver We used to always just do a one-level fallback from genX_* to anv_* entry points. That worked for gen7 and gen8 where all entry points were either different or could be made anv_* entry points (eg anv_CreateDynamicViewportState). We're about to add gen9 and now need to be able to fall back to gen8 entry points for most things. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	c4dbff58d8	vk: Drop redundant gen7_CreateGraphicsPipelines This is handled by anv_CreateGraphicsPipelines(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	b5e90f3f48	vk: Use vk* entrypoints in meta, not driver_layer pointers We'll change the dispatch mechanism again in a later commit. Stop using the driver_layer function pointers and just use the public entry points. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:09 -07:00
Kristian Høgsberg Kristensen	82396a5514	vk: Drop check for I915_PARAM_HAS_EXEC_CONSTANTS We don't use this kernel feature. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen	c4b30e7885	vk: Add new vk_errorf that takes a format string This allows us to annotate error cases in debug builds. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Kristian Høgsberg Kristensen	2e346c882d	vk: Make vk_error a little more helpful Print out file and line number and translate the error code to the symbolic name. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-09-03 11:53:08 -07:00
Chad Versace	0cb26523d3	vk/image: Add PRM reference for QPitch equation Suggested-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-03 11:04:38 -07:00
Chad Versace	28503191f1	vk/meta: Partially fix vkCmdCopyBufferToImage for S8_UINT Create R8_UINT VkAttachmentView and VkImageView for the stencil data. This fixes a crash, but the pixels in the destination image are still incorrect. They are not properly tiled. Fixes crashes in Crucible tests func.miptree.s8-uint.aspect-stencil.* as of crucible-7471449. Test results improve 'lost' -> 'fail'.	2015-09-02 11:08:36 -07:00
Jason Ekstrand	be0a4da6a5	vk/meta: Use SPIR-V for shaders We are also now using glslc for compiling the Vulkan driver like we do in crucible.	2015-09-01 15:16:06 -07:00
Jason Ekstrand	362ab2d788	vk/compiler: Handle interpolation qualifiers for SPIR-V shaders	2015-09-01 15:15:04 -07:00
Jason Ekstrand	126ade0023	vk/extensions: count needs to be <= number of extensions	2015-09-01 12:28:50 -07:00
Jason Ekstrand	0c2d476935	vk/compiler: Properly reference/delete programs when using SPIR-V	2015-09-01 12:28:50 -07:00
Jason Ekstrand	16ebe883a4	vk/meta: Add a helper for making an image from a buffer	2015-08-31 21:54:38 -07:00
Jason Ekstrand	86c3476668	nir/spirv: Use VERTEX_ID_ZERO_BASE for VertexId In Vulkan, VertexId and InstanceId will be zero-based and new intrinsics, VertexIndex and InstanceIndex, will be added for non-zero-based. See also, Khronos bug #14255	2015-08-31 17:16:49 -07:00
Jason Ekstrand	6350c97412	Merge remote-tracking branch 'fdo-personal/nir-spirv' into vulkan From now on, the majority of SPIR-V improvements should happen on the spirv branch which will also be public. It will be frequently merged into the vulkan driver.	2015-08-31 17:14:47 -07:00
Jason Ekstrand	22fdb2f855	nir/spirv: Update to the latest revision	2015-08-31 17:05:23 -07:00
Jason Ekstrand	ce70cae756	nir/builder: Use nir_after_instr to advance the cursor This should ensure that the cursor gets properly advanced in all cases. We had a problem before where, if the cursor was created using nir_after_cf_node on a non-block cf_node, that would call nir_before_block on the block following the cf node. Instructions would then get inserted in backwards order at the top of the block which is not at all what you would expect from nir_after_cf_node. By just resetting to after_instr, we avoid all these problems.	2015-08-31 17:05:23 -07:00
Jason Ekstrand	24b0c53231	nir/intrinsics: Move to a two-dimensional binding model for UBO's	2015-08-31 17:05:23 -07:00
Jason Ekstrand	f4608bc530	nir/nir_variable: Add a descriptor set field We need this for SPIR-V	2015-08-31 17:05:23 -07:00
Jason Ekstrand	85cf2385c5	mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h It is a shader enum after all...	2015-08-31 17:05:23 -07:00
Jason Ekstrand	de4f379a70	nir/cursor: Add a helper for getting the current block	2015-08-31 17:05:23 -07:00
Connor Abbott	024c49e95e	nir/builder: add a nir_fdot() convenience function	2015-08-31 17:05:23 -07:00
Jason Ekstrand	f6a0eff1ba	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves.	2015-08-31 17:05:23 -07:00
Jason Ekstrand	4956bbaa33	nir/cursor: Add a constructor for the end of a block but before the jump	2015-08-31 16:58:20 -07:00
Connor Abbott	c62be38286	nir/types: add more nir_type_is_xxx() wrappers	2015-08-31 16:58:20 -07:00
Connor Abbott	a1e136711b	nir/types: add a helper to transpose a matrix type	2015-08-31 16:58:20 -07:00
Jason Ekstrand	756b00389c	nir/spirv: Don't assert that the current block is empty It's possible that someone will give us SPIR-V code in which someone needlessly branches to new blocks. We should handle that ok now.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	fe220ebd37	nir/spirv: Add initial support for samplers	2015-08-31 16:58:20 -07:00
Jason Ekstrand	a992909aae	nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops NIR doesn't have the native opcodes for them anymore	2015-08-31 16:58:20 -07:00
Jason Ekstrand	45963c9c64	nir/types: Add support for sampler types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2887e68f36	nir/spirv: Make the global constants in spirv.h static I've been promised in a bug that this will be fixed in a future version of the header. However, in the interest of my branch building, I'm adding these changes in myself for the moment.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	62b094a81c	nir/spirv: Handle jump-to-loop in a more general way	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ca51d926fd	nir/spirv: Handle boolean uniforms correctly	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b6562bbc30	nir/spirv: Handle control-flow with loops	2015-08-31 16:58:20 -07:00
Jason Ekstrand	4a63761e1d	nir/spirv: Set a name on temporary variables	2015-08-31 16:58:20 -07:00
Jason Ekstrand	6fc7911d15	nir/spirv: Use the correct length for copying string literals	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9da6d808be	nir/spirv: Make vtn_ssa_value handle constants as well as ssa values	2015-08-31 16:58:20 -07:00
Jason Ekstrand	1feeee9cf4	nir/spirv: Add initial support for GLSL 4.50 builtins	2015-08-31 16:58:20 -07:00
Jason Ekstrand	577c09fdad	nir/spirv: Split the core datastructures into a header file	2015-08-31 16:58:20 -07:00
Jason Ekstrand	66fc7f252f	nir/spirv: Use the builder for all instructions We don't actually use it to create all the instructions but we do use it for insertion always. This should make things far more consistent for implementing extended instructions.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9e03b6724c	nir/spirv: Add support for a bunch of ALU operations	2015-08-31 16:58:20 -07:00
Jason Ekstrand	91b3b46d8b	nir/spirv: Add support for indirect array accesses	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9197e3b9fc	nir/spirv: Explicitly type constants and SSA values	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b7904b8281	nir/spirv: Handle OpBranchConditional We do control-flow handling as a two-step process. The first step is to walk the instructions list and record various information about blocks and functions. This is where the actual nir_function_overload objects get created. We also record the start/stop instruction for each block. Then a second pass walks over each of the functions and over the blocks in each function in a way that's NIR-friendly and actually parses the instructions.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	d216dcee94	nir/spirv: Add a helper for getting a value as an SSA value	2015-08-31 16:58:20 -07:00
Jason Ekstrand	f36fabb736	nir/spirv: Split instruction handling into preamble and body sections	2015-08-31 16:58:20 -07:00
Jason Ekstrand	7bf4b53f1c	nir/spirv: Implement load/store instructions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	7d64741a5e	nir: Add a helper for getting the tail of a deref chain	2015-08-31 16:58:20 -07:00
Jason Ekstrand	112c607216	nir/spirv: Actually add variables to the function or shader	2015-08-31 16:58:20 -07:00
Jason Ekstrand	4fa1366392	nir/spirv: Add a vtn_untyped_value helper	2015-08-31 16:58:20 -07:00
Jason Ekstrand	e709a4ebb8	nir/spirv: Use vtn_value in the types code and fix an off-by-one error	2015-08-31 16:58:20 -07:00
Jason Ekstrand	67af6c59f2	nir/types: Add an is_vector_or_scalar helper	2015-08-31 16:58:20 -07:00
Jason Ekstrand	5e6c5e3c8e	nir/spirv: Add support for deref chains	2015-08-31 16:58:20 -07:00
Jason Ekstrand	366366c7f7	nir/types: Add a scalar type constructor	2015-08-31 16:58:20 -07:00
Jason Ekstrand	befecb3c55	nir/spirv: Add support for OpLabel	2015-08-31 16:58:20 -07:00
Jason Ekstrand	399e962d25	nir/spirv: Add support for declaring functions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ac4d459aa2	nir/types: Add accessors for function parameter/return types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	3a266a18ae	nir/spirv: Add support for declaring variables Deref chains and variable load/store operations are still missing.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2494055631	nir/spirv: Add support for constants	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2a023f30a6	nir/spirv: Add basic support for types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	5bb94c9b12	nir/types: Add more helpers for creating types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	53bff3e445	glsl/types: Expose the function_param and struct_field structs to C Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find them. This commit simply moves the #ifdef and adds #ifdef's around constructors.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	0db3e4dd72	glsl/types: Add support for function types	2015-08-31 16:58:20 -07:00
Jason Ekstrand	1169fcdb05	glsl: Add GLSL_TYPE_FUNCTION to the base types enums	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b79916dacc	nir/spirv: Rework the way values are added Instead of having functions to add values and set various things, we just have a function that does a few asserts and then returns the value. The caller is then responsible for setting the various fields.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	ac60aba351	nir/spirv: Add stub support for extension instructions	2015-08-31 16:58:20 -07:00
Jason Ekstrand	78eabc6153	REVERT: Add a simple helper program for testing SPIR-V -> NIR translation	2015-08-31 16:58:20 -07:00
Jason Ekstrand	2c585a722d	glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp	2015-08-31 16:58:20 -07:00
Jason Ekstrand	b20d9f5643	nir: Add the start of a SPIR-V to NIR translator At the moment, it can handle the very basics of strings and can ignore debug instructions. It also has basic support for decorations.	2015-08-31 16:58:20 -07:00
Jason Ekstrand	9d92b4fd0e	nir: Import the revision 30 SPIR-V header from Khronos	2015-08-31 16:58:20 -07:00
Jason Ekstrand	0af4bf4d4b	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-31 16:30:07 -07:00
Jason Ekstrand	9f9628e9dd	vk/SPIR-V: Pull num_uniform_components out of the NIR shader	2015-08-28 22:31:03 -07:00
Jason Ekstrand	44e6ea74b0	spirv: lower outputs to temporaries	2015-08-28 17:38:41 -07:00
Jason Ekstrand	9cebdd78d8	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves.	2015-08-28 17:38:41 -07:00
Jason Ekstrand	5e7c7b2a4e	spirv: Only do a block load if you're actually loading a uniform	2015-08-28 16:17:45 -07:00
Jason Ekstrand	98abed2441	spirv: Use VERTEX_ID_ZERO_BASE for vertex id	2015-08-28 16:08:29 -07:00
Jason Ekstrand	dbc3eb5bb4	vk/compiler: Pass the correct is_scalar value to brw_process_nir	2015-08-28 12:13:17 -07:00
Jason Ekstrand	ea56d0cb1d	glsl/types: Fix up function type hash table insertion	2015-08-28 12:00:25 -07:00
Chad Versace	a2d15ee698	vk/meta: Support stencil in vkCmdCopyImageToBuffer At Crucible commit 12e64a4, fixes the func.depthstencil.stencil-triangles.* tests on Broadwell.	2015-08-28 08:41:21 -07:00
Chad Versace	84cfc08c10	vk/pipeline: Fix crash when the pipeline has no attributes If there are no attributes, don't emit 3DSTATE_VERTEX_ELEMENTS. That packet does not allow 0 attributes.	2015-08-28 08:07:15 -07:00
Chad Versace	053d32d2a5	vk/image: Linear stencil buffers are illegal The hardware requires that stencil buffer memory be W-tiled. From the Sandybridge PRM: This buffer is supported only in Tile W memory.	2015-08-28 08:04:59 -07:00
Chad Versace	14e1d58fb7	vk: Fix stride of stencil buffers Stencil buffers have strange pitch. The PRM says: The pitch must be set to 2x the value computed based on width, as the stencil buffer is stored with two rows interleaved.	2015-08-28 08:03:46 -07:00
Chad Versace	31af126229	vk: Program stencil ops in 3DSTATE_WM_DEPTH_STENCIL The driver ignored the Vulkan stencil, always programming the hardware stencil op to 0 (STENCILOP_KEEP).	2015-08-28 08:00:56 -07:00
Chad Versace	bff2879abe	vk/image: Don't abort when creating stencil image views When creating a stencil image view, log a FINISHME but don't abort. We're sooooo close to having this working.	2015-08-28 07:59:59 -07:00
Chad Versace	4f852c76dc	vk/meta: Save/restore VkDynamicDepthStencilState	2015-08-28 07:59:29 -07:00
Chad Versace	104c4e5ddf	vk/meta: Don't skip clearing when clearing only depth attachment anv_cmd_buffer_clear_attachments() skipped the clear renderpass if no color attachments needed to be cleared, even if a depth attachment needed to be cleared.	2015-08-28 07:58:51 -07:00
Chad Versace	aacb7bb9b6	vk: Add func anv_cmd_buffer_get_depth_stencil_view() This function removes some duplicated code from genN_cmd_buffer_emit_depth_stencil().	2015-08-28 07:57:34 -07:00
Chad Versace	641c25dd55	vk: Declare some local variables as const In anv_cmd_buffer_emit_depth_stencil(), declare 'subpass' and 'fb' as const.	2015-08-28 07:53:24 -07:00
Chad Versace	c6f19b4248	vk: Don't duplicate anv_depth_stencil_view's surface data In anv_depth_stencil_view, replace the members bo depth_offset depth_stride depth_format depth_qpitch stencil_offset stencil_stride stencil_qpitch with the single member const struct anv_image *image The removed members duplicated data in anv_image::depth_surface and anv_image::stencil_surface.	2015-08-28 07:52:19 -07:00
Chad Versace	35b0262a2d	vk/gen7: Add func gen7_cmd_buffer_emit_depth_stencil() This patch moves all the GEN7_3DSTATE_DEPTH_BUFFER code from gen7_cmd_buffer_begin_subpass() into a new function gen7_cmd_buffer_emit_depth_stencil().	2015-08-28 07:46:16 -07:00
Chad Versace	b2ee317e24	vk: Fix format of anv_depth_stencil_view The format of the view itself and of the view's image may differ. Moreover, if the view's format has no depth aspect but the image's format does, we must not program the depth buffer. Ditto for stencil.	2015-08-28 07:44:32 -07:00
Chad Versace	798acb2464	vk/gen7: Fix gen of emitted packet in gen7_batch_lri() Emit GEN7_MI_LOAD_REGISTER_IMM, not the GEN8 version.	2015-08-28 07:36:35 -07:00
Chad Versace	4461392343	vk: Remove dummy anv_depth_stencil_view	2015-08-28 07:35:39 -07:00
Chad Versace	941b48e992	vk/image: Let anv_image have one anv_surface per aspect Split anv_image::primary_surface into two: anv_image::color_surface and depth_surface.	2015-08-28 07:17:54 -07:00
Jason Ekstrand	c313a989b4	spirv: Bump to the public revision 31	2015-08-27 15:24:04 -07:00
Jason Ekstrand	2a8d1ac958	vk: Update to API version 0.138.2	2015-08-27 11:41:04 -07:00
Jason Ekstrand	4e3ee043c0	vk/gen8: Add support for push constants	2015-08-27 10:25:58 -07:00
Jason Ekstrand	375a65d5de	vk/private.h: Handle a NULL bo but valid offset in __gen_combine_address	2015-08-27 10:25:58 -07:00
Jason Ekstrand	c8365c55f5	vk/cmd_buffer: Set the CONSTANTS_REL_GENERAL flag on execbuf This tells the kernel that the push constant buffers are relative to the dynamic state base address.	2015-08-27 10:25:58 -07:00
Jason Ekstrand	efc2cce01f	HACK: Don't call nir_setup_uniforms We're doing our own uniform setup and we don't need to call into the entire GL stack to mess with things.	2015-08-27 10:25:58 -07:00
Jason Ekstrand	33cabeab01	vk/compiler: Add a helper for setting up prog_data->param This new helper sets it up the way we'll want for handling push constants.	2015-08-27 10:25:16 -07:00
Jason Ekstrand	5446bf352e	vk: Add initial API support for setting push constants This doesn't add support for actually uploading them, it just ensures that we have and update the shadow copy.	2015-08-26 17:59:15 -07:00
Jason Ekstrand	36134e1050	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-26 11:04:30 -07:00
Jason Ekstrand	74e076bba8	vk/meta: Destroy vertex shaders when setting up clearing	2015-08-25 18:51:26 -07:00
Jason Ekstrand	4bb9915755	vk/gen8: Don't duplicate generic pipeline setup gen8_graphics_pipeline_create had a bunch of stuff in it that's already set up by anv_pipeline_init. The duplication was causing double-initialization of a state stream and made valgrind very angry.	2015-08-25 18:41:25 -07:00
Jason Ekstrand	9b387b5d3f	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-25 18:41:21 -07:00
Kristian Høgsberg Kristensen	5360edcb30	vk/vec4: Use the right constant for offset into a UBO We were using constant 0, which is the set. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 16:14:59 -07:00
Kristian Høgsberg Kristensen	647a60226d	vk: Use true/false for RenderCacheReadWriteMode This field in surface state is a bool, WriteOnlyCache is an enum from GEN8. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:58:21 -07:00
Kristian Høgsberg Kristensen	7e5afa75b5	vk: Support descriptor sets and bindings in vec4 ubo loads Still incomplete, but at least we get the simplest case working. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:57:12 -07:00
Kristian Høgsberg Kristensen	00e7799c69	vk/gen7: Enable L3 caching for GEN7 MOCS Do what GL does here. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 15:55:56 -07:00
Kristian Høgsberg Kristensen	6a1098b2c2	vk/gen7: Use TILEWALK_XMAJOR for linear surfaces You wouldn't think the TileWalk mode matters when TiledSurface is false. However, it has to be TILEWALK_XMAJOR. Make it so. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-25 10:54:13 -07:00
Kristian Høgsberg Kristensen	f1455ffac7	vk: Add gen7 support With all the previous commits in place, we can now drop in support for multiple platforms. First up is gen7 (Ivybridge). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	891995e55b	vk: Move 3DSTATE_SBE setup to just before 3DSTATE_PS This is a more logical place for it, between geometry front end state and pixel backend state. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	9c752b5b38	vk: Move generic pipeline init to anv_pipeline.c This logic will be shared between multiple gens. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	3800573fb5	vk: Move gen8 specific state into gen8 sub-structs This commit moves all occurrences of gen8 specific state into a gen8 substruct. This clearly identifies the state as gen8 specific and prepares for adding gen7 state structs. In the process we also rename the field names to exactly match the command or state packet name, without the 3DSTATE prefix, eg: 3DSTATE_VF -> gen8.vf 3DSTATE_WM_DEPTH_STENCIL -> gen8.wm_depth_stencil Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	615da3795a	vk: Always use a placeholder vertex shader in meta The clear pipeline didn't have a vertex shader and relied on the clear shader being hardcoded by the compiler to accept one attribute. This necessitated a few special cases in the 3DSTATE_VS setup. Instead, always provide a vertex shader, even if we disable VS dispatch. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	ac738ada7a	vk: Trim out irrelevant 0-initialized surface state fields Many of these fields aren't used for buffer surfaces, so leave them out for brevity. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	963a1e35e7	vk: Update generated headers This adds VALIGN_2 and VALIGN_4 defines for IVB and HSW RENDER_SURFACE_STATE. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:41 -07:00
Kristian Høgsberg Kristensen	f5275f7eb3	vk: Move anv_color_attachment_view_init() to gen8_state.c I'd prefer to move anv_CreateAttachmentView() as well, but it's a little too much generic code to just duplicate for each gen. For now, we'll add a anv_color_attachment_view_init() to dispatch to the gen specific implementation, which we then call from anv_CreateAttachmentView(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	988341a73c	vk: Move anv_CreateImageView to gen8_state.c We'll probably want to move some code back into a shared init function, but this gets one GEN8 surface state initialization out of anv_image.c. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	bc568ee992	vk: Make anv_cmd_buffer_begin_subpass() switch on gen Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	8fe74ec45c	vk: Add generic wrapper for filling out buffer surface state We need this for generating surface state on the fly for dynamic buffer views. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	a2b822185e	vk: Add helper for adding surface state reloc We're going to have to do this differently for earlier gens, so let's do it in place only. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	e43fc871be	vk: Make batch chain code gen-agnostic Since the extra dword in MI_BATCH_BUFFER_START added in gen8 is at the end of the struct, we can emit the gen8 packet on all gens as long as we set the instruction length correctly. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	25ab43ee8c	vk: Move vkCmdPipelineBarrier to gen8_cmd_buffer.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	b4ef2302a9	vk: Use helper function for emitting MI_BATCH_BUFFER_START Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	97360ffc6c	vk: Use anv_batch_emit() for chaining back to primary batch We used to use a manual GEN8_MI_BATCH_BUFFER_START_pack() call, but this refactors the code to use anv_batch_emit(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	cff717c649	vk: Downgrade state packet to gen7 where they're common Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	64045eebfb	vk: Reorder gen8 specific code into three new files We'll organize gen specific code in three files per gen: pipeline, cmd_buffer and state, e.g.: gen8_cmd_buffer.c gen8_pipeline.c gen8_state.c where gen8_cmd_buffer.c holds all vkCmd* entry points, gen8_pipeline.c all gen specific code related to pipeline building and remaining state code (sampler, surface state, dynamic state) in gen8_state.c. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	9f0bb5977b	vk: Move gen8_CmdBindIndexBuffer() to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	a7649b2869	vk: Move gen8_cmd_buffer_emit_state_base_address() to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	130db30771	vk: Move gen8 specific parts of queries to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	98126c021f	vk: Move dynamic depth stencil to anv_gen8.c	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	0bcf85d79f	vk: Move pipeline creation to anv_gen8.c	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	ef0ab62486	vk: Move anv_CreateSampler to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	fb428727e0	vk: Move anv_CreateBufferView to anv_gen8.c Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	74556b076a	vk: Add new anv_gen8.c and move CreateDynamicRasterState there Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-24 13:45:40 -07:00
Kristian Høgsberg Kristensen	ee9788973f	vk: Implement multi-gen dispatch mechanism	2015-08-24 13:45:39 -07:00
Chad Versace	c4e7ed9163	vk/meta: Implement depth clears Fixes Crucible test func.depthstencil.basic-depth.clear-1.0.op-greater.	2015-08-20 10:25:05 -07:00
Chad Versace	0db3d67a14	vk: Cache each render pass's number of clear ops During vkCreateRenderPass, count the number of clear ops and store them in new members of anv_render_pass: uint32_t num_color_clear_attachments bool has_depth_clear_attachment bool has_stencil_clear_attachment Caching these 8 bytes (including padding) reduces the number of times that anv_cmd_buffer_clear_attachments needs to loop over the pass's attachments.	2015-08-20 10:25:04 -07:00
Chad Versace	2387219101	vk: Use temp var in vkCreateRenderPass's attachment loop Store the attachment in a temporary variable and s/pass->attachments[i]/att/ .	2015-08-20 10:25:04 -07:00
Chad Versace	1c24a191cd	vk: Improve memory locality of anv_render_pass Allocate the pass's array of attachments, anv_render_pass::attachments, in the same allocation as the pass itself.	2015-08-20 09:31:58 -07:00
Chad Versace	4eaf90effb	vk: Unhardcode an argument to sizeof s/struct anv_subpass/pass->subpasses[0])/	2015-08-20 09:31:58 -07:00
Chad Versace	44ef4484c8	vk/meta: Add Z coord to clear vertices For now, the Z coordinate is always 0.0. Will later be used for depth clears.	2015-08-20 09:31:12 -07:00
Chad Versace	4aef5c62cd	vk/meta: Restore all saved state in anv_cmd_buffer_restore() anv_cmd_buffer_restore() did not restore the old VkDynamicColorBlendState.	2015-08-20 09:30:34 -07:00
Chad Versace	9f908fcbde	vk/meta: Use consistent names and types in anv_saved_state In struct anv_saved_state, each member's type was a pointer to an Anvil struct and each member's name was prefixed with "old" except cb_state, which was a Vulkan handle whose name lacked "old".	2015-08-20 09:29:41 -07:00
Neil Roberts	49d9e89d00	Add mesa.icd to the .gitignore Since `4d7e0fa8c7` this file is generated by the configure script. Reviewed-by: Tapani Palli <tapani.palli@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (cherry picked from commit `885762e182`)	2015-08-19 14:12:31 -07:00
Chad Versace	bd0aab9a58	vk/meta: Fix dest format of vkCmdCopyImage The source image's format was incorrectly used for both the source view and destination view. For vkCmdCopyImage to correctly translate formats, the destination view's format must be that of the destination image's.	2015-08-18 12:44:06 -07:00
Chad Versace	b0875aa911	vk: Assert that swap chain format is a color format	2015-08-18 12:43:57 -07:00
Chad Versace	d52822541e	vk/image: Don't set anv_surface_view::offset twice It was set twice a few lines apart, and the second setting always overrode the first.	2015-08-18 11:48:50 -07:00
Chad Versace	e7d3a5df5a	vk/meta: Use anv_format_is_color() That is, replace !anv_format_is_depth_or_stencil() with anv_format_is_color(). That conveys the meaning better.	2015-08-18 11:48:48 -07:00
Chad Versace	50f7bf70da	vk: Add anv_format_is_color()	2015-08-18 11:48:46 -07:00
Chad Versace	6ff95bba8a	vk: Add anv_format reference to anv_render_pass_attachment Change type of anv_render_pass_attachment::format from VkFormat to const struct anv_format*. This eliminates the repetitive lookups into the VkFormat -> anv_format table when looping over attachments during anv_cmd_buffer_clear_attachments().	2015-08-17 14:08:55 -07:00
Chad Versace	5a6b2e6df0	vk/image: Simplify stencil case for anv_image_create() Stop creating a temporary VkImageCreateInfo with overridden format=VK_FORMAT_S8_UINT. Instead, just pass the format override directly to anv_image_make_surface().	2015-08-17 14:08:55 -07:00
Chad Versace	a9c36daa83	vk/formats: Add global pointer to anv_format for S8_UINT Stencil formats are often a special case. To reduce the number of lookups into the VkFormat-to-anv_format translation table when working with stencil, expose the table's entry for VK_FORMAT_S8_UINT as global variable anv_format_s8_uint.	2015-08-17 14:08:55 -07:00
Chad Versace	60c4ac57f2	vk: Add anv_format reference to anv_surface_view Change type of anv_surface_view::format from VkFormat to const struct anv_format*. This reduces the number of lookups in the VkFormat -> anv_format table.	2015-08-17 14:08:55 -07:00
Chad Versace	c11094ec9a	vk: Pass anv_format to anv_fill_buffer_surface_state() This moves the translation of VkFormat to anv_format from anv_fill_buffer_surface_state() to its caller. A prep commit to reduce more VkFormat -> anv_format translations.	2015-08-17 14:08:55 -07:00
Chad Versace	ded736f16a	vk: Add anv_format reference to anv_image Change type of anv_image::format from VkFormat to const struct anv_format*. This reduces the number of lookups in the VkFormat -> anv_format table.	2015-08-17 14:08:55 -07:00
Chad Versace	4ae42c83ec	vk: Store the original VkFormat in anv_format Store the original VkFormat as anv_format::vk_format. This will be used to reduce format indirection, such as lookups into the VkFormat -> anv_format translation table.	2015-08-17 14:07:44 -07:00
Jason Ekstrand	e39e1f4d24	vk: Update .gitignore for the autogenerated spirv changes	2015-08-17 11:47:25 -07:00
Kristian Høgsberg Kristensen	aac6f7c3bb	vk: Drop aub dumper and PCI ID override feature These are now available in intel_aubdump from intel-gpu-tools. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-17 11:41:19 -07:00
Kristian Høgsberg Kristensen	6d09d0644b	vk: Use anv_image_create() for creating dmabuf VkImage We need to make sure we use the VkImage infrastructure for creating dmabuf images. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-17 11:41:19 -07:00
Jason Ekstrand	0deae66eb1	vk: Add an _autogen suffix to autogenerated spirv file names This prevents make from stomping on nir_spirv.h	2015-08-17 11:40:16 -07:00
Jason Ekstrand	6a7ca4ef2c	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-08-17 11:25:03 -07:00
Jason Ekstrand	b4c02253c4	vk: Add four unit tests for our lock-free data-structures	2015-08-14 17:04:39 -07:00
Jason Ekstrand	16c5b9f4ed	vk: Build a version of the driver for linking into unit tests	2015-08-14 17:04:39 -07:00
Kristian Høgsberg Kristensen	30d82136bb	vk: Update generated headers This update brings usable IVB/HSW RENDER_SURFACE_STATE structs and adds more float fields that we previously failed to recognize.	2015-08-12 21:05:32 -07:00
Kristian Høgsberg Kristensen	9564dd37a0	vk: Query aperture size up front in anv_physical_device_init() We already query the device in various ways here and we can just also get the aperture size. This avoids keeping an extra drm fd open during the lifetime of the driver. Also, we need to use explicit 64-bit types for the aperture size, not size_t.	2015-08-10 17:18:55 -07:00
Kristian Høgsberg Kristensen	8605ee60e0	vk: Share upload logic and add size assert This lets us hit an assert if we exceed the block pool size instead of GPU hanging. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>	2015-08-10 17:17:45 -07:00
Jason Ekstrand	6757e2f75c	vk/cmd_buffer: Allow for null VkCmdPool's	2015-08-04 14:01:08 -07:00
Kristian Høgsberg Kristensen	4b097d73e6	vk: Call anv_batch_emit_dwords() up front in anv_batch_emit() This avoids putting a memory barrier between the template struct and the pack function, which generates much better code.	2015-08-03 15:38:14 -07:00
Kristian Høgsberg Kristensen	fbb119061e	vk: Update generated headers This adds zeroing of reserved blocks of dwords and removes an instruction definition.	2015-08-03 15:21:27 -07:00
Jason Ekstrand	facf587dea	vk/allocator: Solve a data race in anv_block_pool The anv_block_pool data structure suffered from the exact same race as the state pool. Namely, that the uniqueness of the blocks handed out depends on the next_block value increasing monotonically. However, this invariant did not hold thanks to our block "return" concept.	2015-08-03 01:19:34 -07:00
Jason Ekstrand	5e5a783530	vk: Add and use an anv_block_pool_size() helper	2015-08-03 01:18:09 -07:00
Jason Ekstrand	56ce219493	vk/allocator: Make block_pool_grow take and return a size It takes the old size as an argument and returns the new size as the return value. On error, it returns a size of 0.	2015-08-03 01:06:45 -07:00
Jason Ekstrand	fd64598462	vk/allocator: Fix a data race in the state pool The previous algorithm had a race because of the way we were using __sync_fetch_and_add for everything. In particular, the concept of "returning" over-allocated states in the "next > end" case was completely bogus. If too many threads were hitting the state pool at the same time, it was possible to have the following sequence: A: Get an offset (next == end) B: Get an offset (next > end) A: Resize the pool (now next < end by a lot) C: Get an offset (next < end) B: Return the over-allocated offset D: Get an offset in which case D will get the same offset as C. The solution to this race is to get rid of the concept of "returning" over-allocated states. Instead, the thread that gets a new block simply sets the next and end offsets directly and threads that over-allocate don't return anything and just futex-wait. Since you can only ever hit the over-allocate case if someone else hit the "next == end" case and hasn't resized yet, you're guaranteed that the end value will get updated and the futex won't block forever.	2015-08-03 00:38:48 -07:00
Jason Ekstrand	481122f4ac	vk/allocator: Make a few things more consistent	2015-08-03 00:35:19 -07:00
Jason Ekstrand	e65953146c	vk/allocator: Use memory pools rather than (MALLOC\|FREE)LIKE We have pools, so we should be using them. Also, I think this will help keep valgrind from getting confused when we have to end up fighting with system allocations such as those from malloc/free and mmap/munmap.	2015-07-31 10:38:28 -07:00
Jason Ekstrand	1920ef9675	vk/allocator: Add an anv_state_pool_finish function Currently this is a no-op but it gives us a place to put finalization things in the future.	2015-07-31 10:38:28 -07:00
Jason Ekstrand	930598ad56	vk/instance: valgrind-guard client-provided allocations	2015-07-31 10:38:23 -07:00
Jason Ekstrand	e40bdcef1f	vk/device: Add anv_instance_alloc/free helpers This way we can more consistently alloc/free the device and it will provide us a better place to put valgrind hooks in the next patch	2015-07-31 10:14:17 -07:00
Jason Ekstrand	0f050aaa15	vk/device: Mark newly allocated memory as undefined for valgrind This way valgrind still works even if the client gives us memory that has been initialized or re-uses memory for some reason.	2015-07-31 09:44:42 -07:00
Jason Ekstrand	1f49a7d9fc	vk/batch_chain: Decrement num_relocs instead of incrementing it	2015-07-31 09:11:47 -07:00
Jason Ekstrand	220a01d525	vk/batch_chain: Compute secondary exec mode after finishing the bo Figuring out whether or not to do a copy requires knowing the length of the final batch_bo. This gets set by anv_batch_bo_finish so we have to do it afterwards. Not sure how this was even working before.	2015-07-31 08:52:30 -07:00
Jason Ekstrand	26ba0ad54d	vk: Re-name command buffer implementation files Previously, the command buffer implementation was split between anv_cmd_buffer.c and anv_cmd_emit.c. However, this naming convention was confusing because none of the Vulkan entrypoints for anv_cmd_buffer were actually in anv_cmd_buffer.c. This changes it so that anv_cmd_buffer.c is what you think it is and the internals are in anv_batch_chain.c.	2015-07-30 15:00:42 -07:00
Jason Ekstrand	e379cd9a0e	vk/cmd_buffer: Add a simple command pool implementation	2015-07-30 14:55:49 -07:00
Jason Ekstrand	4c2a182a36	vk/cmd_buffer: Add support for zero-copy batch chaining	2015-07-30 14:22:17 -07:00
Jason Ekstrand	21004f23bf	vk: Add initial support for secondary command buffers	2015-07-30 11:36:48 -07:00
Jason Ekstrand	5aee803b97	vk/cmd_buffer: Split batch chaining into a helper function	2015-07-30 11:34:58 -07:00
Jason Ekstrand	0c4a2dab7e	vk/device: Make BATCH_SIZE a global #define	2015-07-30 11:34:09 -07:00
Jason Ekstrand	ace093031d	vk/cmd_buffer: Add functions for cloning a list of anv_batch_bo's We'll need this to implement secondary command buffers.	2015-07-30 11:32:27 -07:00
Jason Ekstrand	7af67e085f	vk/reloc_list: Actually set the new length in reloc_list_grow	2015-07-30 11:29:55 -07:00
Jason Ekstrand	f15be18c92	util/list: Add list splicing functions This adds functions for splicing one list into another. These have more-or-less the same API as the kernel list splicing functions.	2015-07-30 11:28:22 -07:00
Jason Ekstrand	e39d0b635c	CLONE	2015-07-30 08:24:02 -07:00
Jason Ekstrand	82548a3aca	vk/cmd_buffer: Invalidate texture cache in emit_state_base_address Previously, the caller of emit_state_base_address was doing this. However, putting it directly in emit_state_base_address means that we'll never forget the flush at the cost of one PIPE_CONTROL at the top of every batch (that should do nothing since the kernel just flushed for us).	2015-07-30 08:24:02 -07:00
Jason Ekstrand	56ce896d73	vk/cmd_buffer: Rename emit_batch_buffer_end to end_batch_buffer This is more generic and doesn't imply that it emits MI_BATCH_BUFFER_END. While we're at it, we'll move NOOP adding from bo_finish to end_batch_buffer.	2015-07-30 08:24:02 -07:00
Jason Ekstrand	3ed9cea84d	vk/cmd_buffer: Use an array to track all known anv_batch_bo objects Instead of walking the list of batch and surface buffers, we simply keep track of all known batch and surface buffers as we build the command buffer. Then we use this new list to construct the validate list.	2015-07-29 15:30:15 -07:00
Jason Ekstrand	0f31c580bf	vk/cmd_buffer: Rework validate list creation The algorithm we used previously required us to call add_bo in a particular order in order to guarantee that we get the initial batch buffer as the last element in the validate list. The new algorithm does a recursive walk over the buffers and then re-orders the list. This should be much more robust as we start to add circular dependencies in the relocations.	2015-07-29 15:16:54 -07:00
Jason Ekstrand	4fc7510a7c	vk/cmd_buffer: Move emit_batch_buffer_end higher in the file	2015-07-29 12:01:08 -07:00
Jason Ekstrand	8208f01a35	vk/cmd_buffer: Store the relocation list in the anv_batch_bo struct Before, we were doing this thing where we had one big relocation list for the whole command buffer and each subbuffer took a chunk out of it. Now, we store the actual relocation list in the anv_batch_bo. This comes at the cost of more small allocations but makes a lot of things simpler.	2015-07-29 12:01:08 -07:00
Jason Ekstrand	7d50734240	vk/batch: Make relocs a pointer to a relocation list Previously anv_batch.relocs was an actual relocation list. However, this is limiting if the implementation of the batch wants to change the relocation list as the batch progresses.	2015-07-29 12:01:08 -07:00
Kristian Høgsberg Kristensen	fcea3e2d23	vk/headers: Update to new generated gen headers This update fixes cases where a 48-bit address field was split into two parts: __gen_address_type MemoryAddress; uint32_t MemoryAddressHigh; which causes this pack code to be generated: dw[1] = __gen_combine_address(data, &dw[1], values->MemoryAddress, dw1); dw[2] = __gen_field(values->MemoryAddressHigh, 0, 15) \| 0; which breaks for addresses above 4G. This update also fixes arrays of structs in commands and structs, for example, we now have: struct GEN8_BLEND_STATE_ENTRY Entry[8]; and the pack functions now write all dwords in the packet, making valgrind happy. Finally, we would try to pack 64 bits of blend state into a uint32_t - that's also fixed now.	2015-07-29 11:02:33 -07:00
Jason Ekstrand	65f3d00cd6	vk/cmd_buffer: Update a comment	2015-07-29 08:33:56 -07:00
Jason Ekstrand	86a53d2880	vk/cmd_buffer: Use a doubly-linked list for batch and surface buffers This is probably better than hand-rolling the list of buffers.	2015-07-28 17:47:59 -07:00
Jason Ekstrand	6aba52381a	vk/aub: Use the data directly from the execbuf2 Previously, we were crawling through the anv_cmd_buffer datastructure to pull out batch buffers and things. This meant that every time something in anv_cmd_buffer changed, we broke aub dumping. However, aub dumping should just dump the stuff the kernel knows about so we really don't need to be crawling driver internals.	2015-07-28 16:53:45 -07:00
Jason Ekstrand	3c2743dcd1	vk/cmd_buffer: Pull the execbuf stuff into a substruct	2015-07-27 16:37:09 -07:00
Jason Ekstrand	4ced8650d4	vk/cmd_buffer: Move the remaining entrypoints into cmd_emit.c	2015-07-27 15:14:31 -07:00
Jason Ekstrand	d4c249364d	vk/cmd_buffer: Move the re-emission of STATE_BASE_ADDRESS to the flushing code This used to happen magically in cmd_buffer_new_surface_state_bo. However, according to Ken, STATE_BASE_ADDRESS is very gen-specific so we really shouldn't have it in the generic data-structure code.	2015-07-27 15:05:06 -07:00
Jason Ekstrand	117d74b4e2	vk/cmd_buffer: Factor the guts of CmdBufferEnd into two helpers	2015-07-27 14:52:16 -07:00
Jason Ekstrand	8fb6405718	vk/cmd_buffer: Factor the guts of (Create\|Reset\|Destroy)CmdBuffer into helpers	2015-07-27 14:23:56 -07:00
Jason Ekstrand	80ad578c4e	vk/private.h: Re-arrange and better comment anv_cmd_buffer	2015-07-27 12:40:43 -07:00
Jason Ekstrand	50e86b5777	vk: Actually advertise 0.138.1 at runtime	2015-07-23 10:44:27 -07:00
Jason Ekstrand	f884b500d0	vk/vulkan.h: Bump to the version 0.138.1 header This doesn't actually require any implementation changes but it does change an enum so it is ABI-incompatible with 0.138.0.	2015-07-23 10:38:22 -07:00
Jason Ekstrand	e99773badd	vk: Add two more valgrind checks	2015-07-23 08:57:54 -07:00
Jason Ekstrand	b1fcc30ff0	vk/meta: Destroy shader modules	2015-07-22 17:51:26 -07:00
Jason Ekstrand	3460e6cb2f	vk/device: Finish the scratch block pool on device destruction	2015-07-22 17:51:14 -07:00
Jason Ekstrand	867f6cb90c	vk: Add a FreeDescriptorSets function	2015-07-22 17:33:09 -07:00
Jason Ekstrand	c9dc1f4098	vk/pipeline: Be more sloppy about shader entrypoint names The CTS passes in NULL names right now. It's not too hard to support that as just "main". With this, and a patch to vulkancts, we now pass all 6 tests.	2015-07-22 15:26:56 -07:00
Chad Versace	2c2233e328	vk: Prefix most filenames with anv Jason started the task by creating anv_cmd_buffer.c and anv_cmd_emit.c. This patch finishes the task by renaming all other files except gen*_pack.h and glsl_scraper.py.	2015-07-17 20:25:38 -07:00
Chad Versace	f70d079854	vk/image: Remove unneeded data from anv_buffer_view This completes the FINISHME to trim unneeded data from anv_buffer_view. A VkExtent3D doesn't make sense for a VkBufferView. So remove the member anv_surface_view::extent, and push it up to the two objects that actually need it, anv_image_view and anv_attachment_view.	2015-07-17 14:48:23 -07:00
Chad Versace	194b77d426	vk: Document members of anv_surface_view	2015-07-17 14:39:05 -07:00
Chad Versace	169251bff0	vk: Remove more raw casts This removes nearly all the remaining raw Anvil<->Vulkan casts from the C source files. (File compiler.cpp still contains many raw casts, and I plan on ignoring that). As far as I can tell, the only remaining raw casts are: anv_attachment_view -> anv_depth_stencil_view anv_attachment_view -> anv_color_attachment_view	2015-07-17 14:32:22 -07:00
Chad Versace	fc3838376b	vk/image: Add braces around multi-line ifs	2015-07-17 13:38:09 -07:00
Connor Abbott	b2cfd85060	nir/spirv: don't declare builtin blocks They aren't used, and the backend was barfing on them. Also, remove a hack in compiler.cpp now that they're gone.	2015-07-16 11:04:22 -07:00
Connor Abbott	b599735be4	nir/spirv: add support for loading UBO's We directly emit ubo load intrinsics based off of the offset information handed to us from SPIR-V.	2015-07-16 10:54:09 -07:00
Connor Abbott	513ee7fa48	nir/types: add more nir_type_is_xxx() wrappers	2015-07-15 21:58:32 -07:00
Connor Abbott	9fa0989ff2	nir: move to two-level binding model for UBO's The GLSL layer above is still hacky, so we're really just moving the hack into GLSL-to-NIR. I'd rather not go all the way and make GLSL support the Vulkan binding model too, since presumably we'll be switching to SPIR-V exclusively, and so working on proper GLSL support will be a waste of time. For now, doing this keeps it working as we add SPIR-V->NIR support though.	2015-07-15 17:18:48 -07:00
Chad Versace	5520221118	vk: Remove unneeded vulkan-138.h	2015-07-15 17:16:07 -07:00
Chad Versace	73a8f9543a	vk: Bump vulkan.h version to 0.138	2015-07-15 17:16:07 -07:00
Chad Versace	55781f8d02	vk/0.138: Update VkResult values	2015-07-15 17:16:07 -07:00
Chad Versace	756d8064c1	vk/0.132: Do type-safety	2015-07-15 17:16:07 -07:00
Jason Ekstrand	927f54de68	vk/cmd_buffer: Move batch buffer padding to anv_batch_bo_finish()	2015-07-15 17:11:04 -07:00
Jason Ekstrand	9c0db9d349	vk/cmd_buffer: Rename bo_count to exec2_bo_count	2015-07-15 16:56:29 -07:00
Jason Ekstrand	6037b5d610	vk/cmd_buffer: Add a helper for allocating dynamic state This matches what we do for surface state and makes the dynamic state pool more opaque to things that need to get dynamic state.	2015-07-15 16:56:29 -07:00
Jason Ekstrand	7ccc8dd24a	vk/private.h: Move cmd_buffer functions to near the cmd_buffer struct	2015-07-15 16:56:29 -07:00
Jason Ekstrand	d22d5f25fc	vk: Split command buffer state into its own structure Everything else in anv_cmd_buffer is the actual guts of the datastructure.	2015-07-15 16:56:29 -07:00
Jason Ekstrand	da4d9f6c7c	vk: Move most of the anv_Cmd related stuff to its own file	2015-07-15 16:56:28 -07:00
Jason Ekstrand	d862099198	vk: Pull the guts of anv_cmd_buffer into its own file	2015-07-15 16:56:28 -07:00
Chad Versace	498ae009d3	vk/glsl: Replace raw casts Needed for upcoming type-safety changes.	2015-07-15 15:51:37 -07:00
Chad Versace	6f140e8af1	vk/meta: Remove raw casts Needed for upcoming type-safety changes.	2015-07-15 15:51:37 -07:00
Chad Versace	badbf0c94a	vk/x11: Remove raw casts The raw casts in the WSI functions will break the build when the type-safety changes arrive.	2015-07-15 15:49:10 -07:00
Chad Versace	61a4bfe253	vk: Delete vkDbgSetObjectTag() Because VkObject is going away.	2015-07-15 15:34:20 -07:00
Jason Ekstrand	e1c78ebe53	vk/device: Remove unneeded checks for NULL	2015-07-15 15:22:32 -07:00
Jason Ekstrand	f4748bff59	vk/device: Provide proper NULL handling in anv_device_free The Vulkan spec does not specify that the free function provided to CreateInstance must handle NULL properly so we do it in the wrapper. If this ever changes in the spec, we can delete the extra 2 lines.	2015-07-15 15:22:32 -07:00
Chad Versace	4c8e1e5888	vk: Stop internally calling anv_DestroyObject() Replace each anv_DestroyObject() with anv_DestroyFoo(). Let vkDestroyObject() live for a while longer for Crucible's sake.	2015-07-15 15:11:16 -07:00
Chad Versace	f5ad06eb78	vk: Fix vkDestroyObject dispatch for VkRenderPass It called anv_device_free() instead of anv_DestroyRenderPass().	2015-07-15 15:07:41 -07:00
Chad Versace	188f2328de	vk: Fix vkCreate/DestroyRenderPass While updating vkDestroyObject, I discovered that vkDestroyPass reliably crashes. That hasn't been an issue yet, though, because it is never called. In vkCreateRenderPass: - Don't allocate empty attachment arrays. - Ensure that pointers to empty attachment arrays are NULL. - Store VkRenderPassCreateInfo::subpassCount as anv_render_pass::subpass_count. In vkDestroyRenderPass: - Fix loop bounds: s/attachment_count/subpass_count/ - Don't call anv_device_free on null pointers.	2015-07-15 15:07:41 -07:00
Chad Versace	c6270e8044	vk: Refactor create/destroy code for anv_descriptor_set Define two new functions: anv_descriptor_set_create anv_descriptor_set_destroy	2015-07-15 14:31:22 -07:00
Chad Versace	365d80a91e	vk: Replace some raw casts with safe casts That is, replace some instances of (VkFoo) foo with anv_foo_to_handle(foo)	2015-07-15 14:00:21 -07:00
Chad Versace	7529e7ce86	vk: Correct anv_CreateShaderModule's prototype s/VkShader/VkShaderModule/ :sigh: I look forward to type-safety.	2015-07-15 13:59:47 -07:00
Chad Versace	8213be790e	vk: Define struct anv_image_view, anv_buffer_view Follow the pattern of anv_attachment_view. We need these structs to implement the type-safety that arrived in the 0.132 header.	2015-07-15 12:19:29 -07:00
Chad Versace	43241a24bc	vk/meta: Fix declared type of a shader module s/VkShader/VkShaderModule/ I'm looking forward to a type-safe vulkan.h ;)	2015-07-15 11:49:37 -07:00
Chad Versace	94e473c993	vk: Remove struct anv_object Trivial removal because vkDestroyObject() no longer uses it.	2015-07-15 11:29:43 -07:00
Jason Ekstrand	e375f722a6	vk/device: More documentation on surface state flushing	2015-07-15 11:09:02 -07:00
Connor Abbott	9aabe69028	vk/device: explain why a flush is necessary Jason found this from experimenting, but the docs give a reasonable explanation of why it's necessary.	2015-07-14 23:03:19 -07:00
Chad Versace	5f46c4608f	vk: Fix indentation of anv_dynamic_cb_state	2015-07-14 18:19:10 -07:00
Chad Versace	0eeba6b80c	vk: Add finishmes for VkDescriptorPool VkDescriptorPool is a stub object. As a consequence, it's impossible to free descriptor set memory.	2015-07-14 18:19:00 -07:00
Jason Ekstrand	2b5a4dc5f3	vk: Add vulkan-138 and remove vulkan-0.132 Now, 138 is the target and not 132. Once object destruction is finished, we can delete 138 as it will be identical to vulkan.h	2015-07-14 17:54:13 -07:00
Jason Ekstrand	1f658bed70	vk/device: Add stub support for command pools Real support isn't really that far away. We just need a data structure with a linked list and a few tests.	2015-07-14 17:40:00 -07:00
Jason Ekstrand	ca7243b54e	vk/vulkan.h: Add the stuff for cross-queue resource sharing We only have one queue, so this is currently a no-op on our implementation.	2015-07-14 17:20:50 -07:00
Jason Ekstrand	553b4434ca	vk/vulkan.h: Add a couple of size fields for specialization constants	2015-07-14 17:12:39 -07:00
Jason Ekstrand	e5db209d54	vk/vulkan.h: Move around buffer image granularities	2015-07-14 17:10:37 -07:00
Jason Ekstrand	c7fcfebd5b	vk: Add stubs for all the sparse resource stuff	2015-07-14 17:06:11 -07:00
Jason Ekstrand	2a9136feb4	vk/image: Add a stub for the new ImageFormatProperties function This lets the client query about things like multisample. We don't do multisample right now, so I'll let Chad deal with that when he gets to it.	2015-07-14 17:05:30 -07:00
Jason Ekstrand	2c4dc92f40	vk/vulkan.h: Rename FormatInfo to FormatProperties	2015-07-14 17:04:46 -07:00
Jason Ekstrand	d7f44852be	vk/vulkan.h: Re-order some #define's	2015-07-14 16:41:39 -07:00
Jason Ekstrand	1fd3bc818a	vk/vulkan.h: Rename a function parameter	2015-07-14 16:39:01 -07:00
Jason Ekstrand	2e2f48f840	vk: Remove abbreviations	2015-07-14 16:34:31 -07:00
Jason Ekstrand	02db21ae11	vk: Add the new extension/layer enumeration entrypoints	2015-07-14 16:11:21 -07:00
Jason Ekstrand	a463eacb8f	vk/vulkan.h: Change maxAnisotropy to a float	2015-07-14 15:04:11 -07:00
Jason Ekstrand	98957b18d2	vk/vulkan.h: Add the VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT flag	2015-07-14 15:03:39 -07:00
Jason Ekstrand	a35811d086	vk/vulkan.h: Rename a couple of function parameters No functional change.	2015-07-14 15:03:01 -07:00
Jason Ekstrand	55723e97f1	vk: Split the memory requirements/binding functions	2015-07-14 14:59:39 -07:00
Jason Ekstrand	ccb2e5cd62	vk: Make barriers more precise (rev. 133)	2015-07-14 14:50:35 -07:00
Jason Ekstrand	30445f8f7a	vk: Split the dynamic state binding function into one per state	2015-07-14 14:26:10 -07:00
Jason Ekstrand	d2c0870ff3	vk/vulkan.h: Rename a function parameter to match 132	2015-07-14 14:11:04 -07:00
Jason Ekstrand	8478350992	vk: Implement Multipass	2015-07-14 11:37:14 -07:00
Jason Ekstrand	68768c40be	vk/vulkan.h: Re-arrange some enums and definitions in preparation for 131	2015-07-14 11:32:15 -07:00
Chad Versace	66cbb7f76d	vk/0.132: Add vkDestroyRenderPass()	2015-07-14 11:21:31 -07:00
Chad Versace	6d0ed38db5	vk/0.132: Add vkDestroy*View() vkDestroyColorAttachmentView vkDestroyDepthStencilView These functions are not in the 0.132 header, but adding them will help us attain the type-safety API updates more quickly.	2015-07-14 11:19:22 -07:00
Chad Versace	1ca611cbad	vk/0.132: Add vkDestroyCommandBuffer()	2015-07-14 11:11:41 -07:00
Chad Versace	6eec0b186c	vk/0.132: Add vkDestroyImageView() Just declare it in vulkan.h. Jason defined the function earlier in image.c.	2015-07-14 11:09:14 -07:00
Chad Versace	4b2c5a98f0	vk/0.132: Add vkDestroyBufferView() Just declare it in vulkan.h. Jason already defined the function earlier in vulkan.c.	2015-07-14 11:06:57 -07:00
Chad Versace	08f7731f67	vk/0.132: Add vkDestroyFramebuffer()	2015-07-14 10:59:30 -07:00
Chad Versace	0c8456ef1e	vk/0.132: Add vkDestroyDynamicDepthStencilState()	2015-07-14 10:54:51 -07:00
Chad Versace	b29c929e8e	vk/0.132: Add vkDestroyDynamicColorBlendState()	2015-07-14 10:52:45 -07:00
Chad Versace	5e1737c42f	vk/0.132: Add vkDestroyDynamicRasterState()	2015-07-14 10:51:08 -07:00
Chad Versace	d80fea1af6	vk/0.132: Add vkDestroyDynamicViewportState()	2015-07-14 10:42:45 -07:00
Chad Versace	9250e1e9e5	vk/0.132: Add vkDestroyDescriptorPool()	2015-07-14 10:38:22 -07:00
Chad Versace	f925ea31e7	vk/0.132: Add vkDestroyDescriptorSetLayout()	2015-07-14 10:36:49 -07:00
Chad Versace	ec5e2f4992	vk/0.132: Add vkDestroySampler()	2015-07-14 10:34:00 -07:00
Chad Versace	a684198935	vk/0.132: Add vkDestroyPipelineLayout()	2015-07-14 10:29:47 -07:00
Chad Versace	6e5ab5cf1b	vk/0.132: Add vkDestroyPipeline()	2015-07-14 10:26:17 -07:00
Chad Versace	114015321e	vk/0.132: Add vkDestroyPipelineCache()	2015-07-14 10:19:27 -07:00
Chad Versace	cb57bff36c	vk/0.132: Add vkDestroyShader()	2015-07-14 10:16:22 -07:00
Chad Versace	8ae8e14ba7	vk/0.132: Add vkDestroyShaderModule()	2015-07-14 10:13:09 -07:00
Chad Versace	dd67c134ad	vk/0.132: Add vkDestroyImage() We only need to add it to vulkan.h because Jason defined the function earlier in image.c.	2015-07-14 10:13:00 -07:00
Chad Versace	e18377f435	vk/0.132: Dispatch vkDestroyObject to new destructors Oops. My recent commits added new destructors, but forgot to teach vkDestroyObject about them. They are: vkDestroyFence vkDestroyEvent vkDestroySemaphore vkDestroyQueryPool vkDestroyBuffer	2015-07-14 09:58:22 -07:00
Chad Versace	e93b6d8eb1	vk/0.132: Add vkDestroyBuffer()	2015-07-14 09:47:45 -07:00
Chad Versace	584cb7a16f	vk/0.132: Add vkDestroyQueryPool()	2015-07-14 09:44:58 -07:00
Chad Versace	68c7ef502d	vk/0.132: Add vkDestroyEvent()	2015-07-14 09:33:47 -07:00
Chad Versace	549070b18c	vk/0.132: Add vkDestroySemaphore()	2015-07-14 09:31:34 -07:00
Chad Versace	ebb191f145	vk/0.132: Add vkDestroyFence()	2015-07-14 09:29:35 -07:00
Chad Versace	435ccf4056	vk/0.132: Rename VkDynamic*State types sed -i -e 's/VkDynamicVpState/VkDynamicViewportState/g' \ -e 's/VkDynamicRsState/VkDynamicRasterState/g' \ -e 's/VkDynamicCbState/VkDynamicColorBlendState/g' \ -e 's/VkDynamicDsState/VkDynamicDepthStencilState/g' \ $(git ls-files include/vulkan src/vulkan)	2015-07-13 16:19:28 -07:00
Connor Abbott	ffb51fd112	nir/spirv: update to SPIR-V revision 31 This means that now the internal version of glslangValidator is required. This includes some changes due to the sampler/texture rework, but doesn't actually enable anything more yet. We also don't yet handle UBO's correctly, and don't handle matrix stride and row major/column major yet.	2015-07-13 15:01:01 -07:00
Chad Versace	45f8723f44	vk/0.132: Move VkQueryControlFlags	2015-07-13 13:09:32 -07:00
Chad Versace	180c07ee50	vk/0.132: Move VkImageAspectFlags	2015-07-13 13:08:56 -07:00
Chad Versace	4b05a8cd31	vk/0.132: Move VkCmdBufferOptimizeFlags	2015-07-13 13:08:07 -07:00
Chad Versace	f1cf55fae6	vk/0.132: Move VkWaitEvent	2015-07-13 13:06:53 -07:00
Chad Versace	3112098776	vk/0.132: Move VkCmdBufferLevel	2015-07-13 13:06:33 -07:00
Chad Versace	c633ab5822	vk/0.132: Drop VK_ATTACHMENT_STORE_OP_RESOLVE_MSAA	2015-07-13 13:05:24 -07:00
Chad Versace	8f3b2187e1	vk/0.132: Rename bool32_t -> VkBool32 sed -i 's/bool32_t/VkBool32/g' \ $(git ls-files src/vulkan include/vulkan)	2015-07-13 13:03:36 -07:00
Chad Versace	77dcfe3c70	vk/0.132: Remove stray typedef	2015-07-13 12:58:17 -07:00
Chad Versace	601d0891a6	vk/0.132: Move VkImageUsageFlags	2015-07-13 12:48:44 -07:00
Chad Versace	829810fa27	vk/0.132: Move VkImageType and VkImageTiling	2015-07-13 11:49:56 -07:00
Chad Versace	17c8232ecf	vk/0.132: Import the 0.132 header Import it as vulkan-0.132.h.	2015-07-13 11:47:12 -07:00
Chad Versace	a158ff55f0	vk/vulkan.h: Remove headers for old API versions Remove the temporary headers for 0.90 and 0.130.	2015-07-13 11:46:30 -07:00
Chad Versace	1c4238a8e5	vk/0.130: Bump header version to 0.130 All APIs have been updated. This eliminates the diff between the work-in-progress header and the 0.130 header.	2015-07-10 20:06:09 -07:00
Chad Versace	f43a304dc6	vk/0.130: Update vkAllocMemory to use VkMemoryType	2015-07-10 17:35:52 -07:00
Chad Versace	df2a013881	vk/0.130: Implement vkGetPhysicalDeviceMemoryProperties()	2015-07-10 17:35:52 -07:00
Chad Versace	c7f512721c	vk/gem: Change signature of anv_gem_get_aperture() Replace the anv_device parameter with anv_physical_device, because this needs querying before vkCreateDevice.	2015-07-10 17:35:52 -07:00
Chad Versace	8cda3e9b1b	vk/device: Add member anv_physical_device::fd During anv_physical_device_init(), we opened the DRM device to do some queries, then promptly closed it. Now we keep it open for the lifetime of the anv_physical_device so that we can query it some more during vkGetPhysicalDevice*Properties() [which will happen in follow-up commits].	2015-07-10 17:35:52 -07:00
Chad Versace	4422bd4cf6	vk/device: Add func anv_physical_device_finish() Because in a follow-up patch I need to do some non-trivial teardown on anv_physical_device. For now, however, anv_physical_device_finish() is a no-op that's just called in the right place. Also, rename function fill_physical_device -> anv_physical_device_init for symmetry.	2015-07-10 17:35:52 -07:00
Jason Ekstrand	7552e026da	vk/device: Add an explicit destructor for RenderPass	2015-07-10 12:33:04 -07:00
Jason Ekstrand	8b342b39a3	vk/image: Add an explicit DestroyImage function	2015-07-10 12:30:58 -07:00
Jason Ekstrand	b94b8dfad5	vk/image: Add explicit constructors for buffer/image view types	2015-07-10 12:26:31 -07:00
Jason Ekstrand	18340883e3	nir: Add C++ versions of NIR_(SRC|DEST)_INIT	2015-07-10 11:57:33 -07:00
Chad Versace	9e64a2a8e4	mesa: Fix generation of git_sha1.h.tmp for gitlinks Don't assume that $(top_srcdir)/.git is a directory. It may be a gitlink file [1] if $(top_srcdir) is a submodule checkout or a linked worktree [2]. [1] A "gitlink" is a text file that specifies the real location of the gitdir. [2] Linked worktrees are a new feature in Git 2.5. Cc: "10.6, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (cherry picked from commit `75784243df`)	2015-07-10 11:24:25 -07:00
Jason Ekstrand	19f0a9b582	vk/query.c: Use the casting functions	2015-07-09 20:32:44 -07:00
Jason Ekstrand	6eb221c884	vk/pipeline.c: Use the casting functions	2015-07-09 20:28:08 -07:00
Jason Ekstrand	fb4e2195ec	vk/formats.c: Use the casting functions	2015-07-09 20:24:17 -07:00
Jason Ekstrand	a52e208203	vk/image.c: Use the casting functions	2015-07-09 20:24:07 -07:00
Jason Ekstrand	b1de1d4f6e	vk/device.c: One more use of a casting function	2015-07-09 20:23:46 -07:00
Jason Ekstrand	8739e8fbe2	vk/meta.c: Use the casting functions	2015-07-09 20:16:13 -07:00
Jason Ekstrand	92556c77f4	vk: Fix the build	2015-07-09 18:59:08 -07:00
Jason Ekstrand	098209eedf	device.c: Use the cast helpers a bunch of places	2015-07-09 18:49:43 -07:00
Jason Ekstrand	73f9187e33	device.c: Use the cast helpers	2015-07-09 18:41:27 -07:00
Jason Ekstrand	7d24fab4ef	vk/private.h: Add a bunch of static inline casting functions We will need these as soon as we turn on type safety. We might as well define and start using them now rather than later.	2015-07-09 18:40:54 -07:00
Jason Ekstrand	5c49730164	vk/device.c: Fix whitespace issues	2015-07-09 18:20:28 -07:00
Jason Ekstrand	c95f9b61f2	vk/device.c: Use ANV_FROM_HANDLE a bunch of places	2015-07-09 18:20:10 -07:00
Jason Ekstrand	335e88c8ee	vk/vulkan.h: Add the pEnabledFeatures field to DeviceCreateInfo	2015-07-09 16:21:31 -07:00
Jason Ekstrand	34871cf7f3	vk/vulkan.h: Change the MsCreateInfo structure to the 130 version We do nothing with it at the moment, so this is a no-op.	2015-07-09 16:19:54 -07:00
Jason Ekstrand	8c2c37fae7	vk: Remove the old GetPhysicalDeviceInfo call	2015-07-09 16:14:37 -07:00
Jason Ekstrand	1f907011a3	vk: Add the new PhysicalDeviceQueue queries	2015-07-09 16:14:37 -07:00
Jason Ekstrand	977a469bce	vk: Support GetPhysicalDeviceProperties	2015-07-09 16:14:37 -07:00
Jason Ekstrand	65e0b304b6	vk: Add support for GetPhysicalDeviceLimits	2015-07-09 16:14:37 -07:00
Jason Ekstrand	f6d51f3fd3	vk: Add GetPhysicalDeviceFeatures	2015-07-09 16:14:37 -07:00
Chad Versace	5b75dffd04	vk/device: Fix vkEnumeratePhysicalDevices() The Vulkan spec says that pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL; otherwise it's an inout parameter. Mesa incorrectly treated it unconditionally as an inout parameter, which could have led to reading uninitialized data.	2015-07-09 15:53:21 -07:00
Chad Versace	fa915b661d	vk/device: Move device enumeration to vkEnumeratePhysicalDevices() Don't enumerate devices in vkCreateInstance(). That's where global, device-independent initialization should happen. Move device enumeration to the more logical location, vkEnumeratePhysicalDevices().	2015-07-09 15:41:17 -07:00
Chad Versace	c34d314db3	vk/device: Be consistent about path to DRM device Function fill_physical_device() has a 'path' parameter, and struct anv_physical_device has a 'path' member. Sometimes these are used; sometimes hardcoded "/dev/dri/renderD128" is used instead. Be consistent. Hardcode "/dev/dri/renderD128" in exactly one location, during initialization of the physical device.	2015-07-09 15:27:26 -07:00
Connor Abbott	cff06bbe7d	vk/compiler: create an empty parameters list Prevents problems when initializing the sanity_param_count.	2015-07-09 14:29:23 -04:00
Connor Abbott	3318a86d12	nir/spirv: fix wrong writemask for ALU operations	2015-07-09 14:28:39 -04:00
Connor Abbott	b8fedc19f5	nir/spirv: fix memory context for builtin variable Fixes valgrind errors with func.depthstencil.basic.	2015-07-08 22:03:30 -04:00
Connor Abbott	e4292ac039	nir/spirv: zero out value array Before values are pushed or annotated with a name, decoration, etc., they need to have an invalid type, NULL name, NULL decoration, etc. ralloc zeroes everything by accident, so this wasn't an issue in practice, but we should be explicitly zeroing it.	2015-07-08 22:03:30 -04:00
Connor Abbott	997831868f	vk/compiler: create the right kind of program struct This fixes Valgrind errors and gets all the tests to pass with --use-spir-v.	2015-07-08 22:03:30 -04:00
Connor Abbott	a841e2c747	vk/compiler: mark inputs/outputs as read/written This doesn't handle inputs and outputs larger than a vec4, but we plan to add a varying splitting/packing pass to handle those anyways.	2015-07-08 22:03:30 -04:00
Jason Ekstrand	8640dc12dc	vk/vulkan.h: Copy the VkStructureType enum from version 130 We now have the exact same structs which require pType.	2015-07-08 17:45:52 -07:00
Jason Ekstrand	5a4ebf6bc1	vk: Move to the new pipeline creation API's	2015-07-08 17:30:18 -07:00
Chad Versace	4fcb32a17d	vk/0.130: Remove VkImageViewCreateInfo::minLod It's now set solely through VkSampler.	2015-07-08 14:48:22 -07:00
Jason Ekstrand	367b9ba78f	vk/vulkan.h: Move renderPassContinue from GraphicsBeginInfo to BeginInfo	2015-07-08 14:37:30 -07:00
Jason Ekstrand	d29ec8fa36	vk/vulkan.h: Update to the new UpdateDescriptorSets api	2015-07-08 14:24:56 -07:00
Jason Ekstrand	c8577b5f52	vk: Add a macro for creating anv variables from vulkan handles This is very helpful for doing the mass bunch of casts at the top of a function. It will also be invaluable when we get type safety in the API.	2015-07-08 14:24:14 -07:00
Chad Versace	ccb27a002c	vk/0.130: Update VkObjectType values Don't import any new enum tokens from the 0.130 header. Just update the values of existing enums. This reduces the diff by about 16 lines.	2015-07-08 12:53:49 -07:00
Chad Versace	8985dd15a1	vk/0.130: Remove VkDescriptorUpdateMode Nowhere used.	2015-07-08 12:51:46 -07:00
Chad Versace	e02dfa309a	vk/0.130: Remove VK_DEVICE_CREATE_MULTI_DEVICE_IQ_MATCH_BIT	2015-07-08 12:49:48 -07:00
Chad Versace	e9034ed875	vk/0.130: Update vkCmdBlitImage signature Add VkTexFilter param. Ignored for now.	2015-07-08 12:47:48 -07:00
Jason Ekstrand	aae45ab583	vk/vulkan.h: Add packing parameters to BufferImageCopy	2015-07-08 11:51:34 -07:00
Chad Versace	b4ef7f354b	vk/0.130: Remove msaa members of VkDepthStencilViewCreateInfo	2015-07-08 11:50:51 -07:00
Jason Ekstrand	522ab835d6	vk/vulkan.h: Move over to the new border color enums	2015-07-08 11:44:52 -07:00
Jason Ekstrand	7598329774	vk/vulkan.h: Move VkFormatProperties	2015-07-08 11:16:45 -07:00
Jason Ekstrand	52940e8fcf	vk/vulkan.h: Add RenderPassBeginContents	2015-07-08 10:57:13 -07:00
Jason Ekstrand	e19d6be2a9	vk/vulkan.h: Add command buffer levels	2015-07-08 10:53:32 -07:00
Jason Ekstrand	c84f2d3b8c	vk/vulkan.h: Import the VkPipeEvent enum from 130 Now, VkPipeEventFlags is back in sync with VkPipeEvent	2015-07-08 10:49:46 -07:00
Jason Ekstrand	b20cc72603	vk/vulkan.h: Remove VkFormatInfoType	2015-07-08 10:39:31 -07:00
Jason Ekstrand	8e05bbeee9	vk/vulkan.h: Update extension handling to rev 130	2015-07-08 10:38:07 -07:00
Jason Ekstrand	cc29a5f4be	vk/vulkan.h: Move format querying to the physical device	2015-07-08 09:34:47 -07:00
Jason Ekstrand	719fa8ac74	vk/vulkan.h: Remove some peer opening structs and STRUCTURE_TYPE enums	2015-07-08 09:25:13 -07:00
Jason Ekstrand	fc6dcc6227	vk: Add a copy of the v90 header.	2015-07-08 09:23:29 -07:00
Jason Ekstrand	12119282e6	vk/vulkan.h: Remove an unneeded comment	2015-07-08 09:18:09 -07:00
Jason Ekstrand	3c65a1ac14	vk/vulkan.h: Remove the MemoryRange stubs and add sparse stubs	2015-07-08 09:16:48 -07:00
Jason Ekstrand	bb6567f5d1	vk/vulkan.h: Switch BindObjectMemory to a device function and remove the index	2015-07-08 09:04:16 -07:00
Jason Ekstrand	e7acdda184	vk/vulkan.h: Switch to the split ProcAddr functions in 130	2015-07-07 18:51:53 -07:00
Jason Ekstrand	db24afee2f	vk/vulkan.h: Switch from GetImageSubresourceInfo to GetImageSubresourceLayout	2015-07-07 18:20:18 -07:00
Jason Ekstrand	ef8980e256	vk/vulkan.h: Switch from GetObjectInfo to GetMemoryRequirements	2015-07-07 18:16:42 -07:00
Jason Ekstrand	d9c2caea6a	vk: Update memory flushing functions to 130 This involves updating the prototype for FlushMappedMemory, adding InvalidateMappedMemoryRanges, and removing PinSystemMemory.	2015-07-07 17:22:31 -07:00
Jason Ekstrand	d5349b1b18	vk/vulkan.h: Constify the pFences parameter to ResetFences	2015-07-07 17:18:00 -07:00
Jason Ekstrand	6aa1b89457	vk/vulkan.h: Move the definitions of Create(Framebuffer|RenderPass) This better matches the 130 header.	2015-07-07 17:13:10 -07:00
Jason Ekstrand	0ff06540ae	vk: Implement the GetRenderAreaGranularity function At the moment, we're just going to scissor clears so a granularity of 1x1 is all we need.	2015-07-07 17:11:37 -07:00
Jason Ekstrand	435b062b26	vk/vulkan.h: Add a PipelineLayout parameter to BindDescriptorSets	2015-07-07 17:06:10 -07:00
Jason Ekstrand	518ca9e254	vk/vulkan.h: Add a compareEnable parameter to SamplerCreateInfo Our hardware doesn't actually need this, so adding it is a no-op.	2015-07-07 16:49:04 -07:00
Jason Ekstrand	672590710b	vk/vulkan.h: Remove initialCount from SemaphoreCreateInfo	2015-07-07 16:42:42 -07:00
Jason Ekstrand	80046a7d54	vk/vulkan.h: Update clear color handling to 130	2015-07-07 16:37:43 -07:00
Jason Ekstrand	3e4b00d283	meta: Use the VkClearColorValue structure for the color attribute	2015-07-07 16:27:06 -07:00
Jason Ekstrand	a35fef1ab2	vk/vulkan.h: Remove the pass argument from EndRenderPass	2015-07-07 16:22:23 -07:00
Jason Ekstrand	d2ca7e24b4	vk/vulkan.h: Rename VertexInputStateInfo to VertexInputStateCreateInfo	2015-07-07 16:15:55 -07:00
Jason Ekstrand	abbb776bbe	vk/vulkan.h: Remove programPointSize Instead, we auto-detect whether or not your shader writes gl_PointSize. If it does, we take it from the shader; otherwise we use 1.0.	2015-07-07 16:00:46 -07:00
Chad Versace	e7ddfe03ab	vk/0.130: Stub vkCmdClear*Attachment() funcs vkCmdClearColorAttachment vkCmdClearDepthStencilAttachment	2015-07-07 15:57:37 -07:00
Chad Versace	f89e2e6304	vk/0.130: Define enum VkImageAspectFlagBits	2015-07-07 15:57:37 -07:00
Chad Versace	55ab1737d3	vk/0.130: Define VkRect3D	2015-07-07 15:55:53 -07:00
Chad Versace	11901a9100	vk/0.130: Update name of vkCmdClearDepthStencilImage()	2015-07-07 15:53:35 -07:00
Chad Versace	dff32238c7	vk/0.130: Stub vkCmdExecuteCommands()	2015-07-07 15:51:55 -07:00
Chad Versace	85c0d69be9	vk/0.130: Update vkCmdWaitEvents() signature	2015-07-07 15:49:57 -07:00
Chad Versace	0ecb789b71	vk: Remove unused 'v' param from stub() macro	2015-07-07 15:47:24 -07:00
Chad Versace	f78d684772	vk: Stub vkCmdPushConstants() from 0.130 header	2015-07-07 15:46:19 -07:00
Chad Versace	18ee32ef9d	vk: Update vkCmdPipelineBarrier to 0.130 header	2015-07-07 15:43:41 -07:00
Chad Versace	4af79ab076	vk: Add func anv_clear_mask() A little helper func for inspecting and clearing bitmasks.	2015-07-07 15:43:41 -07:00
Jason Ekstrand	788a8352b9	vk/vulkan.h: Remove some unused fields. In particular, the following are removed: - disableVertexReuse - clipOrigin - depthMode - pointOrigin - provokingVertex	2015-07-07 15:33:00 -07:00
Jason Ekstrand	7fbed521bb	vk/vulkan.h: Remove the explicit primitive restart index Unfortunately, this requires some non-trivial changes to the driver. Now that the primitive restart index isn't given explicitly by the client, we always use ~0 for everything like D3D does. Unfortunately, our hardware is awesome and a 32-bit version of ~0 doesn't match any 16-bit values. This means, we have to set it to either UINT16_MAX or UINT32_MAX depending on the size of the index type. Since we get the index type from CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need to lazy-emit the VF packet.	2015-07-07 15:33:00 -07:00
Chad Versace	d6b840beff	vk: Delete some comments not present in 0.130 header Deleting the comments reduces diff noise.	2015-07-07 15:16:13 -07:00
Chad Versace	84a5bc25e3	vk: Pull in remaining 0.130 handle types This pulls in the definition of VkShaderModule and VkPipelineCache, which are nowhere used yet.	2015-07-07 15:13:01 -07:00
Chad Versace	f2899b1af2	vk: Pull in #defines from 0.130 header Despite not being used yet, pulling in the macros does diminish the header diff.	2015-07-07 15:11:30 -07:00
Jason Ekstrand	962d6932fa	vk/vulkan.h: Rename (min|max)Depth to (min|max)DepthBounds	2015-07-07 12:37:54 -07:00
Jason Ekstrand	1fb859e4b2	vk/vulkan.h: Remove client-settable pointSize from DynamicRsState	2015-07-07 12:35:32 -07:00
Jason Ekstrand	245583075c	vk/vulkan.h: Remove UINT8 index buffers	2015-07-07 11:26:49 -07:00
Jason Ekstrand	0a42332904	vk/vulkan.h: Re-order the object declarations	2015-07-07 11:26:49 -07:00
Kristian Høgsberg Kristensen	a1eea996d4	vk: Emit 3DSTATE_SAMPLE_MASK This was missing and was causing the driver to not work with execlists. Presumably we get a different initial hw context with execlists enabled, that has sample mask 0 initially. Set this to 0xffff for now. When we add MS support, we need to take the value from VkPipelineMsStateCreateInfo::sampleMask.	2015-07-06 23:54:12 -07:00
Kristian Høgsberg Kristensen	c325bb24b5	vk: Pull in new generated headers The new headers use stdbool for enable/disable fields which implicitly converts expressions like (flags & 8) to 0 or 1. Also handles MBO (must-be-one) fields by setting them to one, corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and makes a few enum values less clashy.	2015-07-06 22:12:26 -07:00
Chad Versace	23075bccb3	vk/image: Validate vkCreateImageView more Exhaustively validate the function input. If it's not validated and doesn't have an anv_finishme(), then I overlooked it.	2015-07-06 18:28:26 -07:00
Chad Versace	69e11adecc	vk/image: Add more info to VkImageViewType table Convert the table from the direct mapping VkImageViewType -> SurfaceType into a mapping to an info struct VkImageViewType -> struct anv_image_view_info	2015-07-06 18:28:26 -07:00
Chad Versace	b844f542e0	vk: Update VkImageViewType to 0.130.0 This splits 1D and 1D_ARRAY, 2D and 2D_ARRAY, CUBE and CUBE_ARRAY. The new tokens are unused. This is just a header update.	2015-07-06 18:28:26 -07:00
Chad Versace	5b04db71ff	vk/image: Move validation for vkCreateImageView Move the validation from anv_CreateImageView() and anv_image_view_init() to anv_validate_CreateImageView(). No new validation is added.	2015-07-06 18:27:14 -07:00
Jason Ekstrand	1f1b26bceb	vk/vulkan.h: Rename VkRect to VkRect2D	2015-07-06 17:47:18 -07:00
Jason Ekstrand	63c1190e47	vk/vulkan.h: Rename count to arraySize in VkDescriptorSetLayoutBinding	2015-07-06 17:43:58 -07:00
Jason Ekstrand	d84f3155b1	vk/vulkan.h: Remove the Vk(Memory|Semaphor|Image)OpenInfo structs We already deleted the functions that need them. The structs are just dangling uselessly.	2015-07-06 17:37:13 -07:00
Jason Ekstrand	65f9ccb4e7	vk/vulkan.h: Remove VK_MEMORY_PROPERTY_PREFER_HOST_LOCAL_BIT We weren't doing anything with it, so this is a no-op	2015-07-06 17:33:45 -07:00
Jason Ekstrand	68fa750f2e	vk/vulkan.h: Replace DEVICE_COHERENT_BIT with DEVICE_NON_COHERENT_BIT	2015-07-06 17:32:28 -07:00
Jason Ekstrand	d5b5bd67f6	vk/vulkan.h: Use the query result bits from revision 130 None of the important bits or names actually changed. It just added/removed some no-op names. No functional change.	2015-07-06 17:27:11 -07:00
Jason Ekstrand	d843418c2e	vk/vulkan.h: One more quick enum refactor clean-up	2015-07-06 17:26:29 -07:00
Jason Ekstrand	2b37fc28d1	vk/vulkan.h: Get rid of VERTEX_INPUT_STEP_RATE_DRAW We never supported it, so no functional change.	2015-07-06 17:24:26 -07:00
Jason Ekstrand	a75967b1bb	vk/vulkan.h: Remove the CLEAR_OPTIMAL image layout	2015-07-06 17:21:19 -07:00
Jason Ekstrand	2b404e5d00	vk: Rename CPU_READ/WRITE_BIT to HOST_READ/WRITE_BIT	2015-07-06 17:18:25 -07:00
Jason Ekstrand	c57ca3f16f	vk/vulkan.h: Remove VK_IMAGE_CREATE_CLONEABLE_BIT	2015-07-06 17:14:30 -07:00
Jason Ekstrand	2de388c49c	vk: Remove SHAREABLE bits They were removed from the Vulkan API and we don't really use them because there are no multi-GPU i965 systems.	2015-07-06 17:12:51 -07:00
Jason Ekstrand	1b0c47bba6	vk/vulkan.h: Re-order the logic op enums	2015-07-06 17:08:11 -07:00
Jason Ekstrand	c7cef662d0	vk/vulkan.h: Reformat a bunch of enums to match revision 130 In theory, no functional change.	2015-07-06 17:06:02 -07:00
Jason Ekstrand	8c5e48f307	vk: Rename NUM_SHADER_STAGE to SHADER_STAGE_NUM This is a refactor of more than just the header but it lets us finish reformatting the shader stage enum.	2015-07-06 16:43:28 -07:00
Jason Ekstrand	d9176f2ec7	vk: Reformat a bunch of enums This accounts for a number of differences between the generated headers and the hand-written header. Not all reformatting is done in this commit but it does make the headers much more diffable. In theory, no functional change.	2015-07-06 16:41:31 -07:00
Jason Ekstrand	e95bf93e5a	vk: Pull the VkResult enum from revision 130	2015-07-06 16:15:12 -07:00
Jason Ekstrand	1b7b580756	vk: re-arrange enums to match the order in revision 130	2015-07-06 16:11:05 -07:00
Jason Ekstrand	2fb524b369	vk: Rename a parameter in CmdBindDynamicStateObject	2015-07-06 15:37:17 -07:00
Jason Ekstrand	c5ffcc9958	vk: Remove multi-device stuff	2015-07-06 15:34:55 -07:00
Jason Ekstrand	c5ab5925df	vk: Remove ClearDescriptorSets	2015-07-06 15:32:40 -07:00
Jason Ekstrand	ea5fbe1957	vk: Remove begin/end descriptor pool update	2015-07-06 15:32:27 -07:00
Jason Ekstrand	9a798fa946	vk: Remove stub for CloneImageData	2015-07-06 15:30:05 -07:00
Jason Ekstrand	78a0d23d4e	vk: Remove the stub support for memory priorities	2015-07-06 15:28:10 -07:00
Jason Ekstrand	11cf214578	vk: Remove the stub support for explicit memory references	2015-07-06 15:27:58 -07:00
Jason Ekstrand	0dc7d4ac8a	vk/vulkan.h: Reformat structs to match revision 130 Structs in the old version were specified as typedef struct VkSomeThing_ { type field; // comment } VkSomeThing; However, in the generated headers, you have typedef struct { type field; } VkSomeThing; This commit also removes some unneeded whitespaces.	2015-07-06 15:19:12 -07:00
Jason Ekstrand	19aabb5730	vk/vulkan.h: Re-arrange structures to match the order in 130	2015-07-06 15:09:30 -07:00
Connor Abbott	f9dbc34a18	nir/spirv: fix some bugs	2015-07-06 15:00:37 -07:00
Connor Abbott	f3ea3b6e58	nir/spirv: add support for builtins inside structures We may be able to revert this depending on the outcome of bug 14190, but for now it gets vertex shaders working with SPIR-V.	2015-07-06 15:00:37 -07:00
Connor Abbott	15047514c9	nir/spirv: fix a bug with structure creation We were creating 2 extra bogus fields.	2015-07-06 15:00:37 -07:00
Connor Abbott	73351c6a18	nir/spirv: fix a bad assertion in the decoration handling We should be asserting that the parent decoration didn't hand us a member if the child decoration did, but different child decorations may obviously have different members.	2015-07-06 15:00:37 -07:00
Connor Abbott	70d2336e7e	nir/spirv: pull out logic for getting builtin locations Also add support for more builtins.	2015-07-06 15:00:37 -07:00
Connor Abbott	aca5fc6af1	nir/spirv: plumb through the type of dereferences We need this to know if a deref is of a builtin.	2015-07-06 15:00:37 -07:00
Connor Abbott	66375e2852	nir/spirv: handle structure member builtin decorations	2015-07-06 15:00:37 -07:00
Connor Abbott	23c179be75	nir/spirv: add a vtn_type struct This will handle decorations that aren't in the glsl_type.	2015-07-06 15:00:37 -07:00
Connor Abbott	f9bb95ad4a	nir/spirv: move 'type' into the union Since SSA values now have their own types, it's more convenient to make 'type' only used when we want to look up an actual SPIR-V type, since we're going to change its type soon to support various decorations that are handled at the SPIR-V -> NIR level.	2015-07-06 15:00:37 -07:00
Jason Ekstrand	d5dccc1e7a	vk: Move CreateFramebuffer and CreateRenderPass higher in the header This matches where they are in the 130 header.	2015-07-06 14:41:43 -07:00
Jason Ekstrand	4a42f45514	vk: Remove atomic counters stubs	2015-07-06 14:38:45 -07:00
Jason Ekstrand	630b19a1c8	vk: Make vulkan.h look more like vulkan-130.h Most of these changes are insubstantial. The only potentially substantial change is that we added a few new #defines for API maximums.	2015-07-06 14:32:52 -07:00
Jason Ekstrand	2f9180b1b2	vk: Add a revision 130 header along-side the current header	2015-07-06 14:16:51 -07:00
Jason Ekstrand	1f1465f077	vk/meta: Add an initial implementation of ClearColorImage	2015-07-02 18:15:06 -07:00
Jason Ekstrand	8a6c8177e0	vk/meta: Factor the guts out of cmd_buffer_clear	2015-07-02 18:13:59 -07:00
Jason Ekstrand	beb0e25327	vk: Roll back to API v90 This is what version 0.1 of the Vulkan SDK is built against.	2015-07-01 16:44:12 -07:00
Jason Ekstrand	fa663c27f5	nir/spirv: Add initial structure member decoration support	2015-07-01 15:38:26 -07:00
Jason Ekstrand	e3d60d479b	nir/spirv: Make vtn_handle_type match the other handler functions Previously, the caller of vtn_handle_type had to handle actually inserting the type. However, this didn't really work if the type was decorated in any way.	2015-07-01 15:34:10 -07:00
Jason Ekstrand	7a749aa4ba	nir/spirv: Add basic support for Op[Group]MemberDecorate	2015-07-01 14:18:07 -07:00
Jason Ekstrand	682eb9489d	vk/x11: Allow for the client querying the size of the format properties	2015-07-01 14:18:07 -07:00
Chad Versace	bba767a9af	vk/formats: Fix entry for S8_UINT I forgot to update this when fixing the depth formats.	2015-06-30 09:41:44 -07:00
Chad Versace	6720b47717	vk/formats: Document new meaning of anv_format::cpp The way the code currently works is that anv_format::cpp is the cpp of anv_format::surface_format. Me and Kristian disagree about how the code should work. Despite that, I think it's in our discussion's best interest to document how the code currently works. That should eliminate confusion. If and when the code begins to work differently, then we'll update the anv_format comments.	2015-06-30 09:41:41 -07:00
Chad Versace	709fa463ec	vk/depth: Add a FIXME 3DSTATE_DEPTH_BUFFER.Width,Height are wrong.	2015-06-26 22:15:03 -07:00
Chad Versace	5b3a1ceb83	vk/image: Enable 2d single-sample color miptrees What's been tested, for both image views and color attachment views: - VK_FORMAT_R8G8B8A8_UNORM - VK_IMAGE_VIEW_TYPE_2D - mipLevels: 1, 2 - baseMipLevel: 0, 1 - arraySize: 1, 2 - baseArraySlice: 0, 1 What's known to be broken: - Depth and stencil miptrees. To fix this, anv_depth_stencil_view needs major rework. - VkImageViewType != 2D - MSAA Fixes Crucible tests: func.miptree.view-2d.levels02.array01.* func.miptree.view-2d.levels01.array02.* func.miptree.view-2d.levels02.array02.*	2015-06-26 22:11:15 -07:00
Chad Versace	c6e76aed9d	vk/image: Define anv_surface, refactor anv_image This prepares for upcoming miptree support. anv_surface is a proxy for color surfaces, depth surfaces, and stencil surfaces. Embed two instances of anv_surface into anv_image: the primary surface (color or depth), and an optional stencil surface.	2015-06-26 21:45:53 -07:00
Chad Versace	127cb3f6c5	vk/image: Reformat function signatures Reformat them to match Mesa code-style.	2015-06-26 20:12:42 -07:00
Chad Versace	fdcd71f71d	vk/image: Embed VkImageCreateInfo* into anv_image_create_info All function signatures that matched this pattern, old: f(const VkImageCreateInfo *, const struct anv_image_create_info *) were rewritten as new: f(const struct anv_image_create_info *)	2015-06-26 20:06:08 -07:00
Chad Versace	ca6cef3302	vk/image: Drop some tmp vars in anv_image_view_init() Variables 'tile_mode' and 'format' are unneeded.	2015-06-26 19:50:04 -07:00
Chad Versace	9c46ba9ca2	vk/image: Abort on stencil image views The code doesn't work. Not even close. Replace the broken code with a FINISHME and abort.	2015-06-26 19:23:21 -07:00
Chad Versace	667529fbaa	vk: Reindent struct anv_image	2015-06-26 15:27:20 -07:00
Chad Versace	74e3eb304f	vk: Define MIN(a, b) macro	2015-06-26 15:09:07 -07:00
Chad Versace	55752fe94a	vk: Rename functions ALIGN_32 -> align_32 ALIGN_U32 and ALIGN_I32 are functions, not macros. So stop using allcaps.	2015-06-26 15:07:59 -07:00
Connor Abbott	6ee082718f	Merge branch 'wip/nir-vtn' into vulkan Adds composites and matrix multiplication, plus some control flow fixes.	2015-06-26 12:14:05 -07:00
Chad Versace	37d6e04ba1	vk/formats: Remove the cpp=0 stencil hack The format table defined cpp = 0 for stencil-only formats. The real cpp is 1. When code begins to lie, especially about stencil buffers, code becomes increasingly fragile as time progresses, and the damage becomes increasingly hard to undo. (For precedent, see the painful history of stencil buffer cpp in the git log for gen6 and gen7 in the i965 driver). Let's undo the stencil buffer cpp lie now to avoid future pain. In the format table, set cpp = 1 for VK_FORMAT_S8; replace checks for cpp == 0; and delete all comments about the hack.	2015-06-26 09:58:22 -07:00
Chad Versace	67a7659d69	vk/image: Refactor anv_image_create() From my experience with intel_mipmap_tree.c, I learned that for structs like anv_image and intel_mipmap_tree, which have sprawling multi-function construction codepaths, it's easy to mistakenly use uninitialized struct members during construction. Let's eliminate the risk of using uninitialized anv_image members during construction. Fill the struct at the function bottom instead of piecemeal throughout the constructor.	2015-06-26 09:32:59 -07:00
Chad Versace	5d7103ee15	vk/image: Group some assertions closer together In anv_image_create(), group together the assertions on VkImageCreateInfo.	2015-06-26 09:05:46 -07:00
Chad Versace	0349e8d607	vk/formats: #undef fmt at end of format table	2015-06-26 07:38:02 -07:00
Chad Versace	068b8a41e2	vk: Fix comment for anv_depth_stencil_view::stencil_qpitch s/DEPTH/STENCIL/	2015-06-26 07:31:57 -07:00
Chad Versace	7ea707a42a	vk/image: Add qpitch fields to anv_depth_stencil_view For now, hard-code them to 0.	2015-06-25 20:10:16 -07:00
Chad Versace	b91a76de98	vk: Reindent and document struct anv_depth_stencil_view	2015-06-25 20:10:16 -07:00
Chad Versace	ebe1e768b8	vk/formats: Fix incorrect depth formats anv_format::surface_format was incorrect for Vulkan depth formats. For example, the format table mapped VK_FORMAT_D24_UNORM -> .surface_format = D24_UNORM_X8_UINT VK_FORMAT_D32_FLOAT -> .surface_format = D32_FLOAT but should have mapped VK_FORMAT_D24_UNORM -> .surface_format = R24_UNORM_X8_TYPELESS VK_FORMAT_D32_FLOAT -> .surface_format = R32_FLOAT The Crucible test func.depthstencil.basic passed despite the bug, but only because it did not attempt to texture from the depth surface. The core problem is that RENDER_SURFACE_STATE.SurfaceFormat and 3DSTATE_DEPTH_BUFFER.SurfaceFormat are distinct types. Considering them as enum spaces, the two enum spaces have incompatible collisions. Fix this by adding a new field 'depth_format' to struct anv_format. Refer to brw_surface_formats.c:translate_tex_format() for precedent.	2015-06-25 20:10:16 -07:00
Chad Versace	45b804a049	vk/image: Rename local variable in anv_image_create() This function has many local variables for info structs. Having one named simply 'info' is confusing. Rename it to 'format_info'.	2015-06-25 20:10:16 -07:00
Chad Versace	528071f004	vk/formats: Fix table entry for R8G8B8_SNORM Now that anv_formats[] is formatted like a table, buggy entries are easier to see.	2015-06-25 20:10:16 -07:00
Chad Versace	4c8146313f	vk/formats: Rename anv_format::format -> surface_format I misinterpreted anv_format::format as a VkFormat. Instead, it is a hardware surface format (RENDER_SURFACE_STATE.SurfaceFormat). Rename the field to 'surface_format' to make it unambiguous.	2015-06-25 20:10:16 -07:00
Chad Versace	4b8b451a1d	vk/formats: Rename anv_format::channels -> num_channels I misinterpreted anv_format::channels as a bitmask of channels. Renaming it to 'num_channels' makes it unambiguous.	2015-06-25 20:10:16 -07:00
Chad Versace	af0ade0d6c	vk: Reindent struct anv_format	2015-06-25 20:10:16 -07:00
Chad Versace	ae29fd1b55	vk/formats: Don't abbreviate tokens in the format table Abbreviating the VK_FORMAT_* tokens doesn't help much. To the contrary, it means grep and ctags can't find them.	2015-06-25 20:10:16 -07:00
Jason Ekstrand	d5e41a3a99	vk/compiler: Add the initial hacks to get SPIR-V up and going	2015-06-25 17:36:35 -07:00
Jason Ekstrand	c4c1d96a01	HACK: Get rid of sanity_param_count for FS	2015-06-25 17:36:34 -07:00
Jason Ekstrand	4f5ef945e0	i965: Don't print the GLSL IR if it doesn't exist	2015-06-25 17:36:34 -07:00
Jason Ekstrand	588acdb431	nir/spirv: Set the right location for shader input/outputs We need to add FRAG_RESULT_DATA0 etc. to the input/output location.	2015-06-25 17:36:34 -07:00
Jason Ekstrand	333b8ddd6b	nir/spirv: Set the interface type on uniform blocks	2015-06-25 17:36:34 -07:00
Jason Ekstrand	7e1792b1b7	nir/spirv: Set the system value mode on builtins	2015-06-25 17:36:34 -07:00
Jason Ekstrand	b72936fdad	nir/spirv: Actually put variables on the right linked list	2015-06-25 17:36:34 -07:00
Jason Ekstrand	ee0a8f23e4	glsl: Move vert_attrib varying_slot and frag_result enums to shader_enums.h	2015-06-25 17:36:34 -07:00
Chad Versace	fa352969a2	vk/image: Check extent does not exceed surface type limits	2015-06-25 16:53:24 -07:00
Chad Versace	99031aa0f3	vk/image: Stop hardcoding SurfaceType of VkImageView Instead, translate VkImageViewType to a gen SurfaceType.	2015-06-25 16:53:22 -07:00
Chad Versace	7ea121687c	vk/image: Add anv_image::surf_type This is the gen SurfaceType, such as SURFTYPE_2D.	2015-06-25 16:52:16 -07:00
Chad Versace	cb30acaced	vk/image: Add tables for gen SurfaceType Tables for mapping VkImageType and VkImageViewType to gen SurfaceType. Tables are unused.	2015-06-25 16:52:16 -07:00
Chad Versace	1132080d5d	vk/util: Add anv_loge() for logging error messages	2015-06-25 16:52:16 -07:00
Chad Versace	5f2d469e37	vk: Add func anv_is_aligned()	2015-06-25 16:52:16 -07:00
Chad Versace	f7fb7575ef	vk: Add anv_minify()	2015-06-25 16:52:05 -07:00
Chad Versace	7cec6c5dfd	vk: Define MAX(a, b) macro	2015-06-25 16:29:42 -07:00
Jason Ekstrand	d178e15567	nir/spirv: Fix up some deref ralloc parenting	2015-06-24 21:39:07 -07:00
Jason Ekstrand	845002e163	i965/nir: Handle returns as long as they're at the end of a function	2015-06-24 21:38:49 -07:00
Jason Ekstrand	2ecac045a4	i965/nir: Split NIR shader handling into two functions The brw_create_nir function takes a GLSL or ARB shader and turns it into a NIR shader. The guts of the optimization and lowering code is now split into a new brw_process_shader function.	2015-06-24 21:22:07 -07:00
Jason Ekstrand	e369a0eb41	nir/spirv: Use vtn_ssa_value for texture coordinates	2015-06-24 20:39:37 -07:00
Jason Ekstrand	d0bd2bc604	nir/spirv: Add support for the Uniform storage class This is kinda sketchy. I'm not really sure this is the way it's supposed to be used.	2015-06-24 20:32:05 -07:00
Jason Ekstrand	ba0d9d33d4	nir/spirv: Add support for some more decorations including built-in	2015-06-24 20:30:32 -07:00
Jason Ekstrand	1bc0a1ad98	nir/spirv: Make the header file C++ safe	2015-06-24 19:01:10 -07:00
Jason Ekstrand	88d02a1b27	vk: Build xmlconfig stuff into libi965_compiler	2015-06-24 15:59:09 -07:00
Kristian Høgsberg Kristensen	24dff4f8fa	vk/headers: Handle MBO fields These must be set to one.	2015-06-24 09:37:50 -07:00
Jason Ekstrand	a62edcce4e	Merge remote-tracking branch 'mesa-public/master' into vulkan	2015-06-23 18:05:25 -07:00
Connor Abbott	dee4a94e69	nir/vtn: add support for phi nodes	2015-06-23 10:34:55 -07:00
Connor Abbott	fe1269cf28	nir/builder: add support for inserting before/after blocks	2015-06-23 10:34:22 -07:00
Connor Abbott	9a3dda101e	nir/vtn: fix emitting code after loops When we're done emitting the code for a loop, we need to visit the new break block, which is the merge block of the current loop, rather than the old merge block, which is the merge block of the loop containing the one we just emitted code for.	2015-06-22 13:53:08 -07:00
Connor Abbott	e9c21d0ca0	unbreak things	2015-06-22 11:59:55 -07:00
Kristian Høgsberg Kristensen	9b9f973ca6	vk: Implement scratch buffers to make spilling work	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	9e59003fb1	vk: Undo relocs for scratch bos	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	b20794cfa8	vk/allocator: Get rid of non-memfd path We can just use modern valgrind now.	2015-06-19 15:42:15 -07:00
Kristian Høgsberg Kristensen	aba75d0546	vk/headers: Make General State offsets relocations	2015-06-19 15:42:15 -07:00
Connor Abbott	841aab6f50	matrices matrices matrices	2015-06-18 18:52:44 -07:00
Connor Abbott	d0fc04aacf	nir/types: be less strict about constructing matrix types	2015-06-18 18:51:51 -07:00
Connor Abbott	22854a60ef	nir/builder: add a nir_fdot() convenience function	2015-06-18 17:34:55 -07:00
Connor Abbott	0e86ab7c0a	nir/types: add a helper to transpose a matrix type	2015-06-18 17:34:12 -07:00
Connor Abbott	de4c31a085	fix glsl450 for composites	2015-06-18 17:33:08 -07:00
Kristian Høgsberg Kristensen	aedd3c9579	vk: Add missing gen7 RENDER_SURFACE_STATE struct	2015-06-17 21:42:29 -07:00
Connor Abbott	bf5a615659	composites composites composites	2015-06-17 16:25:38 -07:00
Kristian Høgsberg Kristensen	fa8a07748d	vk: Compute CS exec mask and thread width max in pipeline We compute the right mask and thread width max parameters as part of pipeline creation and set them accordingly at vkCmdDispatch() and vkCmdDispatchIndirect() time. These parameters depend only on the local group size and the dispatch width of the program so we can figure this out at pipeline create time.	2015-06-12 18:21:50 -07:00
Kristian Høgsberg Kristensen	c103c4990c	vk: Set binding table layout for CS We weren't setting the binding table layout for the backend compiler.	2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen	2fdd17d259	vk: Generate CS prog_data into the pipeline instance We were generating the prog_data into a local variable and never initializing the pipeline->cs_prog_data one.	2015-06-12 18:21:49 -07:00
Kristian Høgsberg Kristensen	00494c6cb7	vk: Document how depth/stencil formats work in anv_image_create() This reverts commits `e17ed04` * vk/image: Don't double-allocate stencil buffers `1ee2d1c` * vk/image: Teach anv_image_choose_tile_mode about WMAJOR and instead adds a comment to describe the subtlety of how we create images for stencil only formats.	2015-06-11 22:07:16 -07:00
Kristian Høgsberg Kristensen	fbc9fe3c92	vk: Use compute pipeline layout when binding compute sets	2015-06-11 21:57:43 -07:00
Kristian Høgsberg Kristensen	765175f5d1	vk: Implement basic compute shader support	2015-06-11 15:31:42 -07:00
Kristian Høgsberg Kristensen	7637b02aaa	vk: Emit PIPELINE_SELECT on demand	2015-06-11 15:21:49 -07:00
Kristian Høgsberg Kristensen	405697eb3d	vk: Stop asserting we have a fragment shader Even for graphics, this is not a requirement, we can have a depth-only output pipeline.	2015-06-11 15:07:38 -07:00
Kristian Høgsberg Kristensen	e7edde60ba	vk: Defer setting viewport dynamic state We can't emit this until we've done a 3D pipeline select.	2015-06-11 15:04:09 -07:00
Kristian Høgsberg Kristensen	f7fe06cf0a	vk: Disable shader stages in the graphics pipeline batch We need to move this into the graphics pipeline batch so we don't emit it for compute pipelines.	2015-06-11 14:58:31 -07:00
Kristian Høgsberg Kristensen	9aae480cc4	vk: Don't emit STATE_SIP We don't have a SIP kernel and don't enable exceptions.	2015-06-11 14:56:29 -07:00
Kristian Høgsberg Kristensen	923e923bbc	vk: Compile fragment shader after VS and GS Just moving code around to do shader stages in the natural order.	2015-06-11 14:55:50 -07:00
Jason Ekstrand	1dd63fcbed	vk/entrypoints: Don't print every single function call	2015-06-11 10:10:13 -07:00
Kristian Høgsberg Kristensen	b581e924b6	vk: Remove left-over trp call	2015-06-11 09:26:49 -07:00
Kristian Høgsberg Kristensen	d76ea7644a	vk: Set maximum point size range We set both minimum and maximum point size to 0 in 3DSTATE_CLIP, which will clip away all points.	2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen	a5b49d2799	vk: Use generated headers with fixed point support The generated headers now convert float in the template struct to the correct fixed point format.	2015-06-11 09:25:04 -07:00
Kristian Høgsberg Kristensen	ea7ef46cf9	vk: Regenerate headers with __gen_validate_value()	2015-06-11 09:25:03 -07:00
Jason Ekstrand	a566b1e08a	vk/formats: Refactor format properties code Along with the refactor, we now do the right thing when we hit an unsupported format: Set the flags to 0 and return VK_SUCCESS.	2015-06-11 09:11:16 -07:00
Jason Ekstrand	2a3c29698c	vk/image: Add a bunch of asserts	2015-06-10 21:04:51 -07:00
Jason Ekstrand	c8b62d109b	vk: Add a couple vk_error calls	2015-06-10 21:04:13 -07:00
Jason Ekstrand	7153b56abc	vk/private: Add a non-fatal assert	2015-06-10 21:03:50 -07:00
Jason Ekstrand	29d2bbb2b5	vk/cmd: Add an initial implementation of PipelineBarrier We may want to do something more intelligent here later such as actually handling image layout transitions. However, this should do for now.	2015-06-10 16:37:33 -07:00
Jason Ekstrand	047ed02723	vk/emit: Use valgrind to validate every packed field	2015-06-10 12:43:02 -07:00
Jason Ekstrand	9cae3d18ac	vk: Add valgrind checks in various emit functions The check in batch_bo_finish should catch any undefined values in the batch but isn't that great for debugging. The checks in the various emit functions will help get better granularity.	2015-06-09 21:51:37 -07:00
Jason Ekstrand	d5ad24e39b	vk: Move the valgrind include and VG() macro to private.h	2015-06-09 21:51:37 -07:00
Chad Versace	e17ed04b03	vk/image: Don't double-allocate stencil buffers If the main surface has format S8_UINT, then don't allocate the auxiliary stencil surface.	2015-06-09 16:39:28 -07:00
Chad Versace	1ee2d1c3fc	vk/image: Teach anv_image_choose_tile_mode about WMAJOR	2015-06-09 16:38:55 -07:00
Chad Versace	2d2e148952	vk/util: Add anv_abortf(), anv_abortfv() Convenience functions to print an error message then abort.	2015-06-09 16:38:50 -07:00
Chad Versace	ffb1ee5d20	vk: Define anv_noreturn macro	2015-06-09 16:38:46 -07:00
Chad Versace	f1db3b3869	vk/image: Factor tile mode selection into separate function Because it will eventually need to get smarter.	2015-06-09 16:38:42 -07:00
Jason Ekstrand	11e941900a	vk/device: Actually allow destruction	2015-06-09 16:28:46 -07:00
Jason Ekstrand	5d4b6a01af	vk/cmd_buffer: Properly initialize/reset dynamic states	2015-06-09 16:27:55 -07:00
Jason Ekstrand	634a6150b9	vk/pipeline: Zero out the depth-stencil state when not in use	2015-06-09 16:26:55 -07:00
Jason Ekstrand	919e7b7551	vk/device: Use anv_CreateDynamicViewportState instead of the vk one	2015-06-09 16:01:56 -07:00
Jason Ekstrand	0599d39dd9	vk/device: Dedent the vkCreateDynamicViewportState call	2015-06-09 15:53:26 -07:00
Chad Versace	d57c4cf999	vk/util: Annotate anv_finishme() as printflike	2015-06-09 14:46:49 -07:00
Chad Versace	822cb16abe	vk: Define anv_printflike() macro	2015-06-09 14:46:45 -07:00
Chad Versace	081f617b5a	vk/image: Stop hardcoding alignment of stencil surfaces Look up the alignment from anv_tile_info_table.	2015-06-09 14:16:56 -07:00
Chad Versace	e6bd568f36	vk/image: Rewrite tile info table - Reduce the number of table lookups in anv_image_create from 4 to 1. - Add field for surface alignment. - Shorten field names tile_width, tile_height -> width, height.	2015-06-09 14:16:45 -07:00
Chad Versace	5b777e2bcf	vk/image: Delete an old comment	2015-06-09 14:14:29 -07:00
Jason Ekstrand	d842a6965f	vk/compiler: Free the GL errors data	2015-06-09 12:36:23 -07:00
Jason Ekstrand	9f292219bf	vk/compiler: Free more of prog_data when tearing down a pipeline	2015-06-09 12:36:23 -07:00
Jason Ekstrand	66b00d5e5a	vk/queue: Embed the queue in and allocate it with the device	2015-06-09 12:36:23 -07:00
Jason Ekstrand	38f5eef59d	vk/device: Free border color states when we have valgrind	2015-06-09 12:36:23 -07:00
Jason Ekstrand	999b56c507	vk/device: Destroy all batch buffers Due to a copy+paste error, we were destroying all but the first batch or surface state buffer. Now we destroy them all.	2015-06-09 12:36:23 -07:00
Jason Ekstrand	3a38b0db5f	vk/meta: Clean up temporary objects	2015-06-09 12:36:23 -07:00
Jason Ekstrand	9d6f55dedf	vk/surface_view: Add a destructor	2015-06-09 12:36:23 -07:00
Chad Versace	e6162c2fef	vk/image: Add anv_image::h_align,v_align Use the new fields to compute RENDER_SURFACE_STATE.Surface*Alignment. We still hardcode them to 4, though.	2015-06-09 12:19:24 -07:00
Jason Ekstrand	58afc24e57	vk/allocator: Remove the concept of a slave block pool This reverts commit `d24f8245db`.	2015-06-08 17:46:32 -07:00
Jason Ekstrand	b6363c3f12	vk/device: Remove the binding table pools/streams	2015-06-08 17:45:57 -07:00
Jason Ekstrand	531549d9fc	vk/pipeline: Move freeing the program stream to pipeline.c It's created in pipeline.c so we should free it there.	2015-06-08 14:27:04 -07:00
Jason Ekstrand	66a4dab89a	vk/pipeline: Don't destroy the program stream It's freed in compiler.cpp and we don't want to free it twice.	2015-06-08 13:53:19 -07:00
Jason Ekstrand	920fb771d4	vk/allocator: Make the use of NULL_BLOCK in state_stream_finish explicit	2015-06-08 13:53:19 -07:00
Kristian Høgsberg Kristensen	52637c0996	vk: Quiet a few warnings	2015-06-08 08:51:40 -07:00
Kristian Høgsberg Kristensen	9eab70e54f	vk: Create a minimal context for the compiler This avoids the full brw context initialization and just sets up context constants, initializes extensions and sets a few driver vfuncs for the front-end GLSL compiler.	2015-06-08 08:51:40 -07:00
Jason Ekstrand	ce00233c13	vk/cmd_buffer: Use the dynamic state stream in emit_dynamic and merge_dynamic	2015-06-05 17:26:41 -07:00
Jason Ekstrand	e69588b764	vk/device: Use a 64-byte alignment for CC state	2015-06-05 17:26:26 -07:00
Jason Ekstrand	c2eeab305b	vk/pipeline: Actually free the program stream and dynamic pool	2015-06-05 17:26:26 -07:00
Jason Ekstrand	ed2ca020f8	vk/allocator: Avoid double-free in the bo pool	2015-06-05 17:12:28 -07:00
Jason Ekstrand	aa523d3c62	vk/gem: Call VALGRIND_FREELIKE_BLOCK before unmapping	2015-06-05 16:41:49 -07:00
Chad Versace	87d98e1935	vk: Fix 2 incorrect typecasts The compiler didn't find the cast errors because all Vulkan types are just integers.	2015-06-04 14:32:22 -07:00
Chad Versace	b981379bcf	vk: Make `make clean` remove generated spirv headers	2015-06-04 14:26:46 -07:00
Jason Ekstrand	8d930da35d	vk/allocator: Remove an unneeded VG() wrapper	2015-06-04 09:14:33 -07:00
Jason Ekstrand	7f90e56e42	vk/device: Disallow device destruction	2015-06-04 09:14:33 -07:00
Chad Versace	9cd42b3dea	vk: Fix build Commit 1286bd, which deleted vk.c, broke the build. Update the Makefile to fix it.	2015-06-04 09:01:30 -07:00
Jason Ekstrand	251aea80b0	vk/DS: Mask stencil masks to 8 bits	2015-06-03 16:59:13 -07:00
Connor Abbott	47bd462b0c	awesome control flow bugfixes/clarifications	2015-06-03 14:10:28 -04:00
Kristian Høgsberg Kristensen	a37d122e88	vk: Set color/blend state in meta clear if not set yet	2015-06-02 23:08:05 -07:00
Kristian Høgsberg Kristensen	1286bd3160	vk: Delete vk.c test case We now have crucible up and running and all vk sub-cases have been moved over. Delete this crufty old hack of a test case.	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	2f6aa424e9	vk: Update generated headers with support for 64 bit fields	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	5744d1763c	vk: Set cb_state to NULL at cmd buffer create time Dynamic color/blend state can be NULL in case we're not rendering to color targets (only output to depth and/or stencil). Initialize cmd_buffer->cb_state to NULL so we can reliably detect whether it's been set or not.	2015-06-02 22:57:42 -07:00
Kristian Høgsberg Kristensen	c8f078537e	vk: Implement vertexOffset parameter of vkCmdDrawIndexed() As exposed by the func.draw_indexed test, we were ignoring the argument and hardcoding 0.	2015-06-02 22:57:42 -07:00
Jason Ekstrand	e702197e3f	vk/formats: Add a name to the metadata and better logging	2015-06-02 11:30:39 -07:00
Jason Ekstrand	fbafc946c6	vk/formats: Rework the formats table	2015-06-02 11:30:39 -07:00
Kristian Høgsberg Kristensen	f98c89ef31	vk: Move query related functionality to new file query.c	2015-06-01 21:52:45 -07:00
Jason Ekstrand	08748e3a0c	i965: Use NIR by default for vertex shaders on GEN8+ GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell: total instructions in shared programs: 2742062 -> 2681339 (-2.21%) instructions in affected programs: 1514770 -> 1454047 (-4.01%) helped: 5813 HURT: 1120 The gained programs are ARB vertex programs that were previously going through the vec4 backend. Now that we have prog_to_nir, ARB vertex programs can go through the scalar backend so they show up as "gained" in the shader-db results. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-06-01 12:25:58 -07:00
Jason Ekstrand	d4cbf6a728	vk/compiler: Add an index_count to the bind map and check for OOB	2015-06-01 12:25:58 -07:00
Jason Ekstrand	510b5c3bed	vk/HACK: Plumb real descriptor set/index into textures	2015-06-01 12:25:58 -07:00
Jason Ekstrand	aded32bf04	NIR: Add a helper for doing sampler lowering for vulkan	2015-06-01 12:25:58 -07:00
Kristian Høgsberg Kristensen	5caa408579	vk: Indent tables to align '=' at column 48	2015-05-31 22:36:26 -07:00
Kristian Høgsberg Kristensen	76bb658518	vk: Add support for anisotropic bits	2015-05-31 22:15:34 -07:00
Kristian Høgsberg Kristensen	dc56e4f7b8	vk: Implement support for sampler border colors This supports the three Vulkan border color types for float color formats. The support for integer formats is a little trickier, as we don't know the format of the texture at this time.	2015-05-31 17:20:48 -07:00
Jason Ekstrand	e497ac2c62	vk/device: Only flush the texture cache when setting state base address After further examination, it appears that the other flushes and stalls weren't actually needed.	2015-05-30 18:04:50 -07:00
Jason Ekstrand	2251305e1a	vk/cmd_buffer: Track descriptor set dirtying per-stage	2015-05-30 10:07:29 -07:00
Jason Ekstrand	33cccbbb73	vk/device: Emit PIPE_CONTROL flushes surrounding new STATE_BASE_ADDRESS According to the bspec, you're supposed to emit a PIPE_CONTROL with a CS stall and a render target flush prior to changing STATE_BASE_ADDRESS. A little experimentation, however, shows that this is not enough. It also appears as if you have to flush the texture cache after changing base address or things won't propagate properly.	2015-05-30 08:08:07 -07:00
Jason Ekstrand	b2b9fc9fad	vk/allocator: Don't call VALGRIND_MALLOCLIKE_BLOCK on fresh gem_mmap's	2015-05-29 21:15:47 -07:00
Jason Ekstrand	03ffa9ca31	vk: Don't crash on partial descriptor sets	2015-05-29 20:43:10 -07:00
Jason Ekstrand	4ffbab5ae0	vk/device: Allow for starting a new surface state buffer This commit allows for us to create a whole new surface state buffer when the old one runs out of room. We simply re-emit the state base address for the new state, re-emit binding tables, and keep going.	2015-05-29 17:49:41 -07:00
Jason Ekstrand	c4bd5f87a0	vk/device: Do lazy surface state emission for binding tables Before, we were emitting surface states up-front when binding tables were updated. Now, we wait to emit the surface states until we emit the binding table. This makes meta simpler and should make it easier to deal with swapping out the surface state buffer.	2015-05-29 16:51:11 -07:00
Kristian Høgsberg Kristensen	4aecec0bd6	vk: Store dynamic slot index with struct anv_descriptor_slot We need to make sure we use the right index into dynamic offset array. Dynamic descriptors can be present or not in different stages and to get the right offset, we need to compute the index at vkCreateDescriptorSetLayout time.	2015-05-29 11:32:53 -07:00
Kristian Høgsberg Kristensen	fad418ff47	vk: Implement dynamic buffer offsets We do this by creating a surface state on the fly that incorporates the dynamic offset. This patch also refactors the descriptor set layout constructor a bit to be less clever with switch statement fall-through. Instead of duplicating the subtle code to update the sampler and surface slot map, we just use two switch statements.	2015-05-28 22:41:20 -07:00
Jason Ekstrand	9ffc1bed15	vk/device: Split state base address emit into its own function	2015-05-28 15:34:08 -07:00
Jason Ekstrand	468c89a351	vk/device: Use anv_batch_emit for MI_BATCH_BUFFER_START	2015-05-28 15:25:02 -07:00
Jason Ekstrand	2dc0f7fe5b	vk/device: Actually destroy batch buffers	2015-05-28 13:08:21 -07:00
Jason Ekstrand	8cf932fd25	vk/query: Don't emit a CS stall by itself Both the bspec and the simulator don't like this. I'm not sure if stalling at the scoreboard is right but it at least shuts up the simulator.	2015-05-28 10:27:53 -07:00
Jason Ekstrand	730ca0efb1	vk/device: Fixups for batch buffer chaining Somehow these didn't get merged with the other batch buffer chaining stuff. Oh well, it's here now.	2015-05-28 10:26:11 -07:00
Jason Ekstrand	de221a672d	meta: Add a default ds_state and use it when no ds state is set	2015-05-28 10:06:45 -07:00
Jason Ekstrand	6eefeb1f84	vk/meta: Share the dummy RS and CB state between clear and blit	2015-05-28 10:00:38 -07:00
Kristian Høgsberg Kristensen	5a317ef4cb	vk: Initialize dynamic state binding points to NULL We rely on these being initialized to NULL so meta can reliably detect whether or not they've been set. ds_state is also allowed to not be present so we need a well-defined value for that.	2015-05-27 22:13:48 -07:00
Chad Versace	1435bf4bc4	.gitignore: Ignore spirv2nir binary	2015-05-27 17:01:09 -07:00
Chad Versace	f559fe9134	.gitignore: Scope Vulkan's generated source files Don't ignore any file named entrypoints.{c,h}. Ignore it only if it's in src/vulkan.	2015-05-27 16:59:53 -07:00
Chad Versace	ca385dcf2a	vk: gitignore generated source files	2015-05-27 16:57:31 -07:00
Chad Versace	466f61e9f6	vk/glsl_scraper: Replace adhoc arg parsing with argparse	2015-05-27 16:56:02 -07:00
Chad Versace	fab9011c44	vk/image: Assert that VkImageTiling is valid	2015-05-27 16:21:04 -07:00
Chad Versace	c0739043b3	vk/image: Remove trailing whitespace	2015-05-27 16:15:47 -07:00
Chad Versace	4514e63893	vk/glsl: Reject invalid options The script incorrectly interpreted --blah as the input filename.	2015-05-27 16:14:26 -07:00
Chad Versace	fd8b5e0df2	vk/glsl_scraper: Indent large text blocks Indent them to the same level as if the text was code. No changes in entrypoints.{c,h} after a clean build.	2015-05-27 16:09:31 -07:00
Chad Versace	df4b02f4ed	vk/glsl_scraper: Fix code style for imports Python style is one module imported per line, and imports are at the top of the file.	2015-05-27 16:04:12 -07:00
Jason Ekstrand	b23885857f	vk/meta: Actually create the CB state for blits	2015-05-27 12:06:30 -07:00
Jason Ekstrand	da8f148203	vk: Rework anv_batch and use chaining batch buffers This mega-commit primarily does two things. The first is to turn anv_batch into a better abstraction of a batch. Instead of actually having a BO, it now has a few pointers to some piece of memory that are used to add data to the "batch". If it gets to the end, there is a function pointer that it can call to attempt to grow the batch. The second change is to start using chained batch buffers. When the end of the current batch BO is reached, it automatically creates a new one and inserts an MI_BATCH_BUFFER_START command to chain to it. In this way, our batch buffers are effectively infinite in length.	2015-05-27 11:48:28 -07:00
Jason Ekstrand	59def43fc8	Fixup for growable reloc lists	2015-05-27 11:48:28 -07:00
Jason Ekstrand	1c63575de8	vk/cmd_buffer: Allocate the surface_bo from device->batch_bo_pool	2015-05-27 11:48:28 -07:00
Jason Ekstrand	403266be05	vk/device: Make reloc lists growable	2015-05-27 11:48:28 -07:00
Jason Ekstrand	5ef81f0a05	vk/device: Use a bo pool for batch buffers	2015-05-27 11:48:28 -07:00
Jason Ekstrand	6f3e3c715a	vk/allocator: Add a BO pool	2015-05-27 11:48:28 -07:00
Jason Ekstrand	59328bac10	vk/allocator: Add a free list that acts on pointers instead of offsets	2015-05-27 11:48:28 -07:00
Kristian Høgsberg	a1d30f867d	vk: Add support for dynamic and pipeline color blend state	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	2514ac5547	vk/test: Create and use color/blend dynamic and pipeline state	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	1cd8437b9d	vk/meta: Allocate and set color/blend state For color blend, we have to set our own state to avoid inheriting bogus blend state.	2015-05-26 17:12:37 -07:00
Kristian Høgsberg	610e6291da	vk: Allocate samplers from dynamic stream	2015-05-26 11:50:34 -07:00
Kristian Høgsberg	b29f44218d	vk: Emit color calc state This involves pulling stencil ref values out of DS dynamic state and the blend constant out of CB dynamic state.	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	5e637c5d5a	vk/pack: Generate length macros for structs	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	998837764f	vk: Program depth bias This makes 3DSTATE_RASTER a split state command.	2015-05-26 11:27:31 -07:00
Kristian Høgsberg	0dbed616af	vk: Add support for texture component swizzle This also drops the shared create_surface_state helper and moves filling out SURFACE_STATE directly into anv_image_view_init() and anv_color_attachment_view_init().	2015-05-26 11:27:29 -07:00
Kristian Høgsberg	cbe7ed416e	vk: Implement dynamic and pipeline ds state	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	37743f90bc	vk: Set up depth and stencil buffers	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	7c0d0021eb	vk/test: Add new depth-stencil test Not yet a depth stencil test, but will become one.	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	0997a7b2e3	vk: Add basic MOCS settings This matches what we do for GL.	2015-05-25 20:20:31 -07:00
Kristian Høgsberg	c03314bdd3	vk: Update to header files with nested struct support This will let us do MOCS settings right.	2015-05-25 20:20:31 -07:00
Jason Ekstrand	ae8c93e023	vk/cmd_buffer: Initialize the pipeline pointer to NULL If a meta operation is called before the pipeline is set, this can cause uses of undefined values. They should be harmless, but we might as well shut up valgrind on this one too.	2015-05-25 17:14:49 -07:00
Jason Ekstrand	912944e59d	vk/device: Use the correct number of viewports when creating default VP state Fixes valgrind uninitialized value errors	2015-05-25 17:14:49 -07:00
Jason Ekstrand	1b211feb6c	vk/compiler: Zero out the vs_prog_data struct when VS is disabled Prevents uninitialized value errors	2015-05-25 17:14:49 -07:00
Jason Ekstrand	903bd4b056	vk/compiler: Fix up the binding hack and make it work in NIR	2015-05-25 12:57:32 -07:00
Jason Ekstrand	57153da2d5	vk: Actually implement some sort of destructor for all object types	2015-05-22 15:15:08 -07:00
Jason Ekstrand	0f0b5aecb8	vk/pipeline: Track VB's that are actually used by the pipeline Previously, we just blasted out whatever VB's we had marked as "dirty" regardless of which ones were used by the pipeline. Given that the stride of the VB is embedded in the pipeline this can cause problems. One problem is if the pipeline doesn't use the given VB binding we emit a bogus stride. Another problem is that we weren't properly resetting the dirty bits when the pipeline changed.	2015-05-21 16:58:53 -07:00
Jason Ekstrand	0a54751910	vk/device: Memset descriptor sets to 0 and handle descriptor set holes	2015-05-21 16:33:04 -07:00
Jason Ekstrand	519fe765e2	vk: Do relocations in surface states when they are created Previously, we waited until later and did a pass through the used surfaces and did the relocations then. This led to doing double-relocations which was causing us to get bogus surface offsets.	2015-05-21 15:55:29 -07:00
Jason Ekstrand	ccf2bf9b99	vk/test: Use the glsl_scraper for building shaders	2015-05-21 12:24:02 -07:00
Jason Ekstrand	f3d70e4165	vk/glsl_scraper: Use the LunarG back-door for GLSL source	2015-05-21 12:22:44 -07:00
Jason Ekstrand	cb56372eeb	vk/glsl_scraper: Use a fake GLSL version that glslang will accept	2015-05-21 12:21:02 -07:00
Jason Ekstrand	0e441cde71	vk: Bake the GLSL_VK_SHADER macro into the scraper output file	2015-05-21 12:21:00 -07:00
Jason Ekstrand	f17e835c26	vk/meta: Use glsl_scraper for our GLSL source We are not yet using SPIR-V for meta but this is a first step.	2015-05-21 11:39:54 -07:00
Jason Ekstrand	b13c0f469b	vk: More out-of-tree build fixes	2015-05-21 11:32:59 -07:00
Jason Ekstrand	f294154e42	vk: Fix for out-of-tree builds	2015-05-21 10:23:18 -07:00
Kristian Høgsberg	f9e66ea621	vk: Remove render pass stub call This isn't really a stub.	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	a29df71dd2	vk: Add WSI implementation	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	f886647b75	vk: Add debug stubs	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	63da974529	vk: Mark remaining unsupported formats as such	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	387a1bb58f	vk: Mark VK_FORMAT_UNDEFINED as 1 cpp, 1 channel	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	a1bd426393	vk: Stream surface state instead of using the surface pool Since the binding table pointer is only 16 bits, we can only have 64kb of binding table state allocated at any given time. With a block size of 1kb, that amounts to just 64 command buffers, which is not enough.	2015-05-20 20:34:52 -07:00
Kristian Høgsberg	01504057f5	vk: Use surface_format_info from dri driver for vkGetFormatInfo	2015-05-20 20:34:52 -07:00
Chad Versace	a61f307996	vk: Fix result of vkCreateInstance When fill_physical_device() fails, don't return VK_SUCCESS.	2015-05-20 19:51:10 -07:00
Jason Ekstrand	14929046ba	vk/compiler: Add shader language detection This commit adds support for the LunarG GLSL back-door as well as detecting regular GLSL and SPIR-V. The SPIR-V path doesn't exist yet, so that will cause an assert-fail.	2015-05-20 17:05:41 -07:00
Jason Ekstrand	47c1cf5ce6	vk/test: Add a test for testing buffer copies	2015-05-20 16:20:04 -07:00
Jason Ekstrand	bea66ac5ad	vk/meta: Add support for copying arbitrary size buffers	2015-05-20 16:20:04 -07:00
Jason Ekstrand	9557b85e3d	vk/meta: Use the biggest format possible for buffer copies This should substantially improve throughput of buffer copies.	2015-05-20 16:20:04 -07:00
Jason Ekstrand	13719e9225	vk/meta: Fix buffer copy extents	2015-05-20 16:20:04 -07:00
Jason Ekstrand	d7044a19b1	vk/meta: Use texture() instead of texture2D()	2015-05-19 12:44:35 -07:00
Jason Ekstrand	edff076188	vk: Use binding instead of index in uniform layout qualifiers This more closely matches what the Vulkan docs say to do.	2015-05-19 12:44:22 -07:00
Jason Ekstrand	e37a89136f	vk/glsl_scraper: Add a --glsl-only option	2015-05-19 11:29:07 -07:00
Jason Ekstrand	4bcf58a192	vk/glsl_scraper: Use the line number from the end of the macro We used to use the line number from the start of the macro but this doesn't seem to match the C preprocessor.	2015-05-19 11:29:07 -07:00
Jason Ekstrand	1573913194	vk/glsl_scraper: Don't open files until needed This prevents us from writing an empty file when the compile failed.	2015-05-19 11:29:07 -07:00
Kristian Høgsberg	e4c11f50b5	vk: Call finish for binding table state stream	2015-05-18 21:12:13 -07:00
Jason Ekstrand	851495d344	vk/meta: Use the new *view_init functions and stack-allocated views This should save us a good deal of the leakage that meta currently has.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	4668bbb161	vk/image: Factor view creation out into separate _init functions The _init functions work basically the same as the Vulkan entrypoints except that they act on an already-created view and take an optional command buffer option. If a command buffer is given, the surface state is allocated out of the command buffer's state stream.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	7c9f209427	Revert "vk/allocator: Don't use memfd when valgrind is detected" This reverts commit `b6ab076d6b`. It turns out setting USE_MEMFD to 0 is really bad because it means we can't resize the pool. Besides, valgrind SVN handles memfd so we really don't need this fallback for valgrind anymore.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	923691c70d	vk: Use a separate block pool and state stream for binding tables The binding table pointers packet only allows for a 16-bit binding table address so all binding tables have to be in the first 64 KB of the surface state BO. We solve this by adding a slave block pool that pulls off the first 64 KB worth of blocks and reserves them for binding tables.	2015-05-18 20:57:43 -07:00
Jason Ekstrand	d24f8245db	vk/allocator: Add a concept of a slave block pool We probably need a better name but this will do for now.	2015-05-18 20:57:43 -07:00
Kristian Høgsberg	997596e4c4	vk/test: Add test that prints format features	2015-05-18 20:52:44 -07:00
Kristian Høgsberg	241b59cba0	vk/test: Test timestamps and occlusion queries	2015-05-18 20:52:44 -07:00
Kristian Høgsberg	ae9ac47c74	vk: Make timestamp command work correctly This was using the wrong timestamp register and needs to write a 64 bit value.	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	82ddab4b18	vk: Make occlusion query work, both copy and get functions	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	1d40e6ade8	vk: Update generated header files This fixes a problem where register addresses were incorrectly shifted.	2015-05-18 20:52:43 -07:00
Kristian Høgsberg	f330bad545	vk: Only fill render targets for meta clear Clear inherits the render targets from the current render pass. This means we need to fill out the binding table after switching to meta bindings. However, meta copies etc happen outside a render pass and break when we try to fill in the render targets. This change fills the render targets only for meta clear.	2015-05-18 20:52:43 -07:00
Jason Ekstrand	b6c7d8c911	vk/pipeline: Use a state_stream for storing programs Previously, we were effectively using a state_stream, it was just hand-rolled based on a block pool. Now we actually use the data structure.	2015-05-18 15:58:20 -07:00
Jason Ekstrand	4063b7deb8	vk/allocator: Add support for valgrind tracking of state pools and streams We leave the block pool untracked so that reads/writes to freed blocks will get caught and do the tracking at the state pool/stream level. We have to do a few extra gymnastics for streams because valgrind works in terms of pointers and we work in terms of separate map and offset. Fortunately, the users of the state pool and stream should always be using the map pointer provided in the anv_state structure. We just have to track, per block, the map that was used when we initially got the block. Then we can make sure we always use that map and valgrind should stay happy.	2015-05-18 15:58:20 -07:00
Jason Ekstrand	b6ab076d6b	vk/allocator: Don't use memfd when valgrind is detected	2015-05-18 15:58:20 -07:00
Jason Ekstrand	682d11a6e8	vk/allocator: Assert that block_pool_grow succeeds	2015-05-18 15:48:19 -07:00
Jason Ekstrand	28804fb9e4	vk/gem: VG_CLEAR the padding for the gem_mmap struct	2015-05-18 12:05:17 -07:00
Jason Ekstrand	8440b13f55	vk/meta: Rework the indentation style No functional change.	2015-05-18 10:43:51 -07:00
Kristian Høgsberg	5286ef7849	vk: Provide more realistic values for device info	2015-05-18 10:27:08 -07:00
Kristian Høgsberg	69fd473321	vk: Use a temporary buffer for formatting in finishme This is more likely to avoid breaking up the message when racing with other threads.	2015-05-18 10:27:08 -07:00
Jason Ekstrand	cd7ab6ba4e	vk/meta: Add an initial implementation of vkCmdCopyBuffer Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	c25ce55fd3	vk/meta: Add an initial implementation of vkCmdCopyBufferToImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	08bd554cda	vk/meta: Add an initial implementation of vkCmdBlitImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	fb27d80781	vk/meta: Add an initial implementation of vkCmdCopyImage Compile-tested only	2015-05-18 10:27:08 -07:00
Jason Ekstrand	c15f3834e3	vk/gem: Set the gem_mmap.flags parameter to 0 if it exists	2015-05-18 10:27:08 -07:00
Jason Ekstrand	f7b0f922be	vk/gem: Only VK_CLEAR the addr_ptr in gen_mmap	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	ca7e62d421	vk: Add a logger wrapper for the generated entrypoint	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	eb92745b2e	vk/gem: Just return -1 from anv_gem_wait() on error We were returning -errno, unlike all the other gem functions.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	05754549e8	vk: Fix vkGetObjectInfo return values We weren't properly returning the allocation count.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	6afb26452b	vk: Implement fences This basic implementation uses a throw-away bo for synchronization.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	e26a7ffbd9	vk/meta: Use anv_* internal entrypoints	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	b7fac7a7d1	vk: Implement allocation count query	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	783e6217fc	vk: Change pData/pDataSize semantics We now always copy the entire struct unless pData is NULL and unconditionally write back the struct size. It's not clear this is useful if the structs may grow over time, but it seems to be the expected behaviour for now.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	b4b3bd1c51	vk: Return VK_SUCCESS from vkAllocDescriptorSets This should've been returning VK_SUCCESS all along.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	a9f2115486	vk: Return VK_SUCCESS for all descriptor pool entry points	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	60ebcbed54	vk: Start Implementing vkGetFormatInfo() We move the format table and vkGetFormatInfo to their own file in the process.	2015-05-18 10:27:07 -07:00
Kristian Høgsberg	454345da1e	vk: Add script for generating ifunc entry points This lets us generate a hash table for vkGetProcAddress and lets us call public functions internally without the public entrypoint overhead.	2015-05-18 10:27:02 -07:00
Kristian Høgsberg	333bcc2072	vk: Fix vulkan header inconsistency The function pointer typedef and the function prototype for vkCmdClearColorImage() didn't agree. Fix the typedef to match the prototype.	2015-05-17 21:08:31 -07:00
Kristian Høgsberg	b9eb56a404	vk: Add function pointer typedef for intel extension Also guard function prototype by VK_PROTOTYPES.	2015-05-17 21:08:30 -07:00
Kristian Høgsberg	75cb85c56a	vk: Add missing VKAPI for vkQueueRemoveMemReferences	2015-05-17 21:08:30 -07:00
Jason Ekstrand	a924ea0c75	Merge remote-tracking branch 'fdo-personal/wip/nir-vtn' into vulkan This adds the SPIR-V -> NIR translator.	2015-05-16 12:43:16 -07:00
Jason Ekstrand	a63952510d	nir/spirv: Don't assert that the current block is empty It's possible that someone will give us SPIR-V code in which someone needlessly branches to new blocks. We should handle that ok now.	2015-05-16 12:34:34 -07:00
Jason Ekstrand	4e44dcc312	nir/spirv: Add initial support for samplers	2015-05-16 12:34:15 -07:00
Jason Ekstrand	d6f52dfb3e	nir/spirv: Move Exp and Log to the list of currently unhandled ALU ops NIR doesn't have the native opcodes for them anymore	2015-05-16 12:33:32 -07:00
Jason Ekstrand	a53e795524	nir/types: Add support for sampler types	2015-05-16 12:32:58 -07:00
Jason Ekstrand	0fa9211d7f	nir/spirv: Make the global constants in spirv.h static I've been promised in a bug that this will be fixed in a future version of the header. However, in the interest of my branch building, I'm adding these changes in myself for the moment.	2015-05-16 11:16:34 -07:00
Jason Ekstrand	036a4b1855	nir/spirv: Handle jump-to-loop in a more general way	2015-05-16 11:16:34 -07:00
Jason Ekstrand	56f533b3a0	nir/spirv: Handle boolean uniforms correctly	2015-05-16 11:16:34 -07:00
Jason Ekstrand	64bc58a88e	nir/spirv: Handle control-flow with loops	2015-05-16 11:16:34 -07:00
Jason Ekstrand	3a2db9207d	nir/spirv: Set a name on temporary variables	2015-05-16 11:16:34 -07:00
Jason Ekstrand	a28f8ad9f1	nir/spirv: Use the correct length for copying string literals	2015-05-16 11:16:34 -07:00
Jason Ekstrand	7b9c29e440	nir/spirv: Make vtn_ssa_value handle constants as well as ssa values	2015-05-16 11:16:33 -07:00
Jason Ekstrand	b0d1854efc	nir/spirv: Add initial support for GLSL 4.50 builtins	2015-05-16 11:16:33 -07:00
Jason Ekstrand	1da9876486	nir/spirv: Split the core datastructures into a header file	2015-05-16 11:16:33 -07:00
Jason Ekstrand	98d78856f6	nir/spirv: Use the builder for all instructions We don't actually use it to create all the instructions but we do use it for insertion always. This should make things far more consistent for implementing extended instructions.	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ff828749ea	nir/spirv: Add support for a bunch of ALU operations	2015-05-16 11:16:33 -07:00
Jason Ekstrand	d2a7972557	nir/spirv: Add support for indirect array accesses	2015-05-16 11:16:33 -07:00
Jason Ekstrand	683c99908a	nir/spirv: Explicitly type constants and SSA values	2015-05-16 11:16:33 -07:00
Jason Ekstrand	c5650148a9	nir/spirv: Handle OpBranchConditional We do control-flow handling as a two-step process. The first step is to walk the instructions list and record various information about blocks and functions. This is where the actual nir_function_overload objects get created. We also record the start/stop instruction for each block. Then a second pass walks over each of the functions and over the blocks in each function in a way that's NIR-friendly and actually parses the instructions.	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ebc152e4c9	nir/spirv: Add a helper for getting a value as an SSA value	2015-05-16 11:16:33 -07:00
Jason Ekstrand	f23afc549b	nir/spirv: Split instruction handling into preamble and body sections	2015-05-16 11:16:33 -07:00
Jason Ekstrand	ae6d32c635	nir/spirv: Implement load/store instructions	2015-05-16 11:16:33 -07:00
Jason Ekstrand	88f6fbc897	nir: Add a helper for getting the tail of a deref chain	2015-05-16 11:16:33 -07:00
Jason Ekstrand	06acd174f3	nir/spirv: Actually add variables to the function or shader	2015-05-16 11:16:33 -07:00
Jason Ekstrand	5045efa4aa	nir/spirv: Add a vtn_untyped_value helper	2015-05-16 11:16:33 -07:00
Jason Ekstrand	01f3aa9c51	nir/spirv: Use vtn_value in the types code and fix an off-by-one error	2015-05-16 11:16:33 -07:00
Jason Ekstrand	6ff0830d64	nir/types: Add an is_vector_or_scalar helper	2015-05-16 11:16:33 -07:00
Jason Ekstrand	5acd472271	nir/spirv: Add support for deref chains	2015-05-16 11:16:33 -07:00
Jason Ekstrand	7182597e50	nir/types: Add a scalar type constructor	2015-05-16 11:16:32 -07:00
Jason Ekstrand	eccd798cc2	nir/spirv: Add support for OpLabel	2015-05-16 11:16:32 -07:00
Jason Ekstrand	a6cb9d9222	nir/spirv: Add support for declaring functions	2015-05-16 11:16:32 -07:00
Jason Ekstrand	8ee23dab04	nir/types: Add accessors for function parameter/return types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	707b706d18	nir/spirv: Add support for declaring variables Deref chains and variable load/store operations are still missing.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	b2db85d8e4	nir/spirv: Add support for constants	2015-05-16 11:16:32 -07:00
Jason Ekstrand	3f83579664	nir/spirv: Add basic support for types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	e9d3b1e694	nir/types: Add more helpers for creating types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	fe550f0738	glsl/types: Expose the function_param and struct_field structs to C Previously, they were hidden behind a #ifdef __cplusplus so C wouldn't find them. This commit simply moves the ifdef.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	053778c493	glsl/types: Add support for function types	2015-05-16 11:16:32 -07:00
Jason Ekstrand	7b63b3de93	glsl: Add GLSL_TYPE_FUNCTION to the base types enums	2015-05-16 11:16:32 -07:00
Jason Ekstrand	2b570a49a9	nir/spirv: Rework the way values are added Instead of having functions to add values and set various things, we just have a function that does a few asserts and then returns the value. The caller is then responsible for setting the various fields.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	f9a31ba044	nir/spirv: Add stub support for extension instructions	2015-05-16 11:16:32 -07:00
Jason Ekstrand	4763a13b07	REVERT: Add a simple helper program for testing SPIR-V -> NIR translation	2015-05-16 11:16:32 -07:00
Jason Ekstrand	cae8db6b7e	glsl/compiler: Move the error_no_memory stub to standalone_scaffolding.cpp	2015-05-16 11:16:32 -07:00
Jason Ekstrand	98452cd8ae	nir: Add the start of a SPIR-V to NIR translator At the moment, it can handle the very basics of strings and can ignore debug instructions. It also has basic support for decorations.	2015-05-16 11:16:32 -07:00
Jason Ekstrand	573ca4a4a7	nir: Import the revision 30 SPIR-V header from Khronos	2015-05-16 11:16:31 -07:00
Jason Ekstrand	057bef8a84	vk/device: Use bias rather than layers for computing binding table size Because we statically use the first 8 binding table entries for render targets, we need to create a table of size 8 + surfaces.	2015-05-16 10:42:53 -07:00
Jason Ekstrand	22e61c9da4	vk/meta: Make clear a no-op if no layers need clearing Among other things, this prevents recursive meta.	2015-05-16 10:30:05 -07:00
Jason Ekstrand	120394ac92	vk/meta: Save and restore the old bindings pointer If we don't do this then recursive meta is completely broken. What happens is that the outer meta call may change the bindings pointer and the inner meta call will change it again and, when it exits set it back to the default. However, the outer meta call may be relying on it being left alone so it uses the non-meta descriptor sets instead of its own.	2015-05-16 10:28:04 -07:00
Jason Ekstrand	4223de769e	vk/device: Simplify surface_count calculation	2015-05-16 10:23:09 -07:00
Jason Ekstrand	eb1952592e	vk/glsl_helpers: Fix GLSL_VK_SHADER with respect to commas Previously, the GLSL_VK_SHADER macro didn't work if the shader contained commas outside of parentheses due to the way the C preprocessor works. This commit fixes this by making it variadic again and doing it correctly this time.	2015-05-15 22:17:07 -07:00
Kristian Høgsberg	3b9f32e893	vk: Make cmd_buffer->bindings a pointer This lets us save and restore efficiently by just moving the pointer to a temporary bindings struct for meta.	2015-05-15 18:12:07 -07:00
Kristian Høgsberg	9540130c41	vk: Move vertex buffers into struct anv_bindings	2015-05-15 16:34:31 -07:00
Kristian Høgsberg	0cfc493775	vk: Fix GLSL_VK_SHADER macro Stringify doesn't work with __VA_ARGS__. The last macro argument swallows up excess arguments and as such we can just stringify that.	2015-05-15 16:15:04 -07:00
Kristian Høgsberg	af45f4a558	vk: Fix warning from missing initializer Struct initializers need to be { 0, } to zero out the variable they're initializing.	2015-05-15 16:07:17 -07:00
Kristian Høgsberg	bf096c9ec3	vk: Build binding tables at bind descriptor time This changes the way descriptor sets and layouts work so that we fill out binding table contents at the time we bind descriptor sets. We manipulate the binding table contents and sampler state in a shadow-copy in anv_cmd_buffer. At draw time, we allocate the actual binding table and sampler state and flush the anv_cmd_buffer copies.	2015-05-15 16:05:31 -07:00
Kristian Høgsberg	1f6c220b45	vk: Update the bind map length to reflect MAX_SETS	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	b806e80e66	vk: Flip back to using memfd for the allocators	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	0a775e1eab	vk: Rename dyn_state_pool to dynamic_state_pool Given that we already tolerate surface_state_pool and the even longer instruction_state_pool, there's no reason to arbitrarily abbreviate dynamic.	2015-05-15 15:22:29 -07:00
Kristian Høgsberg	f5b0f1351f	vk: Consolidate image, buffer and color attachment views These are all just surface state, offset and a bo.	2015-05-15 15:22:29 -07:00
Jason Ekstrand	41db8db0f2	vk: Add a GLSL scraper utility This new utility, glsl_scraper.py, scrapes C files for instances of the GLSL_VK_SHADER macro, pulls out the shader source, and compiles it to SPIR-V. The compilation is done using glslValidator. The result is then placed into another C file as arrays of dwords that can be easily handed to a Vulkan driver.	2015-05-14 19:18:57 -07:00
Jason Ekstrand	79ace6def6	vk/meta: Add a magic GLSL shader source macro	2015-05-14 19:07:34 -07:00
Jason Ekstrand	018a0c1741	vk/meta: Add a better comment about the VS for blits	2015-05-14 11:39:32 -07:00
Jason Ekstrand	8c92701a69	vk/test: Use VK_IMAGE_TILING_OPTIMAL for the render target	2015-05-13 22:27:38 -07:00
Jason Ekstrand	4fb8bddc58	vk/test: Do a copy of the RT into a linear buffer and write that to a PNG	2015-05-13 22:23:30 -07:00
Jason Ekstrand	bd5b76d6d0	vk/meta: Add the start of a blit implementation Currently, we only implement CopyImageToBuffer	2015-05-13 22:23:30 -07:00
Jason Ekstrand	94b8c0b810	vk/pipeline: Default to a SamplerCount of 1 for PS	2015-05-13 22:23:30 -07:00
Jason Ekstrand	d3d4776202	vk/pipeline: Add an extra flag for force-disabling the vertex shader This way we can pass in a vertex shader and yet have the pipeline emit an empty 3DSTATE_VS packet. We need this for meta because we need to trick the compiler into not deleting our inputs but at the same time disable the VS so that we can use a rectlist. This should go away once we actually get SPIR-V.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	a1309c5255	vk/pass: Emit a flushing pipe control at the end of the pass This is rather crude but it at least makes sure that all the render targets get flushed at the end of the pass. We probably actually want to do something based on image layout transitions, but this will work for now.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	07943656a7	vk/compiler: Set the binding table texture_start This is by no means a complete solution to the binding table problems. However, it does make texturing actually work. Before, we were texturing from the render target since they were both starting at 0.	2015-05-13 22:23:30 -07:00
Jason Ekstrand	cd197181f2	vk/compiler: Zero the prog data We use prog_data[stage] != NULL to determine whether or not we need to clean up that stage. Make sure it defaults to NULL.	2015-05-13 22:22:59 -07:00
Jason Ekstrand	1f7dcf9d75	vk/image: Stash more information in images and views	2015-05-13 22:22:59 -07:00
Jason Ekstrand	43126388cd	vk/meta: Save/restore more stuff in cmd_buffer_restore	2015-05-13 22:22:59 -07:00
Chad Versace	50806e8dec	vk: Install headers I need this for building a testsuite.	2015-05-13 17:49:26 -07:00
Kristian Høgsberg	83c7e1f1db	vk: Add support for sampler descriptors	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	4f9eaf77a5	vk: Use a typesafe anv_descriptor struct	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	5c9d77600b	vk: Create and bind a sampler in vk.c	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	18acfa7301	vk: Fix copy-n-paste sType in vkCreateSampler	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a1ec789b0b	vk: Add a dynamic state stream to anv_cmd_buffer We'll need this for sampler state.	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	3f52c016fa	vk: Move struct anv_sampler to private.h	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a77229c979	vk: Allocate layout->count number of descriptors layout->count is the number of descriptors the application requested. layout->total is the number of entries we need across all stages.	2015-05-13 14:47:11 -07:00
Kristian Høgsberg	a3fd136509	vk: Fill out sampler state from API values	2015-05-13 14:47:11 -07:00
Chad Versace	828817b88f	vk: Ignore vk executable	2015-05-13 12:05:38 -07:00
Kristian Høgsberg	2b7a060178	vk: Fix stale error handling in vkQueueSubmit	2015-05-12 14:38:58 -07:00
Kristian Høgsberg	cb986ef597	vk: Submit all cmd buffers passed to vkQueueSubmit	2015-05-12 14:38:12 -07:00
Kristian Høgsberg	9905481552	vk: Add generated header for HSW and IVB (GEN75 and GEN7)	2015-05-12 14:29:04 -07:00
Jason Ekstrand	ffe9f60358	vk: Add stub() and stub_return() macros and mark piles of functions as stubs	2015-05-12 13:45:02 -07:00
Jason Ekstrand	d3b374ce59	vk/util: Add a anv_finishme function/macro	2015-05-12 13:43:36 -07:00
Jason Ekstrand	7727720585	vk/meta: Break setting up meta clear state into its own function	2015-05-12 13:03:50 -07:00
Jason Ekstrand	4336a1bc00	vk/pipeline: Add support for disabling the scissor in "extra"	2015-05-12 12:53:01 -07:00
Kristian Høgsberg	d77c34d1d2	vk: Add clear load-op for render passes	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	b734e0bcc5	vk: Add support for driver-internal custom pipelines This lets us disable the viewport, use rect lists and repclear.	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	ad132bbe48	vk: Fix 3DSTATE_VERTEX_BUFFER emission Set VertexBufferIndex to the attribute binding, not the location.	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	6a895c6681	vk: Add 32 bpc signed and unsigned integer formats	2015-05-11 23:25:29 -07:00
Kristian Høgsberg	55b9b703ea	vk: Add anv_batch_emit_merge() helper macro This lets us emit a state packet by merging two half-baked versions, typically one from the pipeline object and one from a dynamic state object.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	099faa1a2b	vk: Store bo pointer in anv_image and anv_buffer We don't need to point back to the memory object the bo came from. Pointing directly to a bo lets us bind images and buffers to other bos - like our allocator bos.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	4f25f5d86c	vk: Support not having a vertex shader This lets us bypass the vertex shader and pass data straight into the rasterizer part of the pipeline.	2015-05-11 23:25:28 -07:00
Kristian Høgsberg	20ad071190	vk: Allow NULL as a valid pipeline layout Vertex buffers and render targets aren't part of the layout so having an empty layout is pretty common.	2015-05-11 22:12:56 -07:00
Kristian Høgsberg	769785c497	Add vulkan driver for BDW	2015-05-09 11:38:32 -07:00
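
Commits eb1952592e and 0cfc493775 above both deal with stringifying shader source that contains commas outside parentheses. The following is a minimal sketch of that preprocessor behaviour, not the driver's actual macro: `SHADER_SRC`, `SHADER_SRC_FIXED_ARITY` and `shader_src` are hypothetical names chosen for illustration.

```c
/* Hypothetical stand-ins for a shader-source macro; not Mesa's definitions. */

/* One-parameter form: a top-level comma in the argument is treated as an
 * argument separator, so invoking it with such source fails to compile. */
#define SHADER_SRC_FIXED_ARITY(src) #src

/* Variadic form: all arguments, including the separating commas, are
 * stringified together as one string literal. */
#define SHADER_SRC(...) #__VA_ARGS__

/* Returns the stringified source. The declaration line contains a comma
 * that is not protected by parentheses. */
static const char *shader_src(void)
{
    return SHADER_SRC(
        vec4 a = b, c = d;
        void main() { gl_Position = a; }
    );
}
```

Calling `shader_src()` yields a single string that still contains the comma, whereas passing the same text to `SHADER_SRC_FIXED_ARITY` would be rejected because the macro receives two arguments.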

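The arithmetic in commit a1bd426393 (a 16-bit binding table pointer, 1 kB blocks, hence 64 command buffers) can be checked with a short sketch. The constants come from the commit message only; the names `BT_POINTER_BITS`, `BT_BLOCK_SIZE` and the two helper functions are assumptions made for illustration and do not appear in the driver.

```c
#include <stdint.h>

/* Illustrative constants taken from the commit message, not from the driver. */
enum {
    BT_POINTER_BITS = 16,   /* width of the binding table pointer */
    BT_BLOCK_SIZE   = 1024  /* block size in bytes (1 kB) */
};

/* Bytes addressable through a pointer of BT_POINTER_BITS bits. */
static uint32_t bt_addressable_bytes(void)
{
    return UINT32_C(1) << BT_POINTER_BITS;
}

/* Number of whole blocks that fit in the addressable range; with one block
 * per command buffer this is also the command buffer limit. */
static uint32_t bt_max_blocks(void)
{
    return bt_addressable_bytes() / BT_BLOCK_SIZE;
}
```

With these values the addressable range is 65536 bytes and the block count is 64, which matches the limit the commit message describes.
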
2601 changed files with 298042 additions and 65421 deletions

									
1

.dir-locals.el

@@ -5,6 +5,7 @@
   (c-file-style . "stroustrup")
   (fill-column . 78)
   (eval . (progn
 	    (c-set-offset 'case-label '0)
 	    (c-set-offset 'innamespace '0)
 	    (c-set-offset 'inline-open '0)))
   )

3

.gitignore vendored

@@ -34,6 +34,7 @@ aclocal.m4
 config.log
 config.status
 cscope*
 tags
 .scon*
 config.py
 build
@@ -46,3 +47,5 @@ manifest.txt
 Makefile
 Makefile.in
 .install-mesa-links
 .install-gallium-links
 /src/git_sha1.h

460

.mailmap Normal file

@@ -0,0 +1,460 @@
 Aapo Tahkola <aet@rasterburn.org> <aapo@aapo-desktop.(none)>
 Adam Jackson <ajax@redhat.com> <ajax@benzedrine.nwnk.net>
 Adam Jackson <ajax@redhat.com> <ajax@freedesktop.org>
 Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Adrian Negreanu <adrian.m.negreanu@intel.com>
 Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Negreanu Marius Adrian <adrian.m.negreanu@intel.com>
 Dave Airlie <airlied@redhat.com> <airliedfreedesktop.org>
 Dave Airlie <airlied@redhat.com> airlied <airlied@unused-12-215.bne.redhat.com>
 Dave Airlie <airlied@redhat.com> <airlied@dhcp-1-203.bne.redhat.com>
 Dave Airlie <airlied@redhat.com> <airlied@gmail.com>
 Dave Airlie <airlied@redhat.com> <airlied@itt42.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@linux.ie>
 Dave Airlie <airlied@redhat.com> <airlied@nx6125b.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@panoply-rh.(none)>
 Dave Airlie <airlied@redhat.com> <airlied@ppcg5.localdomain>
 Alan Coopersmith <alan.coopersmith@oracle.com> <alan.coopersmith@sun.com>
 Alan Hourihane <alanh@vmware.com> <alanh@tungstengraphics.com>
 Alan Hourihane <alanh@vmware.com> <alanh@fairlite.demon.co.uk>
 Alan Hourihane <alanh@vmware.com> <alanh@jetpack.(none)>
 Alexander Monakov <amonakov@gmail.com> <amonakov@ispras.ru>
 Alexander von Gluck IV <kallisti5@unixzen.com> Alexander von Gluck <kallisti5@unixzen.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.prom.eng.vmware.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.vmware.com>
 Alex Deucher <alexdeucher@gmail.com> <alexander.deucher@amd.com>
 Alex Deucher <alexdeucher@gmail.com> <agd5f@yahoo.com>
 Alex Deucher <alexdeucher@gmail.com> <alex@botch2.com>
 Alex Deucher <alexdeucher@gmail.com> <alex@botch2.(none)>
 Alex Deucher <alexdeucher@gmail.com> <alex@cube.(none)>
 Alex Deucher <alexdeucher@gmail.com> <alex@samba.(none)>
 Andreas Fänger <a.faenger@e-sign.com> <a.faenger@e-sign.com>
 Andreas Hartmetz <ahartmetz@gmail.com> <andreas.hartmetz@kdab.com>
 Andre Heider <a.heider@gmail.com>
 Andreas Heider <andreas@heider.io>
 Andreas Pokorny <andreas.pokorny@canonical.com> <andreas.pokorny@elektrobit.com>
 Andrew Randrianasulu <randrianasulu@gmail.com> <randrik_a@yahoo.com>
 Andrew Randrianasulu <randrianasulu@gmail.com> <randrik@mail.ru>
 Arthur Huillet <arthur.huillet@free.fr> Arthur HUILLET <arthur.huillet@free.fr>
 Benjamin Franzke <benjaminfranzke@googlemail.com> ben <benjaminfranzke@googlemail.com>
 Ben Skeggs <bskeggs@redhat.com> <darktama@beleth.(none)>
 Ben Skeggs <bskeggs@redhat.com> <darktama@iinet.net.au>
 Ben Skeggs <bskeggs@redhat.com> <darktama@nisroch.keine.ath.cx>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb-at-gmail.com>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@gmail.com>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@localhost.localdomain>
 Ben Skeggs <bskeggs@redhat.com> <skeggsb@nisroch.keine.ath.cx>
 Ben Widawsky <benjamin.widawsky@intel.com> Ben Widawsky <ben@bwidawsk.net>
 Blair Sadewitz <blair.sadewitz@gmail.com> Blair Sadewitz <blair.sadewitz.gmail.com>
 Boris Peterbarg <reist@users.sourceforge.net> reist <reist>
 Brian Paul <brianp@vmware.com> Brian <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> <brian.e.paul@gmail.com>
 Brian Paul <brianp@vmware.com> <brianp@kemper.freedesktop.org>
 Brian Paul <brianp@vmware.com> brian <brian@cvp965.(none)>
 Brian Paul <brianp@vmware.com> Brian <brian@i915.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@nostromo.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@poulsbo.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brian@ps3.localnet.net>
 Brian Paul <brianp@vmware.com> Brian <brianp@vmware.com>
 Brian Paul <brianp@vmware.com> Brian <brian@yutani.localnet.net>
 Brian Paul <brianp@vmware.com> root <brian.paul@tungstengraphics.com>
 Brian Paul <brianp@vmware.com> root <root@i915.localnet.net>
 Brian Paul <brianp@vmware.com> root <root@nostromo.localnet.net>
 Brian Paul <brianp@vmware.com> root <root@i965.localnet.net>
 Bruce Merry <bmerry@users.sourceforge.net> <bmerry@gmail.com>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <s3734770@mail.zih.tu-dresden.de>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <carli@carli-laptop.(none)>
 Carl-Philip Hänsch <cphaensch@googlemail.com> Carl-Philip Haensch <Carl-Philip.Haensch@mailbox.tu-dresden.de>
 Chad Versace <chad.versace@intel.com> <chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <Chad Versace chad@chad-versace.us>
 Chad Versace <chad.versace@intel.com> <chad.versace@linux.intel.com>
 Chia-I Wu <olvaffe@gmail.com> <olv@lunarg.com>
 Chia-I Wu <olvaffe@gmail.com> Chia-Wu <olvaffe@gmail.com>
 Chih-Wei Huang <cwhuang@linux.org.tw> Chih-Wei Huang <cwhuang@android-x86.org>
 Christian König <christian.koenig@amd.com> Christian Koenig <christian.koenig@amd.com>
 Christian König <christian.koenig@amd.com> Christian König <christian.koenig at amd.com>
 Christian König <christian.koenig@amd.com> Christian König <deathsimple@vodafone.de>
 Christoph Brill <egore911@egore911.de> Christoph Bill <egore@gmx.de>
 Christoph Brill <egore911@egore911.de> <egore@gmx.de>
 Christoph Bumiller <christoph.bumiller@speed.at> <e0425955@student.tuwien.ac.at>
 Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Christopher James Halse Rogers <raof@ubuntu.com>
 Claudio Ciccani <klan@directfb.org> <klan@users.sf.net>
 Claudio Ciccani <klan@directfb.org> <klan@users.sourceforge.net>
 Connor Abbott <cwabbott0@gmail.com> <connor.w.abbott@intel.com>
 Connor Abbott <cwabbott0@gmail.com> <connor.abbott@intel.com>
 Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomed...@gmail.com>
 Corbin Simpson <MostAwesomeDude@gmail.com> <mostawesomedude@gmail.com>
 Courtney Goeltzenleuchter <courtney@lunarg.com> <courtney@LunarG.com>
 Daniel Skinner <sio@users.sourceforge.net> sio <sio>
 Daniel Stone <daniels@collabora.com> <daniel@fooishbar.org>
 David Miller <davem@davemloft.net> David S. Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> Dave Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> davem69 <davem69>
 David Heidelberger <david.heidelberger@ixit.cz> David Heidelberg <david@ixit.cz>
 David Heidelberger <david.heidelberger@ixit.cz> <d.okias@gmail.com>
 David Reveman <reveman@chromium.org> <c99drn@cs.umu.se>
 Dieter Nützel <Dieter@nuetzel-hh.de> Dieter Nützel <dieter@nuetzel-hh.de>
 Dmitry Cherkassov <dcherkassov@gmail.com> Dmitry Cherkasov <dcherkassov@gmail.com>
 Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
 Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
 Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
 Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>
 Fabian Bieler <der.fabe@gmx.net> <&lt;der.fabe@gmx.net&gt>
 Feng, Haitao <haitao.feng@intel.com> Haitao Feng <haitao.feng@intel.com>
 Frank Henigman <fjhenigman@google.com> <fjhenigman@chromium.org>
 George Sapountzis <gsapountzis@gmail.com> George Sapountzis <gsap7@yahoo.gr>
 Gwenole Beauchesne <gwenole.beauchesne@intel.com> <gb.devel@gmail.com>
 Hamish Marson <hmarson@users.sourceforge.net> hmarson <hmarson>
 Hans de Goede <hdegoede@redhat.com> Hans de Goede <j.w..r..degoede@hhs.nl>
 Homer Hsing <dongsheng.xing@intel.com> <homer.hsing@gmail.com>
 Hui Qi Tay <hqtay@vmware.com> <tayhuiqithq@gmail.com>
 Ian Romanick <ian.d.romanick@intel.com> <idr@freedesktop.org>
 Ian Romanick <ian.d.romanick@intel.com> <idr@us.ibm.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@vmware.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.(none)>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.walkyrie.se>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@tungstengraphics.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <wallbraker 'at' gmail 'dot' com>
 Jakub Bogusz <qboosh@pld-linux.org> <gboosh@pld-linux.org>
 James Legg <jlegg@feralinteractive.com> <lankyleggy@gmail.com>
 Jan Vesely <jano.vesely@gmail.com> Jan Vesely <jan.vesely@rutgers.edu>
 Jason Ekstrand <jason@jlekstrand.net> <jason.ekstrand@intel.com>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremyhu@freedesktop.org>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@tifa.local>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@vincent.local>
 Jeremy Huddleston <jeremyhu@apple.com> <jeremy@yuffie.local>
 Jeremy Huddleston <jeremyhu@apple.com> Jeremy Huddleston Sequoia <jeremyhu@apple.com>
 Jeremy Kolb <jkolb@freedesktop.org> <jkolb@brandeis.edu>
 Jerome Glisse <jglisse@redhat.com> <glisse@freedesktop.org>
 Jerome Glisse <jglisse@redhat.com> <glisse@kemper.freedesktop.org>
 Jerome Glisse <jglisse@redhat.com> John Doe <glisse@barney.(none)>
 Jerome Glisse <jglisse@redhat.com> John Doe <glisse@localhost.localdomain>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.lan>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@hobbes.(none)>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-desktop.localdomain>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@jbarnes-t61.(none)>
 Jesse Barnes <jesse.barnes@intel.com> <jbarnes@virtuousgeek.org>
 Joakim Sindholt <bacn@zhasha.com> <opensource@zhasha.com>
 Joakim Sindholt <bacn@zhasha.com> <zhasha@gallium-dev.(none)>
 Jochen Gerlach <jtg@users.sourceforge.net> jtg <jtg>
 Joel Bosveld <joel.bosveld@gmail.com> <Joel.Bosveld@gmail.com>
 Jonathan Adamczewski <jadamcze@utas.edu.au> <jadamcze@utas.edu.a>
 Jon Turney <jon.turney@dronecode.org.uk> Jon TURNEY <jon.turney@dronecode.org.uk>
 José Fonseca <jfonseca@vmware.com> Jose Fonseca <jfonseca@vmware.com>
 José Fonseca <jfonseca@vmware.com> Jose Fonseca <jrfonseca@tungstengraphics.com>
 José Fonseca <jfonseca@vmware.com> <jfonseca@pegasus.(none)>
 José Fonseca <jfonseca@vmware.com> <jfonseca@titan.(none)>
 José Fonseca <jfonseca@vmware.com> <jose.r.fonseca@gmail.com>
 José Fonseca <jfonseca@vmware.com> <jrfonseca@tungstengraphics.com>
 José Fonseca <jfonseca@vmware.com> <j_r_fonseca@yahoo.co.uk>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <jouk@hrem.nano.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk Jansen <joukj@hrem.stm.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> joukj <joukj@tarantella.(none)>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.nano.tudelft.nl>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> Jouk <joukj@tarantella.(none)>
 Jouk Jansen <joukj@hrem.nano.tudelft.nl> J.Jansen <joukj@tarantella.nano.tudelft.nl>
 Juan Zhao <juan.j.zhao@intel.com> <juan.j.zhao@linux.intel.com>
 Julien Cristau <jcristau@debian.org> <julien.cristau@logilab.fr>
 Julien Isorce <j.isorce@samsung.com> <julien.isorce@gmail.com>
 Kalyan Kondapally <kalyan.kondapally@intel.com> <kondapallykalyancontribute@gmail.com>
 Karl Schultz <karl.w.schultz@gmail.com> Karl Schultze <k.w.schultz@comcast.net>
 Karl Schultz <karl.w.schultz@gmail.com> unknown <kwschult@.na.qualcomm.com>
 Karl Schultz <karl.w.schultz@gmail.com> <k.w.schultz@comcast.net>
 Karl Schultz <karl.w.schultz@gmail.com> <Karl.W.Schultz@gmail.com>
 Karl Schultz <karl.w.schultz@gmail.com> <kschultz@freedesktop.org>
 Keith Harrison <sio2@users.sourceforge.net> sio2 <sio2>
 Keith Packard <keithp@keithp.com> <keithp@koto.keithp.com>
 Keith Packard <keithp@keithp.com> <keithp@neko.keithp.com>
 Keith Whitwell <keithw@vmware.com> <keith@tungstengraphics.com>
 Keith Whitwell <keithw@vmware.com> keithw <keithw@keithw-laptop.(none)>
 Kristian Høgsberg <krh@bitplanet.net> <krh@redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@hinata.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@sasori.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <krh@temari.boston.redhat.com>
 Kristian Høgsberg <krh@bitplanet.net> <kristian.h.kristensen@intel.com>
 Krzesimir Nowak <qdlacz@gmail.com> <krzesimir@kinvolk.io>
 Li Peng <peng.li@intel.com> <peng.li@linux.intel.com>
 Lucas Stach <dev@lynxeye.de> <l.stach@pengutronix.de>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <dev@mblankhorst.nl>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <m.b.lankhorst@gmail.com>
 Maarten Lankhorst <maarten.lankhorst@ubuntu.com> <maarten.lankhorst@canonical.com>
 Maciej Cencora <m.cencora@gmail.com> <maciej@osiris.(none)>
 Marc-André Lureau <marcandre.lureau@gmail.com> Marc-Andre Lureau <marcandre.lureau@gmail.com>
 Marc Dietrich <marvin24@gmx.de> Marc <marvin24@gmx.de>
 Marc Dietrich <marvin24@gmx.de> marvin24 <marvin24@gmx.de>
 Marcin Ślusarz <marcin.slusarz@gmail.com> Marcin Slusarz <marcin.slusarz@gmail.com>
 Marek Olšák <marek.olsak@amd.com> <maraeo@gmail.com>
 Mario Kleiner <mario.kleiner.de@gmail.com> kleinerm <mario.kleiner@tuebingen.mpg.de>
 Mario Kleiner <mario.kleiner.de@gmail.com> <mario.kleiner@tuebingen.mpg.de>
 Mark Mueller <markkmueller@gmail.com> <MarkKMueller@gmail.com>
 Marta Lofstedt <marta.lofstedt@intel.com> <marta.lofstedt@linux.intel.com>
 Martin Peres <martin.peres@linux.intel.com> <martin.peres@labri.fr>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@gmx.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Froehlich <Mathias.Froehlich@web.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> Mathias Frohlich <M.Froehlich@science-computing.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <frohlich8@users.sourceforge.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@gmx.net>
 Mathias Fröhlich <mathias.froehlich@gmx.net> <Mathias.Froehlich@web.de>
 Mathias Fröhlich <mathias.froehlich@gmx.net> M.Froehlich@science-computing.de <M.Froehlich@science-computing.de>
 Matthew W. S. Bell <matthew@bells23.org.uk> Matthew Bell <matthew@bells23.org.uk>
 Maxence Le Doré <maxence.ledore@gmail.com> Maxence Le Dore <maxence.ledore@gmail.com>
 Micah Fedke <micah.fedke@collabora.co.uk> <M.Fedke@Astronautics.com>
 Michal Krol <michal@vmware.com> <michal@tungstengraphics.com>
 Michal Krol <michal@vmware.com> Michal Krol <michal@ubuntu-vbox.(none)>
 Michal Krol <michal@vmware.com> Michal Krol <mjkrol@gmail.org>
 Michal Krol <michal@vmware.com> michal <michal@capacitor.(none)>
 Michal Krol <michal@vmware.com> michal <michal@michal-laptop.(none)>
 Michal Krol <michal@vmware.com> michal <michal@quad.(none)>
 Michal Krol <michal@vmware.com> michal <michal@transistor.(none)>
 Michal Krol <michal@vmware.com> Michal <michal@tungstengraphics.com>
 Michal Krol <michal@vmware.com> michal <michal@wmvare.com>
 Michel Dänzer <michel@daenzer.net> <michel.daenzer@amd.com>
 Michel Dänzer <michel@daenzer.net> <daenzer@vmware.com>
 Michel Dänzer <michel@daenzer.net> <michel@tungstengraphics.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <michel.daenzer@amd.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <daenzer@localhost.(none)>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> Mike Kaplinksiy <mike.kaplinskiy@gmail.com>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> <mike.kaplinskiy@gmai.com>
 Mike Stroyan <mike@lunarg.com> <mike@LunarG.com>
 Nian Wu <nian.wu@intel.com> <nian@graphics.(none)>
 Nian Wu <nian.wu@intel.com> <nian@tinderbox.sh.intel.com>
 Nick Bowler <nbowler@draconx.ca>
 Nick Sarnie <commendsarnex@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> <nhaehnle@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <nhaehnle@gmail.com>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect_@gmx.net>
 Nicolai Hähnle <nicolai.haehnle@amd.com> Nicolai Haehnle <prefect@upb.de>
 Nigel Stewart <nigels@users.sourceforge.net> <nigels@sourceforge.net>
 Nigel Stewart <nigels@users.sourceforge.net> <nstewart@nvidia.com>
 nobled <nobled@dreamwidth.org> <nobled2@nobled2-karmic.(none)>
 Oliver McFadden <oliver.mcfadden@linux.intel.com> <z3ro.geek@gmail.com>
 Owain Ainsworth <zerooa@googlemail.com> Owain G. Ainsworth <oga@openbsd.org>
 Owen W. Taylor <otaylor@fishsoup.net> Owen Taylor <otaylor@snell.localdomain>
 Patrice Mandin <patmandin@gmail.com> <patrice@manoir.racoon.city>
 Patrice Mandin <patmandin@gmail.com> <pmandin@caramail.com>
 Patrice Mandin <patmandin@gmail.com> <pmandin@freedesktop.org>
 Pauli Nieminen <pauli.nieminen@linux.intel.com> <suokkos@gmail.com>
 Paulo Zanoni <paulo.r.zanoni@intel.com> Paulo Zanoni <pzanoni@mandriva.com>
 Paul Seidler <sepek@exherbo.org> Paul Seidler <pl.seidler@googlemail.com>
 Pekka Paalanen <pekka.paalanen@collabora.co.uk> <ppaalanen@gmail.com>
 Pekka Paalanen <pekka.paalanen@collabora.co.uk> <pq@iki.fi>
 Peter Hutterer <peter.hutterer@who-t.net> <peter@cs.unisa.edu.au>
 Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> pepp <pelloux@gmail.com>
 Pierre Willenbrock <pierre@pirsoft.de> Pierre Willenbrok <pierre@pirsoft.de>
 Quentin Glidic <sardemff7+git@sardemff7.net> <sardemff7@sardemff7.net>
 RALOVICH, Kristóf <tade60@freemail.hu> <kristof.ralovich@gmail.com>
 Richard Li <richardradeon@gmail.com> <RichardZ.Li@amd.com>
 # The next ones are not 100% sure
 Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop3.(none)>
 Richard Li <richardradeon@gmail.com> richard <richard@richard-desktop.(none)>
 Richard Li <richardradeon@gmail.com> root <root@richard-desktop.(none)>
 Richard Sandiford <rsandifo@linux.vnet.ibm.com> <r.sandiford@uk.ibm.com>
 Rob Clark <robclark@freedesktop.org> <Rob Clark robdclark@freedesktop.org>
 Rob Clark <robclark@freedesktop.org> <robdclark@gmail.com>
 Robert Bragg <robert@sixbynine.org> <robert@linux.intel.com>
 Robert Ellison <papillo@vmware.com> <papillo@i965-laptop.(none)>
 Robert Ellison <papillo@vmware.com> <papillo@tungstengraphics.com>
 Robert Hooker <sarvatt@ubuntu.com> <robert.hooker@canonical.com>
 Roland Scheidegger <sroland@vmware.com> <rscheidegger@gmx.ch>
 Roland Scheidegger <sroland@vmware.com> <sroland@tungstengraphics.com>
 Roy Spliet <rspliet@eclipso.eu> <r.spliet@student.tudelft.nl>
 Rune Petersen <rune@megahurts.dk> Rune Peterson <rune@megahurts.dk>
 Ryan Houdek <sonicadvance1@gmail.com> <Sonicadvance1@gmail.com>
 Sam Hocevar <sam@hocevar.net> Sam Hocevar <sam@zoy.org>
 Samuel Iglesias Gonsálvez <siglesias@igalia.com> Samuel Iglesias Gonsalvez <siglesias@igalia.com>
 Sean D'Epagnier <sean@depagnier.com> <geckosenator@freedesktop.org>
 Serge Martin <edb+mesa@sigluy.net> Serge Martin (EdB) <edb+mesa@sigluy.net>
 Serge Martin <edb+mesa@sigluy.net> EdB <edb+mesa@sigluy.net>
 Sinclair Yeh <syeh@vmware.com> <sinclair.yeh@intel.com>
 Stefan Brüns <stefan.bruens@rwth-aachen.de> <Stefan.Bruens@rwth-aachen.de>
 Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <marchesin@icps.u-strasbg.fr>
 Stéphane Marchesin <marcheu@chromium.org> Stephane Marchesin <stephane.marchesin@gmail.com>
 Sven M. Hallberg <pesco@users.sourceforge.net> pesco <pesco>
 Tapani Pälli <tapani.palli@intel.com> <tapani.palli@gmail.com>
 Tapani Pälli <tapani.palli@intel.com> Tapani <tapani.palli@intel.com>
 Thierry Reding <treding@nvidia.com> <thierry@gilfi.de>
 Thierry Reding <treding@nvidia.com> <thierry.reding@avionic-design.de>
 Thierry Vignaud <thierry.vignaud@gmail.com> <tvignaud@mandriva.com>
 Thomas Balling Sørensen <tball@io.dk> <tball@tball-laptop.(none)>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas <thellstrom@vmware.com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thellstrom-at-vmware-dot-com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas-at-tungstengraphics-dot-com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellstrom <thomas@tungstengraphics.com>
 Thomas Hellstrom <thellstrom@vmware.com> Thomas Hellström <thomas@tungstengraphics.com>
 Thomas Tanner <tanner@gmx.net> tanner <tanner>
 Tilman Sauerbeck <tilman@code-monkey.de> <tilman@freedesktop.org>
 Timothy Arceri <timothy.arceri@collabora.com> <t_arceri@yahoo.com.au>
 Timothy Arceri <timothy.arceri@collabora.com> Timothy <t_arceri@yahoo.com.au>
 Tom Fogal <tfogal@alumni.unh.edu> <tfogal@sci.utah.edu>
 Tom Stellard <thomas.stellard@amd.com> <tstellar@gmail.com>
 Tom Stellard <thomas.stellard@amd.com> Thomas Stellard <tom.stellard@amd.com>
 Tormod Volden <debian.tormod@gmail.com> <lists.tormod@gmail.com>
 Török Edwin <edwin+mesa@etorok.net> Török Edvin <edwintorok@gmail.com>
 Török Edwin <edwin+mesa@etorok.net> <edwintorok@gmail.com>
 Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@freedesktop.org>
 Ville Syrjälä <ville.syrjala@linux.intel.com> Ville Syrjala <syrjala@sci.fi>
 Vincent Lejeune <vljn@ovi.com> <peluche.canard@gmail.com>
 Vinson Lee <vlee@freedesktop.org> <vlee@vmware.com>
 Zhenyu Wang <zhenyuw@linux.intel.com> Wang Zhenyu <zhenyu.z.wang@intel.com>
 Zack Rusin <zackr@vmware.com> <zack@kde.org>
 Zack Rusin <zackr@vmware.com> <zack@pixel.(none)>
 Zack Rusin <zackr@vmware.com> <zack@tungstengraphics.com>
 Zhang <zxpmyth@yahoo.com.cn> zhang <zxpmyth@yahoo.com.cn>
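
The entries above take three shapes: a canonical name and address alone, a canonical pair followed by a commit address, and a canonical pair followed by a commit name and commit address. As a rough illustration of how such lines can be read and applied, here is a minimal Python sketch. It is not git's own implementation; the helper names (`parse_entry`, `build_map`, `canonical`) are hypothetical, and lowercasing names and addresses before lookup is a simplifying assumption.

```python
import re

# canonical name/address, optionally followed by a commit name/address
ENTRY = re.compile(
    r"^(?P<name>[^<]*?)\s*<(?P<email>[^>]*)>"
    r"(?:\s*(?P<commit_name>[^<]*?)\s*<(?P<commit_email>.*)>)?\s*$"
)


def parse_entry(line):
    """Parse one mailmap-style line; return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = ENTRY.match(line)
    if match is None:
        return None
    # Empty captures (e.g. no commit name) are normalised to None.
    return {key: (value or None) for key, value in match.groupdict().items()}


def build_map(lines):
    """Index entries by (commit name or None, commit address), lowercased."""
    table = {}
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        commit_email = (entry["commit_email"] or entry["email"]).lower()
        commit_name = entry["commit_name"].lower() if entry["commit_name"] else None
        table[(commit_name, commit_email)] = (entry["name"], entry["email"])
    return table


def canonical(table, name, email):
    """Prefer a name+address match, then fall back to an address-only match."""
    for key in ((name.lower(), email.lower()), (None, email.lower())):
        if key in table:
            proper_name, proper_email = table[key]
            return (proper_name or name, proper_email)
    return (name, email)


SAMPLE = [
    "Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>",
    "# The next ones are not 100% sure",
    "Jerome Glisse <jglisse@redhat.com> John Doe <glisse@barney.(none)>",
]
table = build_map(SAMPLE)
print(canonical(table, "Emil Velikov", "emil.velikov@collabora.com"))
print(canonical(table, "John Doe", "glisse@barney.(none)"))
```

An identity that matches no entry is returned unchanged, which is how an unmapped author would stay as recorded in the commit.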

									
										101

.travis.yml
									
										Normal file
									
												
				@@ -0,0 +1,101 @@

				language: c

				sudo: false

				cache:

				  directories:

				    - $HOME/.ccache

				addons:

				  apt:

				    packages:

				      - libdrm-dev

				      - libudev-dev

				      - x11proto-xf86vidmode-dev

				      - libexpat1-dev

				      - libxcb-dri2-0-dev

				      - libx11-xcb-dev

				      - llvm-3.4-dev

				      - scons

				env:

				  global:

				    - XORG_RELEASES=http://xorg.freedesktop.org/releases/individual

				    - XCB_RELEASES=http://xcb.freedesktop.org/dist

				    - XORGMACROS_VERSION=util-macros-1.19.0

				    - GLPROTO_VERSION=glproto-1.4.17

				    - DRI2PROTO_VERSION=dri2proto-2.8

				    - DRI3PROTO_VERSION=dri3proto-1.0

				    - PRESENTPROTO_VERSION=presentproto-1.0

				    - LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				    - LIBDRM_VERSION=libdrm-2.4.65

				    - XCBPROTO_VERSION=xcb-proto-1.11

				    - LIBXCB_VERSION=libxcb-1.11

				    - LIBXSHMFENCE_VERSION=libxshmfence-1.2

				    - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig

				  matrix:

				    - BUILD=make

				    - BUILD=scons

				install:

				  - export PATH="/usr/lib/ccache:$PATH"

				  - pip install --user mako

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				  - wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				  - tar -jxvf $XORGMACROS_VERSION.tar.bz2

				  - (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2

				  - tar -jxvf $GLPROTO_VERSION.tar.bz2

				  - (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2

				  - tar -jxvf $DRI2PROTO_VERSION.tar.bz2

				  - (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$DRI3PROTO_VERSION.tar.bz2

				  - tar -jxvf $DRI3PROTO_VERSION.tar.bz2

				  - (cd $DRI3PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$PRESENTPROTO_VERSION.tar.bz2

				  - tar -jxvf $PRESENTPROTO_VERSION.tar.bz2

				  - (cd $PRESENTPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				  - tar -jxvf $XCBPROTO_VERSION.tar.bz2

				  - (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				  - tar -jxvf $LIBXCB_VERSION.tar.bz2

				  - (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2

				  - tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2

				  - (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				  - tar -jxvf $LIBDRM_VERSION.tar.bz2

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				  - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				  - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				# Disabled LLVM (and therefore r300 and r600) because the build fails

				# with "undefined reference to `clock_gettime'" and "undefined

				# reference to `setupterm'" in llvmpipe.

				script:

				  - if test "x$BUILD" = xmake; then

				      ./autogen.sh --enable-debug

				        --disable-gallium-llvm

				        --with-egl-platforms=x11,drm

				        --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau

				        --with-gallium-drivers=svga,swrast,vc4,virgl

				        ;

				      make && make check;

				    elif test x$BUILD = xscons; then

				      scons;

				    fi

									
										25

Android.common.mk
									
												
				@@ -21,13 +21,8 @@

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				# use c99 compiler by default

				ifeq ($(LOCAL_CC),)

				ifeq ($(LOCAL_IS_HOST_MODULE),true)

				LOCAL_CC := $(HOST_CC) -std=c99 -D_GNU_SOURCE

				else

				LOCAL_CC := $(TARGET_CC) -std=c99

				endif

				LOCAL_CFLAGS += -D_GNU_SOURCE

				endif

				LOCAL_C_INCLUDES += \

				@@ -37,6 +32,8 @@ LOCAL_C_INCLUDES += \

				MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)

				LOCAL_CFLAGS += \

					-Wno-unused-parameter \

					-Wno-date-time \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

				@@ -57,14 +54,18 @@ LOCAL_CFLAGS += \

					-DHAVE___BUILTIN_CLZLL \

					-DHAVE___BUILTIN_UNREACHABLE \

					-DHAVE_PTHREAD=1 \

					-DHAVE_DLOPEN \

					-fvisibility=hidden \

					-Wno-sign-compare

				# mesa requires at least c99 compiler

				LOCAL_CONLYFLAGS += \

					-std=c99

				ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

					-DUSE_X86_ASM \

					-DHAVE_DLOPEN \

				endif

				endif

				@@ -82,9 +83,19 @@ LOCAL_CPPFLAGS += \

					-Wno-error=non-virtual-dtor \

					-Wno-non-virtual-dtor

				ifeq ($(MESA_LOLLIPOP_BUILD),true)

				  LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				  LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"

				else

				  LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"

				endif

				# uncomment to keep the debug symbols

				#LOCAL_STRIP_MODULE := false

				ifeq ($(strip $(LOCAL_MODULE_TAGS)),)

				LOCAL_MODULE_TAGS := optional

				endif

				# Quiet down the build system and remove any .h files from the sources

				LOCAL_SRC_FILES := $(patsubst %.h, , $(LOCAL_SRC_FILES))
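
The `ANDROID_VERSION` define in the hunk above is built by pasting the major and minor numbers into a hex literal (the comment gives `4.0.x => 0x0400`). The following Python sketch mimics that text concatenation to show what value results. The function name is hypothetical, and the remark about two-digit components is an observation about the string pattern, not something the diff itself states.

```python
def android_version_define(major, minor):
    """Mimic -DANDROID_VERSION=0x0$(MAJOR)0$(MINOR) as plain text pasting."""
    return "0x0{}0{}".format(major, minor)


# The example from the makefile comment: 4.0.x => 0x0400
print(android_version_define(4, 0))

# The literal parses as an ordinary hexadecimal integer.
print(int(android_version_define(5, 1), 16))

# With a two-digit component the pasted literal gains an extra digit,
# so the four-hex-digit layout only holds for single-digit numbers.
print(android_version_define(10, 0))
```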

									
										17

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 vmwgfx

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 virgl vmwgfx

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -42,11 +42,15 @@ $(call local-intermediates-dir)

				endef

				endif

				MESA_DRI_MODULE_REL_PATH := dri

				MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				classic_drivers := i915 i965

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4 virgl

				MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

				@@ -84,18 +88,21 @@ MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)

				ifneq ($(strip $(MESA_GPU_DRIVERS)),)

				SUBDIRS := \

					src/gbm \

					src/loader \

					src/mapi \

					src/glsl \

					src/compiler \

					src/mesa \

					src/util \

					src/egl \

					src/mesa/drivers/dri

				INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

				ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)

				SUBDIRS += src/gallium

				INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)

				endif

				include $(call all-named-subdir-makefiles,$(SUBDIRS))

				include $(INC_DIRS)

				endif

									
										14

Makefile.am
									
												
				@@ -22,20 +22,29 @@

				SUBDIRS = src

				AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-dri \

					--enable-dri3 \

					--enable-egl \

					--enable-gallium-tests \

					--enable-gallium-osmesa \

					--enable-gallium-llvm \

					--enable-gbm \

					--enable-gles1 \

					--enable-gles2 \

					--enable-glx \

					--enable-glx-tls \

					--enable-nine \

					--enable-opencl \

					--enable-opengl \

					--enable-va \

					--enable-vdpau \

					--enable-xa \

					--enable-xvmc \

					--disable-llvm-shared-libs \

					--with-egl-platforms=x11,wayland,drm \

					--with-egl-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast

					--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl \

					--with-vulkan-drivers=intel

				ACLOCAL_AMFLAGS = -I m4

				@@ -51,7 +60,6 @@ noinst_HEADERS = \

					include/c99_alloca.h \

					include/c99_compat.h \

					include/c99_math.h \

					include/c99 \

					include/c11 \

					include/D3D9 \

					include/HaikuGL \

106

REVIEWERS Normal file


@@ -0,0 +1,106 @@
 Overview:
 	This file is similar in syntax to (or, more precisely, a subset of) what is
 	used by the MAINTAINERS file in the linux kernel.  Some fields do not
 	apply, for example, in all cases, send patches to:
 		mesa-dev@lists.freedesktop.org
 	and in all cases the patchwork instance is:
 		https://patchwork.freedesktop.org/project/mesa/
 	The purpose is not exactly the same as that of the MAINTAINERS file in the linux
 	kernel, as there are no official/formal maintainers of different
 	subsystems in mesa, but is meant to give an idea of who to CC for
 	various patches for review, and to allow the use of
 	scripts/get_reviewer.pl as git --cc-cmd.
 Usage:
 	When sending patches:
 		git send-email --cc-cmd ./scripts/get_reviewer.pl ...
 	Or to configure as default:
 		git config sendemail.cccmd ./scripts/get_reviewer.pl
 Descriptions of section entries:
 	R: Designated reviewer: FullName <address@domain>
 	   These reviewers should be CCed on patches.
 	F: Files and directories with wildcard patterns.
 	   A trailing slash includes all files and subdirectory files.
 	   F:	drivers/net/	all files in and below drivers/net
 	   F:	drivers/net/*	all files in drivers/net, but not below
 	   F:	*/net/*		all files in "any top level directory"/net
 	   One pattern per line.  Multiple F: lines acceptable.
 	N: Files and directories with regex patterns.
 	   N:	[^a-z]tegra	all files whose path contains the word tegra
 	   One pattern per line.  Multiple N: lines acceptable.
 	   scripts/get_maintainer.pl has different behavior for files that
 	   match F: pattern and matches of N: patterns.  By default,
 	   get_maintainer will not look at git log history when an F: pattern
 	   match occurs.  When an N: match occurs, git log history is used
 	   to also notify the people that have git commit signatures.
 Maintainers List (try to look for most precise areas first)
 Note: this is an opt-in system, I have not tried to add anyone who hasn't
 either asked me or sent a patch to add themselves.
 		-----------------------------------
 NIR
 R:	Jason Ekstrand <jason@jlekstrand.net>
 F:	src/compiler/nir/
 DOCUMENTATION
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: docs/
 F: doxygen/
 COMPATIBILITY HEADERS
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: include/c99*
 DRI LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/loader/
 GALLIUM LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/auxiliary/pipe-loader/
 F: src/gallium/auxiliary/target-helpers/
 GALLIUM TARGETS
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/gallium/targets/
 AUTOCONF BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: configure.ac
 F: */Automake.inc
 F: */Makefile.*am
 F: */Makefile.sources
 SCONS BUILD
 F: scons/
 F: */SConscript*
 F: */Makefile.sources
 ANDROID BUILD
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: CleanSpec.mk
 F: */Android.*mk
 F: */Makefile.sources
 WAYLAND EGL SUPPORT
 R: Daniel Stone <daniels@collabora.com>
 F: src/egl/wayland/*
 F: src/egl/drivers/dri2/platform_wayland.c
 FREEDRENO
 R:	Rob Clark <robclark@freedesktop.org>
 F:	src/gallium/drivers/freedreno/
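
The `F:` and `N:` rules described in the REVIEWERS file above can be sketched in Python as follows. This is only an illustration of the documented semantics (a trailing slash covers everything below, other wildcards stay at the same directory depth, `N:` is a regular-expression search); it is not the logic of `scripts/get_reviewer.pl`, and the function names and the dictionary layout are assumptions made for the example.

```python
import re


def glob_to_regex(pattern):
    """Translate an F: wildcard pattern into regex source text."""
    escaped = re.escape(pattern)
    # re.escape turns '*' into '\*' and '?' into '\?'; map them back to wildcards.
    return escaped.replace(r"\*", ".*").replace(r"\?", ".")


def f_matches(path, pattern):
    """Return True if `path` falls under the F: wildcard `pattern`."""
    if re.match(glob_to_regex(pattern), path) is None:
        return False
    if pattern.endswith("/"):
        # Trailing slash: everything in and below the directory.
        return True
    # Otherwise the match must stay at the same directory depth.
    return path.count("/") == pattern.count("/")


def n_matches(path, regex):
    """Return True if the N: regular expression occurs anywhere in `path`."""
    return re.search(regex, path) is not None


def reviewers_for(path, sections):
    """Collect the R: entries of every section whose F: or N: lines match."""
    found = []
    for section in sections:
        hit = any(f_matches(path, p) for p in section.get("F", []))
        hit = hit or any(n_matches(path, r) for r in section.get("N", []))
        if hit:
            found.extend(section.get("R", []))
    return found


# Two sections copied from the list above, written as plain dictionaries.
SECTIONS = [
    {"R": ["Jason Ekstrand <jason@jlekstrand.net>"], "F": ["src/compiler/nir/"]},
    {"R": ["Emil Velikov <emil.l.velikov@gmail.com>"], "F": ["include/c99*"]},
]

print(reviewers_for("src/compiler/nir/nir.h", SECTIONS))
print(reviewers_for("include/c99_math.h", SECTIONS))
```

Comparing the number of slashes is what separates `drivers/net/*` (files directly in the directory) from `drivers/net/` (the whole subtree).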

									
										19

SConstruct
									
												
				@@ -1,7 +1,7 @@

				#######################################################################

				# Top-level SConstruct

				#

				# For example, invoke scons as 

				# For example, invoke scons as

				#

				#   scons build=debug llvm=yes machine=x86

				#

				@@ -12,13 +12,13 @@

				#   build='debug'

				#   llvm=True

				#   machine='x86'

				# 

				#

				# Invoke

				#

				#   scons -h

				#

				# to get the full list of options. See scons manpage for more info.

				#  

				#

				import os

				import os.path

				@@ -36,7 +36,7 @@ common.AddOptions(opts)

				env = Environment(

					options = opts,

					tools = ['gallium'],

					toolpath = ['#scons'],	

					toolpath = ['#scons'],

					ENV = os.environ,

				)

				@@ -53,7 +53,7 @@ else:

				    print 'scons: warning: targets option is deprecated; pass the targets on their own such as'

				    print

				    print '  scons %s' % ' '.join(targets)

				    print 

				    print

				    COMMAND_LINE_TARGETS.append(targets)

				@@ -84,9 +84,14 @@ env.Append(CPPPATH = [

				#print env.Dump()

				# Add a check target for running tests

				check = env.Alias('check')

				env.AlwaysBuild(check)

				#######################################################################

				# Invoke host SConscripts 

				# 

				# Invoke host SConscripts

				#

				# For things that are meant to be run on the native host build machine, instead

				# of the target machine.

				#

2

VERSION


@@ -1 +1 @@
 .1.1
 .3.0-devel

									
										76

appveyor.yml
									
										Normal file
									
												
				@@ -0,0 +1,76 @@

				# http://www.appveyor.com/docs/appveyor-yml

				#

				# To setup AppVeyor for your own personal repositories do the following:

				# - Sign up

				# - Add a new project

				# - Select Git and fill in the Git clone URL

				# - Setup a Git hook as explained in

				#   https://github.com/appveyor/webhooks#installing-git-hook

				# - Check 'Settings > General > Skip branches without appveyor.yml'

				# - Check 'Settings > General > Rolling builds'

				# - Setup the global or project notifications to your liking

				#

				# Note that kicking (or restarting) a build via the web UI will not work, as it

				# will fail to find appveyor.yml .  The Git hook is the most practical way to

				# kick a build.

				#

				# See also:

				# - http://help.appveyor.com/discussions/problems/2209-node-grunt-build-specify-a-project-or-solution-file-the-directory-does-not-contain-a-project-or-solution-file

				# - http://help.appveyor.com/discussions/questions/1184-build-config-vs-appveyoryaml

				version: '{build}'

				branches:

				  except:

				  - /^travis.*$/

				# Don't download the full Mesa history to speed up cloning.  However the clone

				# depth must not be too small, otherwise builds might fail when lots of patches

				# are committed in succession, because the desired commit is not found on the

				# truncated history.

				#

				# See also:

				# - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories

				clone_depth: 100

				cache:

				- win_flex_bison-2.4.5.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				install:

				# Check pip

				- python --version

				- python -m pip --version

				# Install Mako

				- python -m pip install --egg Mako

				# Install SCons

				- python -m pip install --egg scons==2.4.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

				- win_bison --version

				# Download and extract LLVM

				- if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"

				- 7z x -y "%LLVM_ARCHIVE%" > nul

				- mkdir llvm\bin

				- set LLVM=%CD%\llvm

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1

				after_build:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check

				# It's possible to setup notification here, as described in

				# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

				# doing so would cause the notification settings to be replicated across all

				# repos, which is most likely undesired.  So it's better to rely on the

				# Appveyor global/project notification settings.

5

bin/.cherry-ignore


@@ -1,5 +0,0 @@
 # As per Marek http://lists.freedesktop.org/archives/mesa-stable/2015-December/003600.html
 c4fd7b1ec679d10992b42a2811cab8245a5 Revert "radeonsi: disable DCC on Stoney"
 # causes regression in xwayland, kde/plasma, mpv, steam ... fdo#92759
 839793680f99b8387bee9489733d5071c10f3ace i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals

									
										35

bin/get-extra-pick-list.sh
									
										Executable file
									
												
				@@ -0,0 +1,35 @@

				#!/bin/sh

				# Script for generating a list of candidates which fix commits that have been

				# previously cherry-picked to a stable branch.

				#

				# Usage examples:

				#

				# $ bin/get-extra-pick-list.sh

				# $ bin/get-extra-pick-list.sh > picklist

				# $ bin/get-extra-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				# XXX: there should be a better way for this

				latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\

					cut -c -8 |\

				while read sha

				do

					# Check if the original commit is referenced in master

					git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\

						cut -c -8 |\

					while read candidate

					do

						# Check if the potential fix, hasn't landed in branch yet.

						found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`

						if test $found = 0

						then

							echo Commit $candidate might need to be picked, as it references $sha

						fi

					done

				done

									
										1

common.py
									
												
@@ -97,6 +97,7 @@ def AddOptions(opts):
     opts.Add(BoolOption('embedded', 'embedded build', 'no'))
     opts.Add(BoolOption('analyze',
                         'enable static code analysis where available', 'no'))
     opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))
     opts.Add('toolchain', 'compiler toolchain', default_toolchain)
     opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',
                         'no'))

485

configure.ac

@@ -68,13 +68,13 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_REQUIRED=2.4.66
 LIBDRM_RADEON_REQUIRED=2.4.56
 LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.33
 LIBDRM_NOUVEAU_REQUIRED=2.4.62
 LIBDRM_FREEDRENO_REQUIRED=2.4.65
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.67
 DRI2PROTO_REQUIRED=2.6
 DRI3PROTO_REQUIRED=1.0
 PRESENTPROTO_REQUIRED=1.0
@@ -99,6 +99,7 @@ AM_PROG_CC_C_O
 AM_PROG_AS
 AX_CHECK_GNU_MAKE
 AC_CHECK_PROGS([PYTHON2], [python2.7 python2 python])
 AC_CHECK_PROGS([PYTHON3], [python3.5 python3.4 python3])
 AC_PROG_SED
 AC_PROG_MKDIR_P
@@ -110,10 +111,10 @@ LT_INIT([disable-static])
 AC_CHECK_PROG(RM, rm, [rm -f])
 AX_PROG_BISON([],
               AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-parse.c"],
               AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-parse.c"],
                     [AC_MSG_ERROR([bison not found - unable to compile glcpp-parse.y])]))
 AX_PROG_FLEX([],
              AS_IF([test ! -f "$srcdir/src/glsl/glcpp/glcpp-lex.c"],
              AS_IF([test ! -f "$srcdir/src/compiler/glsl/glcpp/glcpp-lex.c"],
                    [AC_MSG_ERROR([flex not found - unable to compile glcpp-lex.l])]))
 AC_CHECK_PROG(INDENT, indent, indent, cat)
@@ -141,6 +142,12 @@ else
     fi
 fi
 if test -z "$PYTHON3"; then
     if test ! -f "$srcdir/src/intel/genxml/gen9_pack.h"; then
         AC_MSG_ERROR([Python3 not found - unable to generate sources])
     fi
 fi
 AC_PROG_INSTALL
 dnl We need a POSIX shell for parts of the build. Assume we have one
@@ -197,6 +204,13 @@ if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
     fi
 fi
 dnl We don't support building Mesa with Sun C compiler
 dnl https://bugs.freedesktop.org/show_bug.cgi?id=93189
 AC_CHECK_DECL([__SUNPRO_C], [SUNCC=yes], [SUNCC=no])
 if test "x$SUNCC" = xyes; then
     AC_MSG_ERROR([Building with Sun C compiler is not supported, use GCC instead.])
 fi
 dnl Check for compiler builtins
 AX_GCC_BUILTIN([__builtin_bswap32])
 AX_GCC_BUILTIN([__builtin_bswap64])
@@ -216,8 +230,10 @@ AX_GCC_FUNC_ATTRIBUTE([format])
 AX_GCC_FUNC_ATTRIBUTE([malloc])
 AX_GCC_FUNC_ATTRIBUTE([packed])
 AX_GCC_FUNC_ATTRIBUTE([pure])
 AX_GCC_FUNC_ATTRIBUTE([returns_nonnull])
 AX_GCC_FUNC_ATTRIBUTE([unused])
 AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
 AX_GCC_FUNC_ATTRIBUTE([weak])
 AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
@@ -238,9 +254,13 @@ _SAVE_LDFLAGS="$LDFLAGS"
 _SAVE_CPPFLAGS="$CPPFLAGS"
 dnl Compiler macros
 DEFINES="-D__STDC_LIMIT_MACROS"
 DEFINES="-D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS"
 AC_SUBST([DEFINES])
 android=no
 case "$host_os" in
 *-android)
     android=yes
     ;;
 linux*|*-gnu*|gnu*)
     DEFINES="$DEFINES -D_GNU_SOURCE"
     ;;
@@ -252,6 +272,8 @@ cygwin*)
     ;;
 esac
 AM_CONDITIONAL(HAVE_ANDROID, test "x$android" = xyes)
 dnl Add flags for gcc and g++
 if test "x$GCC" = xyes; then
     CFLAGS="$CFLAGS -Wall"
@@ -298,8 +320,7 @@ if test "x$GCC" = xyes; then
     # Flags to help ensure that certain portions of the code -- and only those
     # portions -- can be built with MSVC:
     # - src/util, src/gallium/auxiliary, and src/gallium/drivers/llvmpipe needs
     #   to build with Windows SDK 7.0.7600, which bundles MSVC 2008
     # - src/util, src/gallium/auxiliary, rc/gallium/drivers/llvmpipe, and
     # - non-Linux/Posix OpenGL portions needs to build on MSVC 2013 (which
     #   supports most of C99)
     # - the rest has no compiler compiler restrictions
@@ -316,9 +337,6 @@ if test "x$GCC" = xyes; then
 		    AC_MSG_RESULT([yes])],
 		    AC_MSG_RESULT([no]));
     CFLAGS="$save_CFLAGS"
     MSVC2008_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=declaration-after-statement"
     MSVC2008_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS"
 fi
 if test "x$GXX" = xyes; then
     CXXFLAGS="$CXXFLAGS -Wall"
@@ -346,8 +364,6 @@ fi
 AC_SUBST([MSVC2013_COMPAT_CFLAGS])
 AC_SUBST([MSVC2013_COMPAT_CXXFLAGS])
 AC_SUBST([MSVC2008_COMPAT_CFLAGS])
 AC_SUBST([MSVC2008_COMPAT_CXXFLAGS])
 dnl even if the compiler appears to support it, using visibility attributes isn't
 dnl going to do anything useful currently on cygwin apart from emit lots of warnings
@@ -389,6 +405,61 @@ fi
 AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
 AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
 dnl Check for Endianness
 AC_C_BIGENDIAN(
    little_endian=no,
    little_endian=yes,
    little_endian=no,
    little_endian=no
 )
 dnl Check for POWER8 Architecture
 PWR8_CFLAGS="-mpower8-vector"
 have_pwr8_intrinsics=no
 AC_MSG_CHECKING(whether gcc supports -mpower8-vector)
 save_CFLAGS=$CFLAGS
 CFLAGS="$PWR8_CFLAGS $CFLAGS"
 AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
 #if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
 #error "Need GCC >= 4.8 for sane POWER8 support"
 #endif
 #include <altivec.h>
 int main () {
     vector unsigned char r;
     vector unsigned int v = vec_splat_u32 (1);
     r = __builtin_vec_vgbbd ((vector unsigned char) v);
     return 0;
 }]])], have_pwr8_intrinsics=yes)
 CFLAGS=$save_CFLAGS
 AC_ARG_ENABLE(pwr8,
    [AC_HELP_STRING([--disable-pwr8-inst],
                    [disable POWER8-specific instructions])],
    [enable_pwr8=$enableval], [enable_pwr8=auto])
 if test "x$enable_pwr8" = xno ; then
    have_pwr8_intrinsics=disabled
 fi
 if test $have_pwr8_intrinsics = yes && test $little_endian = yes ; then
    DEFINES="$DEFINES -D_ARCH_PWR8"
    CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
    CFLAGS="$CFLAGS $PWR8_CFLAGS"
 else
    PWR8_CFLAGS=
 fi
 AC_MSG_RESULT($have_pwr8_intrinsics)
 if test "x$enable_pwr8" = xyes && test $have_pwr8_intrinsics = no ; then
    AC_MSG_ERROR([POWER8 compiler support not detected])
 fi
 if test $have_pwr8_intrinsics = yes && test $little_endian = no ; then
    AC_MSG_WARN([POWER8 optimization is enabled only on POWER8 Little-Endian])
 fi
 AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS)
 dnl Can't have static and shared libraries, default to static if user
 dnl explicitly requested. If both disabled, set to static since shared
 dnl was explicitly requested.
@@ -414,8 +485,29 @@ AC_ARG_ENABLE([debug],
     [enable_debug="$enableval"],
     [enable_debug=no]
 )
 AC_ARG_ENABLE([profile],
     [AS_HELP_STRING([--enable-profile],
         [enable profiling of code @<:@default=disabled@:>@])],
     [enable_profile="$enableval"],
     [enable_profile=no]
 )
 if test "x$enable_profile" = xyes; then
     DEFINES="$DEFINES -DPROFILE"
     if test "x$GCC" = xyes; then
         CFLAGS="$CFLAGS -fno-omit-frame-pointer"
     fi
     if test "x$GXX" = xyes; then
         CXXFLAGS="$CXXFLAGS -fno-omit-frame-pointer"
     fi
 fi
 if test "x$enable_debug" = xyes; then
     DEFINES="$DEFINES -DDEBUG"
     if test "x$enable_profile" = xyes; then
         AC_MSG_WARN([Debug and Profile are enabled at the same time])
     fi
     if test "x$GCC" = xyes; then
         if ! echo "$CFLAGS" | grep -q -e '-g'; then
             CFLAGS="$CFLAGS -g"
@@ -436,6 +528,8 @@ else
    DEFINES="$DEFINES -DNDEBUG"
 fi
 DEFAULT_GL_LIB_NAME=GL
 dnl
 dnl Check if linker supports -Bsymbolic
 dnl
@@ -533,6 +627,23 @@ esac
 AM_CONDITIONAL(HAVE_COMPAT_SYMLINKS, test "x$HAVE_COMPAT_SYMLINKS" = xyes)
 DEFAULT_GL_LIB_NAME=GL
 dnl
 dnl Libglvnd configuration
 dnl
 AC_ARG_ENABLE([libglvnd],
     [AS_HELP_STRING([--enable-libglvnd],
         [Build for libglvnd @<:@default=disabled@:>@])],
     [enable_libglvnd="$enableval"],
     [enable_libglvnd=no])
 AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
 #AM_COND_IF([USE_LIBGLVND_GLX], [DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"])
 if test "x$enable_libglvnd" = xyes ; then
     DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
     DEFAULT_GL_LIB_NAME=GLX_mesa
 fi
 dnl
 dnl library names
 dnl
@@ -570,13 +681,13 @@ AC_ARG_WITH([gl-lib-name],
   [AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@],
     [specify GL library name @<:@default=GL@:>@])],
   [GL_LIB=$withval],
   [GL_LIB=GL])
   [GL_LIB="$DEFAULT_GL_LIB_NAME"])
 AC_ARG_WITH([osmesa-lib-name],
   [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
     [specify OSMesa library name @<:@default=OSMesa@:>@])],
   [OSMESA_LIB=$withval],
   [OSMESA_LIB=OSMesa])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB=GL])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
 AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
 dnl
@@ -627,8 +738,10 @@ test "x$enable_asm" = xno && AC_MSG_RESULT([no])
 if test "x$enable_asm" = xyes -a "x$cross_compiling" = xyes; then
     case "$host_cpu" in
     i?86 | x86_64 | amd64)
         enable_asm=no
         AC_MSG_RESULT([no, cross compiling])
         if test "x$host_cpu" != "x$target_cpu"; then
             enable_asm=no
             AC_MSG_RESULT([no, cross compiling])
         fi
         ;;
     esac
 fi
@@ -719,6 +832,10 @@ dnl to -pthread, which causes problems if we need -lpthread to appear in
 dnl pkgconfig files.
 test -z "$PTHREAD_LIBS" && PTHREAD_LIBS="-lpthread"
 PKG_CHECK_MODULES(PTHREADSTUBS, pthread-stubs)
 AC_SUBST(PTHREADSTUBS_CFLAGS)
 AC_SUBST(PTHREADSTUBS_LIBS)
 dnl SELinux awareness.
 AC_ARG_ENABLE([selinux],
     [AS_HELP_STRING([--enable-selinux],
@@ -779,8 +896,8 @@ AC_ARG_ENABLE([dri3],
     [enable_dri3="$enableval"],
     [enable_dri3="$dri3_default"])
 AC_ARG_ENABLE([glx],
     [AS_HELP_STRING([--enable-glx],
         [enable GLX library @<:@default=enabled@:>@])],
     [AS_HELP_STRING([--enable-glx@<:@=dri|xlib|gallium-xlib@:>@],
         [enable the GLX library and choose an implementation @<:@default=auto@:>@])],
     [enable_glx="$enableval"],
     [enable_glx=yes])
 AC_ARG_ENABLE([osmesa],
@@ -846,17 +963,6 @@ AC_ARG_ENABLE([opencl_icd],
            @<:@default=disabled@:>@])],
     [enable_opencl_icd="$enableval"],
     [enable_opencl_icd=no])
 AC_ARG_ENABLE([xlib-glx],
     [AS_HELP_STRING([--enable-xlib-glx],
         [make GLX library Xlib-based instead of DRI-based @<:@default=disabled@:>@])],
     [enable_xlib_glx="$enableval"],
     [enable_xlib_glx=no])
 AC_ARG_ENABLE([r600-llvm-compiler],
     [AS_HELP_STRING([--enable-r600-llvm-compiler],
         [Enable experimental LLVM backend for graphics shaders @<:@default=disabled@:>@])],
     [enable_r600_llvm="$enableval"],
     [enable_r600_llvm=no])
 AC_ARG_ENABLE([gallium-tests],
     [AS_HELP_STRING([--enable-gallium-tests],
@@ -915,36 +1021,85 @@ AM_CONDITIONAL(NEED_OPENGL_COMMON, test "x$enable_opengl" = xyes -o \
                                         "x$enable_gles1" = xyes -o \
                                         "x$enable_gles2" = xyes)
 if test "x$enable_glx" = xno; then
     AC_MSG_WARN([GLX disabled, disabling Xlib-GLX])
     enable_xlib_glx=no
 # Validate GLX options
 if test "x$enable_glx" = xyes; then
     if test "x$enable_dri" = xyes; then
         enable_glx=dri
     elif test -n "$with_gallium_drivers"; then
         enable_glx=gallium-xlib
     else
         enable_glx=xlib
     fi
 fi
 case "x$enable_glx" in
 xdri | xxlib | xgallium-xlib)
     # GLX requires OpenGL
     if test "x$enable_opengl" = xno; then
         AC_MSG_ERROR([GLX cannot be built without OpenGL])
     fi
 if test "x$enable_dri$enable_xlib_glx" = xyesyes; then
     AC_MSG_ERROR([DRI and Xlib-GLX cannot be built together])
     # Check individual dependencies
     case "x$enable_glx" in
     xdri)
         if test "x$enable_dri" = xno; then
             AC_MSG_ERROR([DRI-based GLX requires DRI to be enabled])
         fi
         ;;
     xxlib)
         if test "x$enable_dri" = xyes; then
             AC_MSG_ERROR([Xlib-based GLX cannot be built with DRI enabled])
         fi
         ;;
     xgallium-xlib )
         if test "x$enable_dri" = xyes; then
             AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built with DRI enabled])
         fi
         if test -z "$with_gallium_drivers"; then
             AC_MSG_ERROR([Xlib-based (Gallium) GLX cannot be built without Gallium enabled])
         fi
         ;;
     esac
     ;;
 xno)
     ;;
 *)
     AC_MSG_ERROR([Illegal value for --enable-glx: $enable_glx])
     ;;
 esac
 AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xdri)
 AM_CONDITIONAL(HAVE_XLIB_GLX, test "x$enable_glx" = xxlib)
 AM_CONDITIONAL(HAVE_GALLIUM_XLIB_GLX, test "x$enable_glx" = xgallium-xlib)
 dnl
 dnl Libglvnd configuration
 dnl
 AC_ARG_ENABLE([libglvnd],
     [AS_HELP_STRING([--enable-libglvnd],
         [Build for libglvnd @<:@default=disabled@:>@])],
     [enable_libglvnd="$enableval"],
     [enable_libglvnd=no])
 AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes)
 if test "x$enable_libglvnd" = xyes ; then
     dnl XXX: update once we can handle more than libGL/glx.
     dnl Namely: we should error out if neither of the glvnd enabled libraries
     dnl are built
     case "x$enable_glx" in
     xno)
         AC_MSG_ERROR([cannot build libglvnd without GLX])
         ;;
     xxlib | xgallium-xlib )
         AC_MSG_ERROR([cannot build libgvnd when Xlib-GLX or Gallium-Xlib-GLX is enabled])
         ;;
     xdri)
         ;;
     esac
     PKG_CHECK_MODULES([GLVND], libglvnd >= 0.1.0)
     DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"
     DEFAULT_GL_LIB_NAME=GLX_mesa
 fi
 if test "x$enable_opengl$enable_xlib_glx" = xnoyes; then
     AC_MSG_ERROR([Xlib-GLX cannot be built without OpenGL])
 fi
 # Disable GLX if OpenGL is not enabled
 if test "x$enable_glx$enable_opengl" = xyesno; then
     AC_MSG_WARN([OpenGL not enabled, disabling GLX])
     enable_glx=no
 fi
 # Disable GLX if DRI and Xlib-GLX are not enabled
 if test "x$enable_glx" = xyes -a \
         "x$enable_dri" = xno -a \
         "x$enable_xlib_glx" = xno; then
     AC_MSG_WARN([Neither DRI nor Xlib-GLX enabled, disabling GLX])
     enable_glx=no
 fi
 AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xyes -a \
                                   "x$enable_dri" = xyes)
 # Check for libdrm
 PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
                   [have_libdrm=yes], [have_libdrm=no])
@@ -999,10 +1154,6 @@ dnl
 dnl Driver specific build directories
 dnl
 if test -n "$with_gallium_drivers" -a "x$enable_glx$enable_xlib_glx" = xyesyes; then
     NEED_WINSYS_XLIB="yes"
 fi
 if test "x$enable_gallium_osmesa" = xyes; then
     if ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
         AC_MSG_ERROR([gallium_osmesa requires the gallium swrast driver])
@@ -1195,8 +1346,8 @@ AC_ARG_ENABLE([driglx-direct],
 dnl
 dnl libGL configuration per driver
 dnl
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xxlib | xgallium-xlib)
     # Xlib-based GLX
     dri_modules="x11 xext xcb"
     PKG_CHECK_MODULES([XLIBGL], [$dri_modules])
@@ -1206,7 +1357,7 @@ xyesyes)
     GL_LIB_DEPS="$GL_LIB_DEPS $SELINUX_LIBS -lm $PTHREAD_LIBS $DLOPEN_LIBS"
     GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $SELINUX_LIBS -lm $PTHREAD_LIBS"
     ;;
 xyesno)
 xdri)
     # DRI-based GLX
     PKG_CHECK_MODULES([GLPROTO], [glproto >= $GLPROTO_REQUIRED])
@@ -1233,7 +1384,7 @@ xyesno)
             if test x"$enable_dri3" = xyes; then
                PKG_CHECK_EXISTS([xcb >= $XCB_REQUIRED], [], AC_MSG_ERROR([DRI3 requires xcb >= $XCB_REQUIRED]))
                dri3_modules="xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                dri3_modules="xcb xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
             fi
         fi
@@ -1295,11 +1446,11 @@ AC_SUBST([HAVE_XF86VIDMODE])
 dnl
 dnl More GLX setup
 dnl
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xxlib | xgallium-xlib)
     DEFINES="$DEFINES -DUSE_XSHM"
     ;;
 xyesno)
 xdri)
     DEFINES="$DEFINES -DGLX_INDIRECT_RENDERING"
     if test "x$driglx_direct" = xyes; then
         DEFINES="$DEFINES -DGLX_DIRECT_RENDERING"
@@ -1472,8 +1623,58 @@ if test -n "$with_dri_drivers"; then
     DRI_DIRS=`echo $DRI_DIRS|tr " " "\n"|sort -u|tr "\n" " "`
 fi
 #
 # Vulkan driver configuration
 #
 AC_ARG_WITH([vulkan-drivers],
     [AS_HELP_STRING([--with-vulkan-drivers@<:@=DIRS...@:>@],
         [comma delimited Vulkan drivers list, e.g.
         "intel"
         @<:@default=no@:>@])],
     [with_vulkan_drivers="$withval"],
     [with_vulkan_drivers="no"])
 # Doing '--without-vulkan-drivers' will set this variable to 'no'.  Clear it
 # here so that the script doesn't choke on an unknown driver name later.
 case "x$with_vulkan_drivers" in
     xyes) with_vulkan_drivers="$VULKAN_DRIVERS_DEFAULT" ;;
     xno) with_vulkan_drivers='' ;;
 esac
 AC_ARG_WITH([vulkan-icddir],
     [AS_HELP_STRING([--with-vulkan-icddir=DIR],
         [directory for the Vulkan driver icd files @<:@${sysconfdir}/vulkan/icd.d@:>@])],
     [VULKAN_ICD_INSTALL_DIR="$withval"],
     [VULKAN_ICD_INSTALL_DIR='${sysconfdir}/vulkan/icd.d'])
 AC_SUBST([VULKAN_ICD_INSTALL_DIR])
 if test -n "$with_vulkan_drivers"; then
     VULKAN_DRIVERS=`IFS=', '; echo $with_vulkan_drivers`
     for driver in $VULKAN_DRIVERS; do
         case "x$driver" in
         xintel)
             if test "x$HAVE_I965_DRI" != xyes; then
                 AC_MSG_ERROR([Intel Vulkan driver requires the i965 dri driver])
             fi
             if test "x$with_sha1" == "x"; then
                 AC_MSG_ERROR([Intel Vulkan driver requires SHA1])
             fi
             HAVE_INTEL_VULKAN=yes;
             ;;
         *)
             AC_MSG_ERROR([Vulkan driver '$driver' does not exist])
             ;;
         esac
     done
     VULKAN_DRIVERS=`echo $VULKAN_DRIVERS|tr " " "\n"|sort -u|tr "\n" " "`
 fi
 AM_CONDITIONAL(NEED_MEGADRIVER, test -n "$DRI_DIRS")
 AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_xlib_glx" = xyes -o \
 AM_CONDITIONAL(NEED_LIBMESA, test "x$enable_glx" = xxlib -o \
                                   "x$enable_osmesa" = xyes -o \
                                   -n "$DRI_DIRS")
@@ -1488,7 +1689,7 @@ AC_ARG_WITH([osmesa-bits],
     [osmesa_bits="$withval"],
     [osmesa_bits=8])
 if test "x$osmesa_bits" != x8; then
     if test "x$enable_dri" = xyes -o "x$enable_glx" = xyes; then
     if test "x$enable_dri" = xyes -o "x$enable_glx" != xno; then
         AC_MSG_WARN([Ignoring OSMesa channel bits because of non-OSMesa driver])
         osmesa_bits=8
     fi
@@ -1644,7 +1845,12 @@ if test "x$enable_xvmc" = xyes -o \
         "x$enable_vdpau" = xyes -o \
         "x$enable_omx" = xyes -o \
         "x$enable_va" = xyes; then
     PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     if test x"$enable_dri3" = xyes; then
         PKG_CHECK_MODULES([VL], [xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED
                                  x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     else
         PKG_CHECK_MODULES([VL], [x11-xcb xcb xcb-dri2 >= $XCBDRI2_REQUIRED])
     fi
     need_gallium_vl_winsys=yes
 fi
 AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
@@ -1658,6 +1864,7 @@ AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
 if test "x$enable_vdpau" = xyes; then
     PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED])
     gallium_st="$gallium_st vdpau"
     DEFINES="$DEFINES -DHAVE_ST_VDPAU"
 fi
 AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
@@ -1834,6 +2041,9 @@ for plat in $egl_platforms; do
 			AC_MSG_ERROR([EGL platform surfaceless requires libdrm >= $LIBDRM_REQUIRED])
 		;;
 	android)
 		;;
 	*)
 		AC_MSG_ERROR([EGL platform '$plat' does not exist])
 		;;
@@ -1854,11 +2064,11 @@ else
     EGL_NATIVE_PLATFORM="_EGL_INVALID_PLATFORM"
 fi
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
 AM_CONDITIONAL(HAVE_PLATFORM_X11, echo "$egl_platforms" | grep -q 'x11')
 AM_CONDITIONAL(HAVE_PLATFORM_WAYLAND, echo "$egl_platforms" | grep -q 'wayland')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_DRM, echo "$egl_platforms" | grep -q 'drm')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_SURFACELESS, echo "$egl_platforms" | grep -q 'surfaceless')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_NULL, echo "$egl_platforms" | grep -q 'null')
 AM_CONDITIONAL(HAVE_EGL_PLATFORM_ANDROID, echo "$egl_platforms" | grep -q 'android')
 AM_CONDITIONAL(HAVE_EGL_DRIVER_DRI2, test "x$HAVE_EGL_DRIVER_DRI2" != "x")
@@ -2076,7 +2286,12 @@ gallium_require_drm_loader() {
     fi
 }
 dnl This is for Glamor. Skip this if OpenGL is disabled.
 require_egl_drm() {
     if test "x$enable_opengl" = xno; then
         return 0
     fi
     case "$with_egl_platforms" in
         *drm*)
             ;;
@@ -2098,7 +2313,7 @@ radeon_llvm_check() {
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
     llvm_check_version_for "3" "5" "0" $1
     llvm_check_version_for "3" "6" "0" $1
     if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
         AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
     fi
@@ -2109,6 +2324,16 @@ radeon_llvm_check() {
     fi
 }
 swr_llvm_check() {
     gallium_require_llvm $1
     if test ${LLVM_VERSION_INT} -lt 306; then
         AC_MSG_ERROR([LLVM version 3.6 or later required when building $1])
     fi
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
 }
 dnl Duplicates in GALLIUM_DRIVERS_DIRS are removed by sorting it after this block
 if test -n "$with_gallium_drivers"; then
     gallium_drivers=`IFS=', '; echo $with_gallium_drivers`
@@ -2143,14 +2368,8 @@ if test -n "$with_gallium_drivers"; then
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
             gallium_require_drm "Gallium R600"
             gallium_require_drm_loader
             if test "x$enable_r600_llvm" = xyes -o "x$enable_opencl" = xyes; then
                 radeon_llvm_check "r600g"
                 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
             fi
             if test "x$enable_r600_llvm" = xyes; then
                 USE_R600_LLVM_COMPILER=yes;
             fi
             if test "x$enable_opencl" = xyes; then
                 radeon_llvm_check "r600g"
                 LLVM_COMPONENTS="${LLVM_COMPONENTS} bitreader asmparser"
             fi
             ;;
@@ -2181,6 +2400,35 @@ if test -n "$with_gallium_drivers"; then
                 HAVE_GALLIUM_LLVMPIPE=yes
             fi
             ;;
         xswr)
             swr_llvm_check "swr"
             AC_MSG_CHECKING([whether $CXX supports c++11/AVX/AVX2])
             AVX_CXXFLAGS="-march=core-avx-i"
             AVX2_CXXFLAGS="-march=core-avx2"
             AC_LANG_PUSH([C++])
             save_CXXFLAGS="$CXXFLAGS"
             CXXFLAGS="-std=c++11 $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([c++11 compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             save_CXXFLAGS="$CXXFLAGS"
             CXXFLAGS="$AVX_CXXFLAGS $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([AVX compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             save_CFLAGS="$CXXFLAGS"
             CXXFLAGS="$AVX2_CXXFLAGS $CXXFLAGS"
             AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],[],
                               [AC_MSG_ERROR([AVX2 compiler support not detected])])
             CXXFLAGS="$save_CXXFLAGS"
             AC_LANG_POP([C++])
             HAVE_GALLIUM_SWR=yes
             ;;
         xvc4)
             HAVE_GALLIUM_VC4=yes
             gallium_require_drm "vc4"
@@ -2213,6 +2461,9 @@ dnl in LLVM_LIBS.
 if test "x$MESA_LLVM" != x0; then
     if ! $LLVM_CONFIG --libs ${LLVM_COMPONENTS} >/dev/null; then
        AC_MSG_ERROR([Calling ${LLVM_CONFIG} failed])
     fi
     LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
     dnl llvm-config may not give the right answer when llvm is a built as a
@@ -2267,6 +2518,10 @@ AM_CONDITIONAL(HAVE_GALLIUM_NOUVEAU, test "x$HAVE_GALLIUM_NOUVEAU" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWR, test "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SWRAST, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_LLVMPIPE" = xyes -o \
                                          "x$HAVE_GALLIUM_SWR" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
@@ -2288,12 +2543,16 @@ AM_CONDITIONAL(HAVE_R200_DRI, test x$HAVE_R200_DRI = xyes)
 AM_CONDITIONAL(HAVE_RADEON_DRI, test x$HAVE_RADEON_DRI = xyes)
 AM_CONDITIONAL(HAVE_SWRAST_DRI, test x$HAVE_SWRAST_DRI = xyes)
 AM_CONDITIONAL(HAVE_INTEL_VULKAN, test "x$HAVE_INTEL_VULKAN" = xyes)
 AM_CONDITIONAL(HAVE_INTEL_DRIVERS, test "x$HAVE_INTEL_VULKAN" = xyes -o \
                                         "x$HAVE_I965_DRI" = xyes)
 AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
                                             "x$HAVE_GALLIUM_R600" = xyes -o \
                                             "x$HAVE_GALLIUM_RADEONSI" = xyes)
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$NEED_WINSYS_XLIB" = xyes)
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$enable_glx" = xgallium-xlib)
 AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes)
 AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
 AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
@@ -2302,7 +2561,6 @@ if test "x$USE_VC4_SIMULATOR" = xyes -a "x$HAVE_GALLIUM_ILO" = xyes; then
 fi
 AM_CONDITIONAL(HAVE_LIBDRM, test "x$have_libdrm" = xyes)
 AM_CONDITIONAL(HAVE_X11_DRIVER, test "x$enable_xlib_glx" = xyes)
 AM_CONDITIONAL(HAVE_OSMESA, test "x$enable_osmesa" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_OSMESA, test "x$enable_gallium_osmesa" = xyes)
@@ -2336,6 +2594,27 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
 AC_SUBST([XA_TINY], $XA_TINY)
 AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
 AC_ARG_ENABLE(valgrind,
               [AS_HELP_STRING([--enable-valgrind],
                              [Build mesa with valgrind support (default: auto)])],
                              [VALGRIND=$enableval], [VALGRIND=auto])
 if test "x$VALGRIND" != xno; then
 	PKG_CHECK_MODULES(VALGRIND, [valgrind], [have_valgrind=yes], [have_valgrind=no])
 fi
 AC_MSG_CHECKING([whether to enable Valgrind support])
 if test "x$VALGRIND" = xauto; then
 	VALGRIND="$have_valgrind"
 fi
 if test "x$VALGRIND" = "xyes"; then
 	if ! test "x$have_valgrind" = xyes; then
 		AC_MSG_ERROR([Valgrind support required but not present])
 	fi
 	AC_DEFINE([HAVE_VALGRIND], 1, [Use valgrind intrinsics to suppress false warnings])
 fi
 AC_MSG_RESULT([$VALGRIND])
 dnl Restore LDFLAGS and CPPFLAGS
 LDFLAGS="$_SAVE_LDFLAGS"
 CPPFLAGS="$_SAVE_CPPFLAGS"
@@ -2353,6 +2632,7 @@ CXXFLAGS="$CXXFLAGS $USER_CXXFLAGS"
 dnl Substitute the config
 AC_CONFIG_FILES([Makefile
 		src/Makefile
 		src/compiler/Makefile
 		src/egl/Makefile
 		src/egl/main/egl.pc
 		src/egl/wayland/wayland-drm/Makefile
@@ -2375,6 +2655,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/drivers/rbug/Makefile
 		src/gallium/drivers/softpipe/Makefile
 		src/gallium/drivers/svga/Makefile
 		src/gallium/drivers/swr/Makefile
 		src/gallium/drivers/trace/Makefile
 		src/gallium/drivers/vc4/Makefile
 		src/gallium/drivers/virgl/Makefile
@@ -2422,11 +2703,14 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/winsys/virgl/vtest/Makefile
 		src/gbm/Makefile
 		src/gbm/main/gbm.pc
 		src/glsl/Makefile
 		src/glx/Makefile
 		src/glx/apple/Makefile
 		src/glx/tests/Makefile
 		src/gtest/Makefile
 		src/intel/Makefile
 		src/intel/genxml/Makefile
 		src/intel/isl/Makefile
 		src/intel/vulkan/Makefile
 		src/loader/Makefile
 		src/mapi/Makefile
 		src/mapi/es1api/glesv1_cm.pc
@@ -2453,6 +2737,14 @@ AC_CONFIG_FILES([Makefile
 AC_OUTPUT
 # Fix up dependencies in *.Plo files, where we changed the extension of a
 # source file
 $SED -i -e 's/brw_blorp.cpp/brw_blorp.c/' src/mesa/drivers/dri/i965/.deps/brw_blorp.Plo
 $SED -i -e 's/gen6_blorp.cpp/gen6_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen6_blorp.Plo
 $SED -i -e 's/gen7_blorp.cpp/gen7_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen7_blorp.Plo
 $SED -i -e 's/gen8_blorp.cpp/gen8_blorp.c/' src/mesa/drivers/dri/i965/.deps/gen8_blorp.Plo
 dnl
 dnl Output some configuration info for the user
 dnl
@@ -2491,12 +2783,15 @@ if test "x$enable_dri" != xno; then
         echo "        DRI driver dir:  $DRI_DRIVER_INSTALL_DIR"
 fi
 case "x$enable_glx$enable_xlib_glx" in
 xyesyes)
 case "x$enable_glx" in
 xdri)
     echo "        GLX:             DRI-based"
     ;;
 xxlib)
     echo "        GLX:             Xlib-based"
     ;;
 xyesno)
     echo "        GLX:             DRI-based"
 xgallium-xlib)
     echo "        GLX:             Xlib-based (Gallium)"
     ;;
 *)
     echo "        GLX:             $enable_glx"
@@ -2520,6 +2815,15 @@ if test "$enable_egl" = yes; then
     echo "        EGL drivers:    $egl_drivers"
 fi
 # Vulkan
 echo ""
 if test "x$VULKAN_DRIVERS" != x; then
     echo "        Vulkan drivers:  $VULKAN_DRIVERS"
     echo "        Vulkan ICD dir:  $VULKAN_ICD_INSTALL_DIR"
 else
     echo "        Vulkan drivers:  no"
 fi
 echo ""
 if test "x$MESA_LLVM" = x1; then
     echo "        llvm:            yes"
@@ -2570,6 +2874,7 @@ if test "x$MESA_LLVM" = x1; then
     echo ""
 fi
 echo "        PYTHON2:         $PYTHON2"
 echo "        PYTHON3:         $PYTHON3"
 echo ""
 echo "        Run '${MAKE-make}' to build Mesa"
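
The reworked --enable-glx handling in the configure.ac changes above folds the old --enable-xlib-glx switch into a single multi-valued option. The following is a minimal plain-shell sketch of how the value appears to be resolved, assuming the same variable meanings as in the diff. `resolve_glx` is a hypothetical function name, and the per-implementation dependency checks (DRI enabled, Gallium drivers present, OpenGL enabled) are left out.

```shell
# Hypothetical sketch of the --enable-glx resolution; not code from configure.ac.
# Arguments: $1 = enable_glx, $2 = enable_dri, $3 = with_gallium_drivers
resolve_glx() {
	glx=$1 dri=$2 gallium=$3

	# A bare "yes" picks an implementation from the other options.
	if test "x$glx" = xyes; then
		if test "x$dri" = xyes; then
			glx=dri
		elif test -n "$gallium"; then
			glx=gallium-xlib
		else
			glx=xlib
		fi
	fi

	# Only the known implementations and "no" are accepted.
	case "x$glx" in
	xdri | xxlib | xgallium-xlib | xno)
		echo "$glx"
		;;
	*)
		echo "Illegal value for --enable-glx: $glx" >&2
		return 1
		;;
	esac
}

resolve_glx yes yes ""
resolve_glx yes no swrast
resolve_glx yes no ""
```

With these inputs the three calls print dri, gallium-xlib, and xlib respectively. Later tests in the diff then compare `$enable_glx` against a single value (for example `xdri`) instead of the old concatenated `$enable_glx$enable_xlib_glx` pairs.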

490

docs/COPYING

@@ -1,490 +0,0 @@
 Some parts of Mesa are copyrighted under the GNU LGPL.  See the
 Mesa/docs/COPYRIGHT file for details.
 The following is the standard GNU copyright file.
 ----------------------------------------------------------------------
 		  GNU LIBRARY GENERAL PUBLIC LICENSE
 		       Version 2, June 1991
  Copyright (C) 1991 Free Software Foundation, Inc.
                    675 Mass Ave, Cambridge, MA 02139, USA
  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.
 [This is the first released version of the library GPL.  It is
  numbered 2 because it goes with version 2 of the ordinary GPL.]
 			    Preamble
   The licenses for most software are designed to take away your
 freedom to share and change it.  By contrast, the GNU General Public
 Licenses are intended to guarantee your freedom to share and change
 free software--to make sure the software is free for all its users.
   This license, the Library General Public License, applies to some
 specially designated Free Software Foundation software, and to any
 other libraries whose authors decide to use it.  You can use it for
 your libraries, too.
   When we speak of free software, we are referring to freedom, not
 price.  Our General Public Licenses are designed to make sure that you
 have the freedom to distribute copies of free software (and charge for
 this service if you wish), that you receive source code or can get it
 if you want it, that you can change the software or use pieces of it
 in new free programs; and that you know you can do these things.
   To protect your rights, we need to make restrictions that forbid
 anyone to deny you these rights or to ask you to surrender the rights.
 These restrictions translate to certain responsibilities for you if
 you distribute copies of the library, or if you modify it.
   For example, if you distribute copies of the library, whether gratis
 or for a fee, you must give the recipients all the rights that we gave
 you.  You must make sure that they, too, receive or can get the source
 code.  If you link a program with the library, you must provide
 complete object files to the recipients so that they can relink them
 with the library, after making changes to the library and recompiling
 it.  And you must show them these terms so they know their rights.
   Our method of protecting your rights has two steps: (1) copyright
 the library, and (2) offer you this license which gives you legal
 permission to copy, distribute and/or modify the library.
   Also, for each distributor's protection, we want to make certain
 that everyone understands that there is no warranty for this free
 library.  If the library is modified by someone else and passed on, we
 want its recipients to know that what they have is not the original
 version, so that any problems introduced by others will not reflect on
 the original authors' reputations.
   Finally, any free program is threatened constantly by software
 patents.  We wish to avoid the danger that companies distributing free
 software will individually obtain patent licenses, thus in effect
 transforming the program into proprietary software.  To prevent this,
 we have made it clear that any patent must be licensed for everyone's
 free use or not licensed at all.
   Most GNU software, including some libraries, is covered by the ordinary
 GNU General Public License, which was designed for utility programs.  This
 license, the GNU Library General Public License, applies to certain
 designated libraries.  This license is quite different from the ordinary
 one; be sure to read it in full, and don't assume that anything in it is
 the same as in the ordinary license.
   The reason we have a separate public license for some libraries is that
 they blur the distinction we usually make between modifying or adding to a
 program and simply using it.  Linking a program with a library, without
 changing the library, is in some sense simply using the library, and is
 analogous to running a utility program or application program.  However, in
 a textual and legal sense, the linked executable is a combined work, a
 derivative of the original library, and the ordinary General Public License
 treats it as such.
   Because of this blurred distinction, using the ordinary General
 Public License for libraries did not effectively promote software
 sharing, because most developers did not use the libraries.  We
 concluded that weaker conditions might promote sharing better.
   However, unrestricted linking of non-free programs would deprive the
 users of those programs of all benefit from the free status of the
 libraries themselves.  This Library General Public License is intended to
 permit developers of non-free programs to use free libraries, while
 preserving your freedom as a user of such programs to change the free
 libraries that are incorporated in them.  (We have not seen how to achieve
 this as regards changes in header files, but we have achieved it as regards
 changes in the actual functions of the Library.)  The hope is that this
 will lead to faster development of free libraries.
   The precise terms and conditions for copying, distribution and
 modification follow.  Pay close attention to the difference between a
 "work based on the library" and a "work that uses the library".  The
 former contains code derived from the library, while the latter only
 works together with the library.
   Note that it is possible for a library to be covered by the ordinary
 General Public License rather than by this special one.
 		  GNU LIBRARY GENERAL PUBLIC LICENSE
    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
   0. This License Agreement applies to any software library which
 contains a notice placed by the copyright holder or other authorized
 party saying it may be distributed under the terms of this Library
 General Public License (also called "this License").  Each licensee is
 addressed as "you".
   A "library" means a collection of software functions and/or data
 prepared so as to be conveniently linked with application programs
 (which use some of those functions and data) to form executables.
   The "Library", below, refers to any such software library or work
 which has been distributed under these terms.  A "work based on the
 Library" means either the Library or any derivative work under
 copyright law: that is to say, a work containing the Library or a
 portion of it, either verbatim or with modifications and/or translated
 straightforwardly into another language.  (Hereinafter, translation is
 included without limitation in the term "modification".)
   "Source code" for a work means the preferred form of the work for
 making modifications to it.  For a library, complete source code means
 all the source code for all modules it contains, plus any associated
 interface definition files, plus the scripts used to control compilation
 and installation of the library.
   Activities other than copying, distribution and modification are not
 covered by this License; they are outside its scope.  The act of
 running a program using the Library is not restricted, and output from
 such a program is covered only if its contents constitute a work based
 on the Library (independent of the use of the Library in a tool for
 writing it).  Whether that is true depends on what the Library does
 and what the program that uses the Library does.
   1. You may copy and distribute verbatim copies of the Library's
 complete source code as you receive it, in any medium, provided that
 you conspicuously and appropriately publish on each copy an
 appropriate copyright notice and disclaimer of warranty; keep intact
 all the notices that refer to this License and to the absence of any
 warranty; and distribute a copy of this License along with the
 Library.
   You may charge a fee for the physical act of transferring a copy,
 and you may at your option offer warranty protection in exchange for a
 fee.
   2. You may modify your copy or copies of the Library or any portion
 of it, thus forming a work based on the Library, and copy and
 distribute such modifications or work under the terms of Section 1
 above, provided that you also meet all of these conditions:
     a) The modified work must itself be a software library.
     b) You must cause the files modified to carry prominent notices
     stating that you changed the files and the date of any change.
     c) You must cause the whole of the work to be licensed at no
     charge to all third parties under the terms of this License.
     d) If a facility in the modified Library refers to a function or a
     table of data to be supplied by an application program that uses
     the facility, other than as an argument passed when the facility
     is invoked, then you must make a good faith effort to ensure that,
     in the event an application does not supply such function or
     table, the facility still operates, and performs whatever part of
     its purpose remains meaningful.
     (For example, a function in a library to compute square roots has
     a purpose that is entirely well-defined independent of the
     application.  Therefore, Subsection 2d requires that any
     application-supplied function or table used by this function must
     be optional: if the application does not supply it, the square
     root function must still compute square roots.)
 These requirements apply to the modified work as a whole.  If
 identifiable sections of that work are not derived from the Library,
 and can be reasonably considered independent and separate works in
 themselves, then this License, and its terms, do not apply to those
 sections when you distribute them as separate works.  But when you
 distribute the same sections as part of a whole which is a work based
 on the Library, the distribution of the whole must be on the terms of
 this License, whose permissions for other licensees extend to the
 entire whole, and thus to each and every part regardless of who wrote
 it.
 Thus, it is not the intent of this section to claim rights or contest
 your rights to work written entirely by you; rather, the intent is to
 exercise the right to control the distribution of derivative or
 collective works based on the Library.
 In addition, mere aggregation of another work not based on the Library
 with the Library (or with a work based on the Library) on a volume of
 a storage or distribution medium does not bring the other work under
 the scope of this License.
   3. You may opt to apply the terms of the ordinary GNU General Public
 License instead of this License to a given copy of the Library.  To do
 this, you must alter all the notices that refer to this License, so
 that they refer to the ordinary GNU General Public License, version 2,
 instead of to this License.  (If a newer version than version 2 of the
 ordinary GNU General Public License has appeared, then you can specify
 that version instead if you wish.)  Do not make any other change in
 these notices.
   Once this change is made in a given copy, it is irreversible for
 that copy, so the ordinary GNU General Public License applies to all
 subsequent copies and derivative works made from that copy.
   This option is useful when you wish to copy part of the code of
 the Library into a program that is not a library.
   4. You may copy and distribute the Library (or a portion or
 derivative of it, under Section 2) in object code or executable form
 under the terms of Sections 1 and 2 above provided that you accompany
 it with the complete corresponding machine-readable source code, which
 must be distributed under the terms of Sections 1 and 2 above on a
 medium customarily used for software interchange.
   If distribution of object code is made by offering access to copy
 from a designated place, then offering equivalent access to copy the
 source code from the same place satisfies the requirement to
 distribute the source code, even though third parties are not
 compelled to copy the source along with the object code.
   5. A program that contains no derivative of any portion of the
 Library, but is designed to work with the Library by being compiled or
 linked with it, is called a "work that uses the Library".  Such a
 work, in isolation, is not a derivative work of the Library, and
 therefore falls outside the scope of this License.
   However, linking a "work that uses the Library" with the Library
 creates an executable that is a derivative of the Library (because it
 contains portions of the Library), rather than a "work that uses the
 library".  The executable is therefore covered by this License.
 Section 6 states terms for distribution of such executables.
   When a "work that uses the Library" uses material from a header file
 that is part of the Library, the object code for the work may be a
 derivative work of the Library even though the source code is not.
 Whether this is true is especially significant if the work can be
 linked without the Library, or if the work is itself a library.  The
 threshold for this to be true is not precisely defined by law.
   If such an object file uses only numerical parameters, data
 structure layouts and accessors, and small macros and small inline
 functions (ten lines or less in length), then the use of the object
 file is unrestricted, regardless of whether it is legally a derivative
 work.  (Executables containing this object code plus portions of the
 Library will still fall under Section 6.)
   Otherwise, if the work is a derivative of the Library, you may
 distribute the object code for the work under the terms of Section 6.
 Any executables containing that work also fall under Section 6,
 whether or not they are linked directly with the Library itself.
   6. As an exception to the Sections above, you may also compile or
 link a "work that uses the Library" with the Library to produce a
 work containing portions of the Library, and distribute that work
 under terms of your choice, provided that the terms permit
 modification of the work for the customer's own use and reverse
 engineering for debugging such modifications.
   You must give prominent notice with each copy of the work that the
 Library is used in it and that the Library and its use are covered by
 this License.  You must supply a copy of this License.  If the work
 during execution displays copyright notices, you must include the
 copyright notice for the Library among them, as well as a reference
 directing the user to the copy of this License.  Also, you must do one
 of these things:
     a) Accompany the work with the complete corresponding
     machine-readable source code for the Library including whatever
     changes were used in the work (which must be distributed under
     Sections 1 and 2 above); and, if the work is an executable linked
     with the Library, with the complete machine-readable "work that
     uses the Library", as object code and/or source code, so that the
     user can modify the Library and then relink to produce a modified
     executable containing the modified Library.  (It is understood
     that the user who changes the contents of definitions files in the
     Library will not necessarily be able to recompile the application
     to use the modified definitions.)
     b) Accompany the work with a written offer, valid for at
     least three years, to give the same user the materials
     specified in Subsection 6a, above, for a charge no more
     than the cost of performing this distribution.
     c) If distribution of the work is made by offering access to copy
     from a designated place, offer equivalent access to copy the above
     specified materials from the same place.
     d) Verify that the user has already received a copy of these
     materials or that you have already sent this user a copy.
   For an executable, the required form of the "work that uses the
 Library" must include any data and utility programs needed for
 reproducing the executable from it.  However, as a special exception,
 the source code distributed need not include anything that is normally
 distributed (in either source or binary form) with the major
 components (compiler, kernel, and so on) of the operating system on
 which the executable runs, unless that component itself accompanies
 the executable.
   It may happen that this requirement contradicts the license
 restrictions of other proprietary libraries that do not normally
 accompany the operating system.  Such a contradiction means you cannot
 use both them and the Library together in an executable that you
 distribute.
   7. You may place library facilities that are a work based on the
 Library side-by-side in a single library together with other library
 facilities not covered by this License, and distribute such a combined
 library, provided that the separate distribution of the work based on
 the Library and of the other library facilities is otherwise
 permitted, and provided that you do these two things:
     a) Accompany the combined library with a copy of the same work
     based on the Library, uncombined with any other library
     facilities.  This must be distributed under the terms of the
     Sections above.
     b) Give prominent notice with the combined library of the fact
     that part of it is a work based on the Library, and explaining
     where to find the accompanying uncombined form of the same work.
   8. You may not copy, modify, sublicense, link with, or distribute
 the Library except as expressly provided under this License.  Any
 attempt otherwise to copy, modify, sublicense, link with, or
 distribute the Library is void, and will automatically terminate your
 rights under this License.  However, parties who have received copies,
 or rights, from you under this License will not have their licenses
 terminated so long as such parties remain in full compliance.
   9. You are not required to accept this License, since you have not
 signed it.  However, nothing else grants you permission to modify or
 distribute the Library or its derivative works.  These actions are
 prohibited by law if you do not accept this License.  Therefore, by
 modifying or distributing the Library (or any work based on the
 Library), you indicate your acceptance of this License to do so, and
 all its terms and conditions for copying, distributing or modifying
 the Library or works based on it.
   10. Each time you redistribute the Library (or any work based on the
 Library), the recipient automatically receives a license from the
 original licensor to copy, distribute, link with or modify the Library
 subject to these terms and conditions.  You may not impose any further
 restrictions on the recipients' exercise of the rights granted herein.
 You are not responsible for enforcing compliance by third parties to
 this License.
   11. If, as a consequence of a court judgment or allegation of patent
 infringement or for any other reason (not limited to patent issues),
 conditions are imposed on you (whether by court order, agreement or
 otherwise) that contradict the conditions of this License, they do not
 excuse you from the conditions of this License.  If you cannot
 distribute so as to satisfy simultaneously your obligations under this
 License and any other pertinent obligations, then as a consequence you
 may not distribute the Library at all.  For example, if a patent
 license would not permit royalty-free redistribution of the Library by
 all those who receive copies directly or indirectly through you, then
 the only way you could satisfy both it and this License would be to
 refrain entirely from distribution of the Library.
 If any portion of this section is held invalid or unenforceable under any
 particular circumstance, the balance of the section is intended to apply,
 and the section as a whole is intended to apply in other circumstances.
 It is not the purpose of this section to induce you to infringe any
 patents or other property right claims or to contest validity of any
 such claims; this section has the sole purpose of protecting the
 integrity of the free software distribution system which is
 implemented by public license practices.  Many people have made
 generous contributions to the wide range of software distributed
 through that system in reliance on consistent application of that
 system; it is up to the author/donor to decide if he or she is willing
 to distribute software through any other system and a licensee cannot
 impose that choice.
 This section is intended to make thoroughly clear what is believed to
 be a consequence of the rest of this License.
   12. If the distribution and/or use of the Library is restricted in
 certain countries either by patents or by copyrighted interfaces, the
 original copyright holder who places the Library under this License may add
 an explicit geographical distribution limitation excluding those countries,
 so that distribution is permitted only in or among countries not thus
 excluded.  In such case, this License incorporates the limitation as if
 written in the body of this License.
   13. The Free Software Foundation may publish revised and/or new
 versions of the Library General Public License from time to time.
 Such new versions will be similar in spirit to the present version,
 but may differ in detail to address new problems or concerns.
 Each version is given a distinguishing version number.  If the Library
 specifies a version number of this License which applies to it and
 "any later version", you have the option of following the terms and
 conditions either of that version or of any later version published by
 the Free Software Foundation.  If the Library does not specify a
 license version number, you may choose any version ever published by
 the Free Software Foundation.
   14. If you wish to incorporate parts of the Library into other free
 programs whose distribution conditions are incompatible with these,
 write to the author to ask for permission.  For software which is
 copyrighted by the Free Software Foundation, write to the Free
 Software Foundation; we sometimes make exceptions for this.  Our
 decision will be guided by the two goals of preserving the free status
 of all derivatives of our free software and of promoting the sharing
 and reuse of software generally.
 			    NO WARRANTY
   15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
 WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
 EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
 OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
 KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
 LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
 THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
   16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
 AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
 LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
 RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
 FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
 SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
 DAMAGES.
 		     END OF TERMS AND CONDITIONS
      Appendix: How to Apply These Terms to Your New Libraries
   If you develop a new library, and you want it to be of the greatest
 possible use to the public, we recommend making it free software that
 everyone can redistribute and change.  You can do so by permitting
 redistribution under these terms (or, alternatively, under the terms of the
 ordinary General Public License).
   To apply these terms, attach the following notices to the library.  It is
 safest to attach them to the start of each source file to most effectively
 convey the exclusion of warranty; and each file should have at least the
 "copyright" line and a pointer to where the full notice is found.
     <one line to give the library's name and a brief idea of what it does.>
     Copyright (C) <year>  <name of author>
     This library is free software; you can redistribute it and/or
     modify it under the terms of the GNU Library General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later version.
     This library is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
     Library General Public License for more details.
     You should have received a copy of the GNU Library General Public
     License along with this library; if not, write to the Free
     Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 Also add information on how to contact you by electronic and paper mail.
 You should also get your employer (if you work as a programmer) or your
 school, if any, to sign a "copyright disclaimer" for the library, if
 necessary.  Here is a sample; alter the names:
   Yoyodyne, Inc., hereby disclaims all copyright interest in the
   library `Frob' (a library for tweaking knobs) written by James Random Hacker.
   <signature of Ty Coon>, 1 April 1990
   Ty Coon, President of Vice
 That's all there is to it!

docs/GL3.txt

@@ -1,13 +1,28 @@
 # Status of OpenGL extensions in Mesa
 Status of OpenGL 3.x features in Mesa
 Here's how to read this file:
 all DONE: <driver>, ...
     All the extensions are done for the given list of drivers.
 Note: when an item is marked as "DONE" it means all the core Mesa
 infrastructure is complete but it may be the case that few (if any) drivers
 implement the features.
 DONE
     The extension is done for Mesa and no implementation is necessary on the
     driver-side.
 DONE ()
     The extension is done for Mesa and all the drivers in the "all DONE" list.
 OpenGL Core and Compatibility context support
 DONE (<driver>, ...)
     The extension is done for Mesa, all the drivers in the "all DONE" list, and
     all the drivers in the brackets.
 in progress
     The extension is started but not finished yet.
 not started
     The extension isn't started yet.
 # OpenGL Core and Compatibility context support
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
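
Note on the legend added in the hunk above: a status cell has to be read together with the "all DONE" list on its section header. The following is a minimal Python sketch of that reading, not anything shipped with Mesa. The function name `parse_status` and its return shape are hypothetical, and it only handles the forms spelled out in the legend; free-text notes such as "all drivers" or "llvmpipe (*)" would need extra handling.

```python
import re


def parse_status(status, all_done):
    """Resolve one GL3.txt status cell into (state, drivers).

    `all_done` is the driver list from the section's "all DONE:" header.
    `drivers` is None when no driver-side work is needed at all.
    """
    status = status.strip()
    if status in ("in progress", "not started"):
        return status, set()

    match = re.fullmatch(r"DONE(?:\s*\((.*)\))?", status)
    if match is None:
        raise ValueError(f"unrecognised status: {status!r}")

    if match.group(1) is None:
        # Bare "DONE": finished in core Mesa, nothing to do per driver.
        return "DONE", None

    # "DONE ()" or "DONE (a, b)": the "all DONE" drivers plus any listed ones.
    extra = {name.strip() for name in match.group(1).split(",") if name.strip()}
    return "DONE", set(all_done) | extra
```

For example, under a header listing i965 and nvc0, the cell "DONE (swr)" resolves to the three drivers i965, nvc0, and swr, while "DONE ()" resolves to just the two from the header.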
@@ -15,249 +30,248 @@ version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 Feature                                               Status
 ----------------------------------------------------- ------------------------
 Feature                                                 Status
 ------------------------------------------------------- ------------------------
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   glBindFragDataLocation, glGetFragDataLocation         DONE
   Conditional rendering (GL_NV_conditional_render)      DONE ()
   Map buffer subranges (GL_ARB_map_buffer_range)        DONE ()
   Clamping controls (GL_ARB_color_buffer_float)         DONE ()
   Float textures, renderbuffers (GL_ARB_texture_float)  DONE ()
   GL_NV_conditional_render (Conditional rendering)      DONE ()
   GL_ARB_map_buffer_range (Map buffer subranges)        DONE ()
   GL_ARB_color_buffer_float (Clamping controls)         DONE ()
   GL_ARB_texture_float (Float textures, renderbuffers)  DONE ()
   GL_EXT_packed_float                                   DONE ()
   GL_EXT_texture_shared_exponent                        DONE ()
   Float depth buffers (GL_ARB_depth_buffer_float)       DONE ()
   Framebuffer objects (GL_ARB_framebuffer_object)       DONE ()
   GL_ARB_depth_buffer_float (Float depth buffers)       DONE ()
   GL_ARB_framebuffer_object (Framebuffer objects)       DONE ()
   GL_ARB_half_float_pixel                               DONE (all drivers)
   GL_ARB_half_float_vertex                              DONE ()
   GL_EXT_texture_integer                                DONE ()
   GL_EXT_texture_array                                  DONE ()
   Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE ()
   GL_EXT_draw_buffers2 (Per-buffer blend and masks)     DONE ()
   GL_EXT_texture_compression_rgtc                       DONE ()
   GL_ARB_texture_rg                                     DONE ()
   Transform feedback (GL_EXT_transform_feedback)        DONE ()
   Vertex array objects (GL_ARB_vertex_array_object)     DONE ()
   sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE ()
   GL_EXT_transform_feedback (Transform feedback)        DONE ()
   GL_ARB_vertex_array_object (Vertex array objects)     DONE ()
   GL_EXT_framebuffer_sRGB (sRGB framebuffer format)     DONE ()
   glClearBuffer commands                                DONE
   glGetStringi command                                  DONE
   glTexParameterI, glGetTexParameterI commands          DONE
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*))
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*), swr (*))
 (*) llvmpipe and softpipe have fake Multisample anti-aliasing support
 (*) llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
   Forward compatible context support/deprecations       DONE ()
   Instanced drawing (GL_ARB_draw_instanced)             DONE ()
   Buffer copying (GL_ARB_copy_buffer)                   DONE ()
   Primitive restart (GL_NV_primitive_restart)           DONE ()
   GL_ARB_draw_instanced (Instanced drawing)             DONE ()
   GL_ARB_copy_buffer (Buffer copying)                   DONE ()
   GL_NV_primitive_restart (Primitive restart)           DONE ()
   16 vertex texture image units                         DONE ()
   Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts ()
   Rectangular textures (GL_ARB_texture_rectangle)       DONE ()
   Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE ()
   Signed normalized textures (GL_EXT_texture_snorm)     DONE ()
   GL_ARB_texture_buffer_object (Texture buffer objs)    DONE (for OpenGL 3.1 contexts)
   GL_ARB_texture_rectangle (Rectangular textures)       DONE ()
   GL_ARB_uniform_buffer_object (Uniform buffer objs)    DONE ()
   GL_EXT_texture_snorm (Signed normalized textures)     DONE ()
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
   BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE ()
   Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE ()
   Frag shader coord (GL_ARB_fragment_coord_conventions) DONE ()
   Provoking vertex (GL_ARB_provoking_vertex)            DONE ()
   Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE ()
   Multisample textures (GL_ARB_texture_multisample)     DONE ()
   Frag depth clamp (GL_ARB_depth_clamp)                 DONE ()
   Fence objects (GL_ARB_sync)                           DONE ()
   GL_ARB_vertex_array_bgra (BGRA vertex order)          DONE (swr)
   GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (swr)
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (swr)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (swr)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (swr)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (swr)
   GL_ARB_sync (Fence objects)                           DONE (swr)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
   GL_ARB_blend_func_extended                            DONE ()
   GL_ARB_blend_func_extended                            DONE (swr)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
   GL_ARB_occlusion_query2                               DONE ()
   GL_ARB_occlusion_query2                               DONE (swr)
   GL_ARB_sampler_objects                                DONE (all drivers)
   GL_ARB_shader_bit_encoding                            DONE ()
   GL_ARB_texture_rgb10_a2ui                             DONE ()
   GL_ARB_texture_swizzle                                DONE ()
   GL_ARB_timer_query                                    DONE ()
   GL_ARB_instanced_arrays                               DONE ()
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE ()
   GL_ARB_shader_bit_encoding                            DONE (swr)
   GL_ARB_texture_rgb10_a2ui                             DONE (swr)
   GL_ARB_texture_swizzle                                DONE (swr)
   GL_ARB_timer_query                                    DONE (swr)
   GL_ARB_instanced_arrays                               DONE (swr)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (swr)
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965, r600, llvmpipe, softpipe)
   GL_ARB_gpu_shader5                                   DONE (i965, r600)
   - 'precise' qualifier                                DONE
   - Dynamically uniform sampler array indices          DONE (softpipe)
   - Dynamically uniform UBO array indices              DONE ()
   - Implicit signed -> unsigned conversions            DONE
   - Fused multiply-add                                 DONE ()
   - Packing/bitfield/conversion functions              DONE (softpipe)
   - Enhanced textureGather                             DONE (softpipe)
   - Geometry shader instancing                         DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                   DONE ()
   - Enhanced per-sample shading                        DONE ()
   - Interpolation functions                            DONE ()
   - New overload resolution rules                      DONE
   GL_ARB_gpu_shader_fp64                               DONE (r600, llvmpipe, softpipe)
   GL_ARB_sample_shading                                DONE (i965, nv50, r600)
   GL_ARB_shader_subroutine                             DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_tessellation_shader                           DONE ()
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, r600, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_query_lod                             DONE (i965, nv50, r600, softpipe)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_draw_buffers_blend                             DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
   - Implicit signed -> unsigned conversions             DONE
   - Fused multiply-add                                  DONE ()
   - Packing/bitfield/conversion functions               DONE (softpipe)
   - Enhanced textureGather                              DONE (softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE ()
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE ()
   - New overload resolution rules                       DONE
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965, nv50)
   GL_ARB_shader_subroutine                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                            DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965, nv50, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_get_program_binary                            DONE (0 binary formats)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_shader_precision                              DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                           DONE (r600, llvmpipe, softpipe)
   GL_ARB_viewport_array                                DONE (i965, nv50, r600, llvmpipe)
   GL_ARB_ES2_compatibility                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen8+, llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20:
 GL 4.2, GLSL 4.20 -- all DONE: radeonsi
   GL_ARB_texture_compression_bptc                      DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage              DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_texture_storage                               DONE (all drivers)
   GL_ARB_transform_feedback_instanced                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_shader_image_load_store                       DONE (i965)
   GL_ARB_conservative_depth                            DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                      DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_internalformat_query                          DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_map_buffer_alignment                          DONE (all drivers)
   GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30:
   GL_ARB_arrays_of_arrays                              DONE (i965)
   GL_ARB_ES3_compatibility                             DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                           DONE (all drivers)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_copy_image                                    DONE (i965, nv50, nvc0, radeonsi)
   GL_KHR_debug                                         DONE (all drivers)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_internalformat_query2                         not started
   GL_ARB_invalidate_subdata                            DONE (all drivers)
   GL_ARB_multi_draw_indirect                           DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                 not started
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  DONE (i965)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                          DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE                          DONE (all drivers)
   GL_ARB_buffer_storage                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                 DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                              in progress (Timothy)
   - compile-time constant expressions                  DONE
   - explicit byte offsets for blocks                   in progress
   - forced alignment within blocks                     in progress
   - specified vec4-slot component numbers              in progress
   - specified transform/feedback layout                in progress
   - input/output block locations                       in progress
   GL_ARB_multi_bind                                    DONE (all drivers)
   GL_ARB_query_buffer_object                           not started
   GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_stencil8                              DONE (nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                               in progress (Timothy)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
   - specified vec4-slot component numbers               in progress
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
 GL 4.5, GLSL 4.50:
   GL_ARB_ES3_1_compatibility                           not started
   GL_ARB_clip_control                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_conditional_render_inverted                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_cull_distance                                 in progress (Tobias)
   GL_ARB_derivative_control                            DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_access                           DONE (all drivers)
   GL_ARB_get_texture_sub_image                         DONE (all drivers)
   GL_ARB_shader_texture_image_samples                  DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                               DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                         DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robust_buffer_access_behavior                 not started
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
   GL_ARB_ES3_1_compatibility                            DONE (nvc0, radeonsi)
   GL_ARB_clip_control                                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_derivative_control                             DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
   GL_ARB_arrays_of_arrays                              DONE (i965)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_draw_indirect                                 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_shader_image_load_store                       DONE (i965)
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  DONE (i965)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   Multisample textures (GL_ARB_texture_multisample)    DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GS5 Enhanced textureGather                           DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions            DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_image_size                              DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, nvc0, radeonsi, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions             DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
       glMemoryBarrierByRegion                          DONE
       glGetTexLevelParameter[fi]v - needs updates      DONE
       glMemoryBarrierByRegion                           DONE
       glGetTexLevelParameter[fi]v - needs updates       DONE
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support
       gl_HelperInvocation support                       DONE (i965, nvc0, r600, radeonsi)
 GLES3.2, GLSL ES 3.2
   GL_EXT_color_buffer_float                            DONE (all drivers)
   GL_KHR_blend_equation_advanced                       not started
   GL_KHR_debug                                         DONE (all drivers)
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_KHR_texture_compression_astc_ldr                  DONE (i965/gen9+)
   GL_OES_copy_image                                    not started (based on GL_ARB_copy_image, which is done for some drivers)
   GL_OES_draw_buffers_indexed                          not started
   GL_OES_draw_elements_base_vertex                     DONE (all drivers)
   GL_OES_geometry_shader                               not started (based on GL_ARB_geometry_shader4, which is done for all drivers)
   GL_OES_gpu_shader5                                   not started (based on parts of GL_ARB_gpu_shader5, which is done for some drivers)
   GL_OES_primitive_bounding box                        not started
   GL_OES_sample_shading                                not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_sample_variables                              not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_shader_image_atomic                           not started (based on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
   GL_OES_shader_io_blocks                              not started (based on parts of GLSL 1.50, which is done)
   GL_OES_shader_multisample_interpolation              not started (based on parts of GL_ARB_gpu_shader5, which is done)
   GL_OES_tessellation_shader                           not started (based on GL_ARB_tessellation_shader, which is done for some drivers)
   GL_OES_texture_border_clamp                          not started (based on GL_ARB_texture_border_clamp, which is done)
   GL_OES_texture_buffer                                not started (based on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and GL_ARB_texture_buffer_object_rgb32 that are all done)
   GL_OES_texture_cube_map_array                        not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8                              not started (based on GL_ARB_texture_stencil8, which is done for some drivers)
   GL_OES_texture_storage_multisample_2d_array          DONE (all drivers that support GL_ARB_texture_multisample)
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        not started
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
   GL_OES_copy_image                                     DONE (i965)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                started (idr)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         not started
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (i965/gen8+, nvc0, radeonsi)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_tessellation_shader                            started (Ken)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality

									
										4

docs/contents.html
									
												
				@@ -90,14 +90,14 @@

				<li><a href="http://www.opengl.org" target="_parent">OpenGL website</a>

				<li><a href="http://dri.freedesktop.org" target="_parent">DRI website</a>

				<li><a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>

				<li><a href="http://planet.freedesktop.org" target="_parent">Developer blogs</a>

				</ul>

				<b>Hosted by:</b>

				<br>

				<blockquote>

				<a href="http://sourceforge.net"

				target="_parent"><img src="http://sourceforge.net/sflogo.php?group_id=3&amp;type=1"

				width="88" height="31" align="bottom" alt="Sourceforge.net" border="0"></a>

				target="_parent">sourceforge.net</a>

				</blockquote>

				</body>

									
										4

docs/download.html
									
												
				@@ -18,7 +18,9 @@

				<p>

				Primary Mesa download site:

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">freedesktop.org</a> (FTP)

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)

				or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>

				(HTTP).

				</p>

				<p>

									
										8

docs/egl.html
									
												
				@@ -89,9 +89,11 @@ types such as <code>EGLNativeDisplayType</code> or

				<p>The available platforms are <code>x11</code>, <code>drm</code>,

				<code>wayland</code>, <code>surfaceless</code>, <code>android</code>,

				and <code>haiku</code>.  The <code>android</code> platform

				can only be built as a system component, part of AOSP, while the

				<code>haiku</code> platform can only be built with SCons.

				and <code>haiku</code>.

				The <code>android</code> platform can either be built as a system

				component, part of AOSP, using <code>Android.mk</code> files, or

				cross-compiled using appropriate <code>configure</code> options.

				The <code>haiku</code> platform can only be built with SCons.

				Unless for special needs, the build system should

				select the right platforms automatically.</p>

									
										33

docs/envvars.html
									
												
				@@ -91,11 +91,20 @@ This is only valid for versions &gt;= 3.0.

				<li> Mesa may not really implement all the features of the given version.

				(for developers only)

				</ul>

				<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_VERSION) for OpenGL ES.

				<ul>

				<li> The format should be MAJOR.MINOR

				<li> Examples: 2.0, 3.0, 3.1

				<li> Mesa may not really implement all the features of the given version.

				(for developers only)

				</ul>

				<li>MESA_GLSL_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as

				"130".  Mesa will not really implement all the features of the given language version

				if it's higher than what's normally reported. (for developers only)

				<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>

				<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.

				</ul>

				@@ -154,6 +163,9 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				   <li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>

				   <li>vec4 - force vec4 mode in vertex shader</li>

				   <li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>

				   <li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>

				</ul>

				</ul>

				@@ -223,7 +235,7 @@ See src/mesa/state_tracker/st_debug.c for other options.

				<li>LP_PERF - a comma-separated list of options to selectively no-op various

				    parts of the driver.  See the source code for details.

				<li>LP_NUM_THREADS - an integer indicating how many threads to use for rendering.

				    Zero turns of threading completely.  The default value is the number of CPU

				    Zero turns off threading completely.  The default value is the number of CPU

				    cores present.

				</ul>

				@@ -244,6 +256,25 @@ for details.

				</ul>

				<h3>VC4 driver environment variables</h3>

				<ul>

				<li>VC4_DEBUG - a comma-separated list of named flags, which do various things:

				<ul>

				   <li>cl - dump command list during creation</li>

				   <li>qpu - dump generated QPU instructions</li>

				   <li>qir - dump QPU IR during program compile</li>

				   <li>nir - dump NIR during program compile</li>

				   <li>tgsi - dump TGSI during program compile</li>

				   <li>shaderdb - dump program compile information for shader-db analysis</li>

				   <li>perf - print during performance-related events</li>

				   <li>norast - skip actual hardware execution of commands</li>

				   <li>always_flush - flush after each draw call</li>

				   <li>always_sync - wait for finish after each flush</li>

				   <li>dump - write a GPU command stream trace file (VC4 simulator only)</li>

				</ul>

				</ul>

				<p>

				Other Gallium drivers have their own environment variables.  These may change

				frequently so the source code should be consulted for details.
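The environment variables documented in the hunks above can be assembled programmatically before launching a GL application. The following is a minimal sketch in Python, not part of the Mesa tree: the helper name `build_mesa_env` and its parameters are hypothetical, and the validation rules are taken only from the text above (MAJOR.MINOR for MESA_GLES_VERSION_OVERRIDE, an integer for LP_NUM_THREADS, a comma-separated list of the named flags for VC4_DEBUG).

```python
import os
import re

# Flag names copied from the VC4_DEBUG list in the diff above.
VC4_DEBUG_FLAGS = {
    "cl", "qpu", "qir", "nir", "tgsi", "shaderdb",
    "perf", "norast", "always_flush", "always_sync", "dump",
}


def build_mesa_env(gles_version=None, lp_num_threads=None, vc4_debug=(), base=None):
    """Hypothetical helper: return a copy of `base` (default: os.environ)
    with the requested Mesa variables added."""
    env = dict(os.environ if base is None else base)

    if gles_version is not None:
        # The documentation gives the format as MAJOR.MINOR (2.0, 3.0, 3.1).
        if not re.fullmatch(r"\d+\.\d+", gles_version):
            raise ValueError("expected MAJOR.MINOR, got %r" % (gles_version,))
        env["MESA_GLES_VERSION_OVERRIDE"] = gles_version

    if lp_num_threads is not None:
        # An integer; per the documentation, zero turns off threading.
        env["LP_NUM_THREADS"] = str(int(lp_num_threads))

    flags = list(vc4_debug)
    unknown = [f for f in flags if f not in VC4_DEBUG_FLAGS]
    if unknown:
        raise ValueError("unknown VC4_DEBUG flags: %s" % ", ".join(unknown))
    if flags:
        # VC4_DEBUG is a comma-separated list of named flags.
        env["VC4_DEBUG"] = ",".join(flags)

    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the program under test. Rejecting unknown flag names is a choice made for this sketch; whether the driver itself ignores or reports unknown names is not stated in the documentation above.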

									
										73

docs/index.html
									
												
				@@ -16,6 +16,79 @@

				<h1>News</h1>

				<h2>May 9, 2016</h2>

				<p>

				<a href="relnotes/11.1.4.html">Mesa 11.1.4</a> and

				<a href="relnotes/11.2.2.html">Mesa 11.2.2</a> are released.

				These are bug-fix releases from the 11.1 and 11.2 branches, respectively.

				<br>

				NOTE: It is anticipated that 11.1.4 will be the final release in the 11.1

				series. Users of 11.1 are encouraged to migrate to the 11.2 series in order

				to obtain future fixes.

				</p>

				<h2>April 17, 2016</h2>

				<p>

				<a href="relnotes/11.1.3.html">Mesa 11.1.3</a> and

				<a href="relnotes/11.2.1.html">Mesa 11.2.1</a> are released.

				These are bug-fix releases from the 11.1 and 11.2 branches, respectively.

				</p>

				<h2>April 4, 2016</h2>

				<p>

				<a href="relnotes/11.2.0.html">Mesa 11.2.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>February 10, 2016</h2>

				<p>

				<a href="relnotes/11.1.2.html">Mesa 11.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 22, 2016</h2>

				<p>

				<a href="relnotes/11.0.9.html">Mesa 11.0.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 11.0.9 will be the final release in the 11.0

				series. Users of 11.0 are encouraged to migrate to the 11.1 series in order

				to obtain future fixes.

				</p>

				<h2>January 13, 2016</h2>

				<p>

				<a href="relnotes/11.1.1.html">Mesa 11.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 21, 2015</h2>

				<p>

				<a href="relnotes/11.0.8.html">Mesa 11.0.8</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 15, 2015</h2>

				<p>

				<a href="relnotes/11.1.0.html">Mesa 11.1.0</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<h2>December 9, 2015</h2>

				<p>

				<a href="relnotes/11.0.7.html">Mesa 11.0.7</a> is released.

				This is a bug-fix release.

				</p>

				<p>

				Mesa demos 8.3.0 is also released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/">ftp.freedesktop.org/pub/mesa/demos/8.3.0/</a>.

				</p>

				<h2>November 21, 2015</h2>

				<p>

				<a href="relnotes/11.0.6.html">Mesa 11.0.6</a> is released.

									
										8

docs/install.html
									
				@@ -39,7 +39,7 @@ Version 2.6.4 or later should work.

				</li>

				<br>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.7.3 or later should work.

				Python Mako module is required. Version 0.3.4 or later should work.

				</li>

				</br>

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				@@ -58,6 +58,9 @@ On Windows with MinGW, install flex and bison with:

				For MSVC on Windows, install

				<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.

				</li>

				<br>

				<li>For building on Windows, Microsoft Visual Studio 2013 or later is required.

				</li>

				</ul>

				@@ -70,8 +73,7 @@ The following are required for DRI-based hardware acceleration with Mesa:

				<ul>

				<li><a href="http://xorg.freedesktop.org/releases/individual/proto/">

				dri2proto</a> version 2.6 or later

				<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a>

				version 2.4.33 or later

				<li><a href="http://dri.freedesktop.org/libdrm/">libDRM</a> latest version

				<li>Xorg server version 1.5 or later

				<li>Linux 2.6.28 or later

				</ul>

									
										14

docs/license.html
									
				@@ -46,10 +46,10 @@ library</em>. <br>

				<p>

				The Mesa distribution consists of several components.  Different copyrights

				and licenses apply to different components.  For example, some demo programs

				are copyrighted by SGI, some of the Mesa device drivers are copyrighted by

				their authors.  See below for a list of Mesa's main components and the license

				for each.

				and licenses apply to different components.

				For example, the GLX client code uses the SGI Free Software License B, and

				some of the Mesa device drivers are copyrighted by their authors.

				See below for a list of Mesa's main components and the license for each.

				</p>

				<p>

				The core Mesa library is licensed according to the terms of the MIT license.

				@@ -97,13 +97,17 @@ and their respective licenses.

				<pre>

				Component         Location               License

				------------------------------------------------------------------

				Main Mesa code    src/mesa/              Mesa (MIT)

				Main Mesa code    src/mesa/              MIT

				Device drivers    src/mesa/drivers/*     MIT, generally

				Gallium code      src/gallium/           MIT

				Ext headers       include/GL/glext.h     Khronos

				                  include/GL/glxext.h

				GLX client code   src/glx/               SGI Free Software License B

				C11 thread        include/c11/threads*.h Boost (permissive)

				emulation

				</pre>

									
										11

docs/relnotes.html
									
				@@ -21,6 +21,17 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/11.2.2.html">11.2.2 release notes</a>

				<li><a href="relnotes/11.1.4.html">11.1.4 release notes</a>

				<li><a href="relnotes/11.2.1.html">11.2.1 release notes</a>

				<li><a href="relnotes/11.1.3.html">11.1.3 release notes</a>

				<li><a href="relnotes/11.2.0.html">11.2.0 release notes</a>

				<li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>

				<li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>

				<li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>

				<li><a href="relnotes/11.0.8.html">11.0.8 release notes</a>

				<li><a href="relnotes/11.1.0.html">11.1.0 release notes</a>

				<li><a href="relnotes/11.0.7.html">11.0.7 release notes</a>

				<li><a href="relnotes/11.0.6.html">11.0.6 release notes</a>

				<li><a href="relnotes/11.0.5.html">11.0.5 release notes</a>

				<li><a href="relnotes/11.0.4.html">11.0.4 release notes</a>

									
										2

docs/relnotes/11.0.5.html
									
				@@ -45,8 +45,6 @@ because compatibility contexts are not supported.

				<ul>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91993">Bug 91993</a> - Graphical glitch in Astromenace (open-source game).</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92214">Bug 92214</a> - Flightgear crashes during splashboot with R600 driver, LLVM 3.7.0 and mesa 11.0.2</li>

									
										154

docs/relnotes/11.0.7.html
									
										Normal file
									
				@@ -0,0 +1,154 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.0.7 Release Notes / December 9, 2015</h1>

				<p>

				Mesa 11.0.7 is a bug fix release which fixes bugs found since the 11.0.6 release.

				</p>

				<p>

				Mesa 11.0.7 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				07c27004ff68b288097d17b2faa7bdf15ec73c96b7e6c9835266e544adf0a62f  mesa-11.0.7.tar.gz

				e7e90a332ede6c8fd08eff90786a3fd1605a4e62ebf3a9b514047838194538cb  mesa-11.0.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90348">Bug 90348</a> - Spilling failure of b96 merged value</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92438">Bug 92438</a> - Segfault in pushbuf_kref when running the android emulator (qemu) on nv50</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93110">Bug 93110</a> - [NVE4] textureSize() and textureQueryLevels() uses a texture bound during the previous draw call</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93126">Bug 93126</a> - wrongly claim supporting GL_EXT_texture_rg</li>

				</ul>

				<h2>Changes</h2>

				<p>Chris Wilson (1):</p>

				<ul>

				  <li>meta: Compute correct buffer size with SkipRows/SkipPixels</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>egl/wayland: Ignore rects from SwapBuffersWithDamage</li>

				</ul>

				<p>Dave Airlie (4):</p>

				<ul>

				  <li>texgetimage: consolidate 1D array handling code.</li>

				  <li>r600: geometry shader gsvs itemsize workaround</li>

				  <li>r600: rv670 use at least 16es/gs threads</li>

				  <li>r600: workaround empty geom shader.</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.0.6</li>

				  <li>get-pick-list.sh: Require explicit "11.0" for nominating stable patches</li>

				  <li>mesa; add get-extra-pick-list.sh script into bin/</li>

				  <li>Update version to 11.0.7</li>

				</ul>

				<p>François Tigeot (1):</p>

				<ul>

				  <li>xmlconfig: Add support for DragonFly</li>

				</ul>

				<p>Ian Romanick (22):</p>

				<ul>

				  <li>mesa: Make bind_vertex_buffer avilable outside varray.c</li>

				  <li>mesa: Refactor update_array_format to make _mesa_update_array_format_public</li>

				  <li>mesa: Refactor enable_vertex_array_attrib to make _mesa_enable_vertex_array_attrib</li>

				  <li>i965: Pass brw_context instead of gl_context to brw_draw_rectlist</li>

				  <li>i965: Use DSA functions for VBOs in brw_meta_fast_clear</li>

				  <li>i965: Use internal functions for buffer object access</li>

				  <li>i965: Don't pollute the buffer object namespace in brw_meta_fast_clear</li>

				  <li>meta: Use DSA functions for PBO in create_texture_for_pbo</li>

				  <li>meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects</li>

				  <li>i965: Use _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects</li>

				  <li>meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objects</li>

				  <li>meta: Track VBO using gl_buffer_object instead of GL API object handle</li>

				  <li>meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objects</li>

				  <li>meta: Use internal functions for buffer object and VAO access</li>

				  <li>meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects</li>

				  <li>meta: Partially convert _mesa_meta_DrawTex to DSA</li>

				  <li>meta: Track VBO using gl_buffer_object instead of GL API object handle in _mesa_meta_DrawTex</li>

				  <li>meta: Use internal functions for buffer object and VAO access in _mesa_meta_DrawTex</li>

				  <li>meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex</li>

				  <li>meta/TexSubImage: Don't pollute the buffer object namespace</li>

				  <li>meta/generate_mipmap: Don't leak the framebuffer object</li>

				  <li>glsl: Fix off-by-one error in array size check assertion</li>

				</ul>

				<p>Ilia Mirkin (7):</p>

				<ul>

				  <li>nvc0/ir: actually emit AFETCH on kepler</li>

				  <li>nir: fix typo in idiv lowering, causing large-udiv-udiv failures</li>

				  <li>nouveau: use the buffer usage to determine placement when no binding</li>

				  <li>nv50,nvc0: properly handle buffer storage invalidation on dsa buffer</li>

				  <li>nv50/ir: fix (un)spilling of 3-wide results</li>

				  <li>mesa: support GL_RED/GL_RG in ES2 contexts when driver support exists</li>

				  <li>nvc0/ir: start offset at texBindBase for txq, like regular texturing</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>automake: fix some occurrences of hardcoded -ldl and -lpthread</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: disable Stoney VCE for 11.0</li>

				</ul>

				<p>Marta Lofstedt (1):</p>

				<ul>

				  <li>gles2: Update gl2ext.h to revision: 32120</li>

				</ul>

				<p>Oded Gabbay (1):</p>

				<ul>

				  <li>llvmpipe: disable VSX in ppc due to LLVM PPC bug</li>

				</ul>

				</div>

				</body>

				</html>
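The SHA256 lists in these release notes use the `<digest>  <file name>` layout, so a downloaded tarball can be checked against them. Below is a minimal sketch, assuming the tarballs sit in the current directory; the helper names (`sha256_of_file`, `parse_checksum_lines`, `verify`) are illustrative and not part of Mesa.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Parse '<hex digest>  <file name>' lines into a {name: digest} dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines that look like a 64-character digest plus a name.
        if len(parts) == 2 and len(parts[0]) == 64:
            expected[parts[1]] = parts[0].lower()
    return expected


def verify(directory, checksum_text):
    """Map each listed file to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, digest in parse_checksum_lines(checksum_text).items():
        path = Path(directory) / name
        results[name] = sha256_of_file(path) == digest if path.is_file() else None
    return results


# Checksums copied from the 11.0.7 release notes above.
MESA_11_0_7 = """
07c27004ff68b288097d17b2faa7bdf15ec73c96b7e6c9835266e544adf0a62f  mesa-11.0.7.tar.gz
e7e90a332ede6c8fd08eff90786a3fd1605a4e62ebf3a9b514047838194538cb  mesa-11.0.7.tar.xz
"""

if __name__ == "__main__":
    for name, ok in verify(".", MESA_11_0_7).items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat for large tarballs. The same check can be done with any standard SHA-256 tool; the sketch only shows how the listed lines map to files.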

									
										200

docs/relnotes/11.0.8.html
									
										Normal file
									
				@@ -0,0 +1,200 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.0.8 Release Notes / December 21, 2015</h1>

				<p>

				Mesa 11.0.8 is a bug fix release which fixes bugs found since the 11.0.7 release.

				</p>

				<p>

				Mesa 11.0.8 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ab9db87b54d7525e4b611b82577ea9a9eae55927558df57b190059d5ecd9406f  mesa-11.0.8.tar.gz

				5696e4730518b6805d2ed5def393c4293f425a2c2c01bd5ed4bdd7ad62f7ad75  mesa-11.0.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91806">Bug 91806</a> - configure does not test whether assembler supports sse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92849">Bug 92849</a> - [IVB HSW BDW] piglit image load/store load-from-cleared-image.shader_test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92909">Bug 92909</a> - Offset/alignment issue with layout std140 and vec3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93004">Bug 93004</a> - Guild Wars 2 crash on nouveau DX11 cards</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93215">Bug 93215</a> - [Regression bisected] Ogles1conform Automatic mipmap generation test is fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93266">Bug 93266</a> - gl_arb_shading_language_420pack does not allow binding of image variables</li>

				</ul>

				<h2>Changes</h2>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>radeon/uvd: uv pitch separation for stoney</li>

				</ul>

				<p>Dave Airlie (9):</p>

				<ul>

				  <li>r600: do SQ flush ES ring rolling workaround</li>

				  <li>r600: SMX returns CONTEXT_DONE early workaround</li>

				  <li>r600/shader: split address get out to a function.</li>

				  <li>r600/shader: add utility functions to do single slot arithmatic</li>

				  <li>r600g: fix geom shader input indirect indexing.</li>

				  <li>r600: handle geometry dynamic input array index</li>

				  <li>radeonsi: handle doubles in lds load path.</li>

				  <li>mesa/varray: set double arrays to non-normalised.</li>

				  <li>mesa/shader: return correct attribute location for double matrix arrays</li>

				</ul>

				<p>Emil Velikov (8):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.0.7</li>

				  <li>cherry-ignore: don't pick a specific i965 formats patch</li>

				  <li>Revert "i965/nir: Remove unused indirect handling"</li>

				  <li>Revert "i965/state: Get rid of dword_pitch arguments to buffer functions"</li>

				  <li>Revert "i965/vec4: Use a stride of 1 and byte offsets for UBOs"</li>

				  <li>Revert "i965/fs: Use a stride of 1 and byte offsets for UBOs"</li>

				  <li>Revert "i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge"</li>

				  <li>Update version to 11.0.8</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>i965: Resolve color and flush for all active shader images in intel_update_state().</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER</li>

				</ul>

				<p>Ilia Mirkin (17):</p>

				<ul>

				  <li>freedreno/a4xx: support lod_bias</li>

				  <li>freedreno/a4xx: fix 5_5_5_1 texture sampler format</li>

				  <li>freedreno/a4xx: point regid to "red" even for alpha-only rb formats</li>

				  <li>nvc0/ir: fold postfactor into immediate</li>

				  <li>nv50/ir: deal with loops with no breaks</li>

				  <li>nv50/ir: the mad source might not have a defining instruction</li>

				  <li>nv50/ir: fix instruction permutation logic</li>

				  <li>nv50/ir: don't forget to mark flagsDef on cvt in txb lowering</li>

				  <li>nv50/ir: fix DCE to not generate 96-bit loads</li>

				  <li>nv50/ir: avoid looking at uninitialized srcMods entries</li>

				  <li>gk110/ir: fix imul hi emission with limm arg</li>

				  <li>gk104/ir: sampler doesn't matter for txf</li>

				  <li>gk110/ir: fix imad sat/hi flag emission for immediate args</li>

				  <li>nv50/ir: fix cutoff for using r63 vs r127 when replacing zero</li>

				  <li>nv50/ir: can't have predication and immediates</li>

				  <li>glsl: assign varying locations to tess shaders when doing SSO</li>

				  <li>ttn: add TEX2 support</li>

				</ul>

				<p>Jason Ekstrand (5):</p>

				<ul>

				  <li>i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge</li>

				  <li>i965/fs: Use a stride of 1 and byte offsets for UBOs</li>

				  <li>i965/vec4: Use a stride of 1 and byte offsets for UBOs</li>

				  <li>i965/state: Get rid of dword_pitch arguments to buffer functions</li>

				  <li>i965/nir: Remove unused indirect handling</li>

				</ul>

				<p>Jonathan Gray (2):</p>

				<ul>

				  <li>configure.ac: use pkg-config for libelf</li>

				  <li>configure: check for python2.7 for PYTHON2</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Fix fragment shader struct inputs.</li>

				  <li>i965: Fix scalar vertex shader struct outputs.</li>

				</ul>

				<p>Marek Olšák (8):</p>

				<ul>

				  <li>radeonsi: fix occlusion queries on Fiji</li>

				  <li>radeonsi: fix a hang due to uninitialized border color registers</li>

				  <li>radeonsi: fix Fiji for LLVM &lt;= 3.7</li>

				  <li>radeonsi: don't call of u_prims_for_vertices for patches and rectangles</li>

				  <li>radeonsi: apply the streamout workaround to Fiji as well</li>

				  <li>gallium/radeon: fix Hyper-Z hangs by programming PA_SC_MODE_CNTL_1 correctly</li>

				  <li>tgsi/scan: add flag colors_written</li>

				  <li>r600g: write all MRTs only if there is exactly one output (fixes a hang)</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>glsl: Allow binding of image variables with 420pack.</li>

				</ul>

				<p>Neil Roberts (2):</p>

				<ul>

				  <li>i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format</li>

				  <li>i965: Add B8G8R8X8_SRGB to the alpha format override</li>

				</ul>

				<p>Oded Gabbay (1):</p>

				<ul>

				  <li>configura.ac: fix test for SSE4.1 assembler support</li>

				</ul>

				<p>Patrick Rudolph (2):</p>

				<ul>

				  <li>nv50,nvc0: fix use-after-free when vertex buffers are unbound</li>

				  <li>gallium/util: return correct number of bound vertex buffers</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>nvc0: free memory allocated by the prog which reads MP perf counters</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>i965: use _Shader to get fragment program when updating surface state</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>radeonsi: Rename si_shader::ls_rsrc{1,2} to si_shader::rsrc{1,2}</li>

				  <li>radeonsi/compute: Use the compiler's COMPUTE_PGM_RSRC* register values</li>

				</ul>

				</div>

				</body>

				</html>

									
										127

docs/relnotes/11.0.9.html
									
										Normal file
									
				@@ -0,0 +1,127 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.0.9 Release Notes / January 22, 2016</h1>

				<p>

				Mesa 11.0.9 is a bug fix release which fixes bugs found since the 11.0.8 release.

				</p>

				<p>

				Mesa 11.0.9 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				1597c2e983f476f98efdd6cd58b5298896d18479ff542bdeff28b98b129ede05  mesa-11.0.9.tar.gz

				a1262ff1c66a16ccf341186cf0e57b306b8589eb2cc5ce92ffb6788ab01d2b01  mesa-11.0.9.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92229">Bug 92229</a> - [APITRACE] SOMA have serious graphical errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93257">Bug 93257</a> - [SKL, bisected] ASTC dEQP tests segfault</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.0.8</li>

				  <li>cherry-ignore: add patch already in branch</li>

				  <li>cherry-ignore: add the dri3 glx null check patch</li>

				  <li>i915: correctly parse/set the context flags</li>

				  <li>egl/dri2: expose srgb configs when KHR_gl_colorspace is available</li>

				  <li>Update version to 11.0.9</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>r600: fix constant buffer size programming</li>

				</ul>

				<p>Ilia Mirkin (5):</p>

				<ul>

				  <li>nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion</li>

				  <li>nv50/ir: float(s32 &amp; 0xff) = float(u8), not s8</li>

				  <li>nv50,nvc0: make sure there's pushbuf space and that we ref the bo early</li>

				  <li>nv50,nvc0: fix crash when increasing bsp bo size for h264</li>

				  <li>nvc0: scale up inter_bo size so that it's 16M for a 4K video</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>ralloc: Fix ralloc_adopt() to the old context's last child's parent.</li>

				  <li>nvc0: Set winding order regardless of domain.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't miss changes to SPI_TMPRING_SIZE</li>

				</ul>

				<p>Miklós Máté (1):</p>

				<ul>

				  <li>mesa: Don't leak ATIfs instructions in DeleteFragmentShader</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>i965: Fix crash when calling glViewport with no surface bound</li>

				</ul>

				<p>Nicolai Hähnle (6):</p>

				<ul>

				  <li>gallium/radeon: only dispose locally created target machine in radeon_llvm_compile</li>

				  <li>mesa/bufferobj: make _mesa_delete_buffer_object externally accessible</li>

				  <li>st/mesa: use _mesa_delete_buffer_object</li>

				  <li>radeon: use _mesa_delete_buffer_object</li>

				  <li>i915: use _mesa_delete_buffer_object</li>

				  <li>i965: use _mesa_delete_buffer_object</li>

				</ul>

				<p>Oded Gabbay (1):</p>

				<ul>

				  <li>llvmpipe: use vpkswss when dst is signed</li>

				</ul>

				<p>Rob Herring (1):</p>

				<ul>

				  <li>freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled</li>

				</ul>

				</div>

				</body>

				</html>

									
										3

docs/relnotes/11.1.1.html
									
				@@ -31,7 +31,8 @@ because compatibility contexts are not supported.

				<h2>SHA256 checksums</h2>

				<pre>

				TBD

				b15089817540ba0bffd0aad323ecf3a8ff6779568451827c7274890b4a269d58  mesa-11.1.1.tar.gz

				64db074fc514136b5fb3890111f0d50604db52f0b1e94ba3fcb0fe8668a7fd20  mesa-11.1.1.tar.xz

				</pre>

									
										182

docs/relnotes/11.1.2.html
									
										Normal file
									
				@@ -0,0 +1,182 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.2 Release Notes / February 10, 2016</h1>

				<p>

				Mesa 11.1.2 is a bug fix release which fixes bugs found since the 11.1.1 release.

				</p>

				<p>

				Mesa 11.1.2 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ba0e7462b2936b86e6684c26fbb55519f8d9ad31d13a1c1e1afbe41e73466eea  mesa-11.1.2.tar.gz

				8f72aead896b340ba0f7a4a474bfaf71681f5d675592aec1cb7ba698e319148b  mesa-11.1.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93628">Bug 93628</a> - Exception: attempt to use unavailable module DRM when building MesaGL 11.1.0 on windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93648">Bug 93648</a> - Random lines being rendered when playing Dolphin (geometry shaders related, w/ apitrace)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93650">Bug 93650</a> - GL_ARB_separate_shader_objects is buggy (PCSX2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93717">Bug 93717</a> - Meta mipmap generation can corrupt texture state</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93722">Bug 93722</a> - Segfault when compiling shader with a subroutine that takes a parameter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93731">Bug 93731</a> - glUniformSubroutinesuiv segfaults when subroutine uniform is bound to a specific location</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93761">Bug 93761</a> - A conditional discard in a fragment shader causes no depth writing at all</li>

				</ul>

				<h2>Changes</h2>

				<p>Ben Widawsky (1):</p>

				<ul>

				  <li>i965/bxt: Fix conservative wm thread counts.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>glsl: fix subroutine lowering reusing actual parmaters</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.1</li>

				  <li>cherry-ignore: drop the i965/kbl .num_slices patch</li>

				  <li>i915: correctly parse/set the context flags</li>

				  <li>targets/dri: android: use WHOLE static libraries</li>

				  <li>egl/dri2: expose srgb configs when KHR_gl_colorspace is available</li>

				  <li>Update version to 11.1.2</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Don't record the seqno of a failed job submit.</li>

				  <li>vc4: Throttle outstanding rendering after submission.</li>

				</ul>

				<p>François Tigeot (1):</p>

				<ul>

				  <li>gallium: Add DragonFly support</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>r600g: don't leak driver const buffers</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE</li>

				  <li>meta: Use internal functions to set texture parameters</li>

				</ul>

				<p>Ilia Mirkin (6):</p>

				<ul>

				  <li>st/mesa: use surface format to generate mipmaps when available</li>

				  <li>glsl: always compute proper varying type, irrespective of varying packing</li>

				  <li>nvc0: avoid crashing when there are holes in vertex array bindings</li>

				  <li>nv50,nvc0: fix buffer clearing to respect engine alignment requirements</li>

				  <li>nv50/ir: fix false global CSE on instructions with multiple defs</li>

				  <li>st/mesa: treat a write as a read for range purposes</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>i965/vec4: Use UW type for multiply into accumulator on GEN8+</li>

				  <li>i965/fs/generator: Take an actual shader stage rather than a string</li>

				  <li>i965/fs: Always set channel 2 of texture headers in some stages</li>

				</ul>

				<p>Jose Fonseca (2):</p>

				<ul>

				  <li>scons: Conditionally use DRM module on pipe-loader.</li>

				  <li>pipe-loader: Fix PATH_MAX define on MSVC.</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nv50/ir: fix memory corruption when spilling and redoing RA</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.</li>

				  <li>glsl: Allow implicit int -&gt; uint conversions for bitwise operators (&amp;, ^, |).</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>vl: add zig zag scan for list 4x4</li>

				  <li>st/omx/dec/h264: fix corruption when scaling matrix present flag set</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't miss changes to SPI_TMPRING_SIZE</li>

				</ul>

				<p>Nicolai Hähnle (11):</p>

				<ul>

				  <li>mesa/bufferobj: make _mesa_delete_buffer_object externally accessible</li>

				  <li>st/mesa: use _mesa_delete_buffer_object</li>

				  <li>radeon: use _mesa_delete_buffer_object</li>

				  <li>i915: use _mesa_delete_buffer_object</li>

				  <li>i965: use _mesa_delete_buffer_object</li>

				  <li>util/u_pstipple.c: copy immediates during transformation</li>

				  <li>radeonsi: extract the VGT_GS_MODE calculation into its own function</li>

				  <li>radeonsi: ensure that VGT_GS_MODE is sent when necessary</li>

				  <li>radeonsi: add DCC buffer for sampler views on new CS</li>

				  <li>st/mesa: use the correct address generation functions in st_TexSubImage blit</li>

				  <li>radeonsi: fix discard-only fragment shaders (11.1 version)</li>

				</ul>

				<p>Timothy Arceri (4):</p>

				<ul>

				  <li>glsl: fix segfault linking subroutine uniform with explicit location</li>

				  <li>mesa: fix segfault in glUniformSubroutinesuiv()</li>

				  <li>glsl: fix interface block error message</li>

				  <li>glsl: create helper to remove outer vertex index array used by some stages</li>

				</ul>

				</div>

				</body>

				</html>

									
										319

docs/relnotes/11.1.3.html
									
										Normal file
									
				@@ -0,0 +1,319 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.3 Release Notes / April 17, 2016</h1>

				<p>

				Mesa 11.1.3 is a bug fix release which fixes bugs found since the 11.1.2 release.

				</p>

				<p>

				Mesa 11.1.3 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				9e86c72b6b2e8adb53c1c4a0002ab267b45094d753eb9404b1db34f81ce94ccf  mesa-11.1.3.tar.gz

				51f6658a214d75e4d9f05207586d7ed56ebba75c6b10841176fb6675efa310ac  mesa-11.1.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94195">Bug 94195</a> - [llvmpipe] Does not build with LLVM 3.7.x on Windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed texture's color encoding and render incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94954">Bug 94954</a> - test_vec4_copy_propagation fails in `make check`</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Fix assert conditions for src/dst x/y offsets</li>

				</ul>

				<p>Ben Widawsky (2):</p>

				<ul>

				  <li>i965: Make sure we blit a full compressed block</li>

				  <li>i965/skl: Add two missing device IDs</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965/blorp: Fix hiz ops on MSAA surfaces</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>radeon/uvd: disable MPEG1</li>

				</ul>

				<p>Christian Schmidbauer (1):</p>

				<ul>

				  <li>st/nine: specify WINAPI only for i386 and amd64</li>

				</ul>

				<p>Daniel Czarnowski (3):</p>

				<ul>

				  <li>egl_dri2: NULL check for xcb_dri2_get_buffers_reply()</li>

				  <li>egl_dri2: set correct error code if swapbuffers fails</li>

				  <li>egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...)</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>mesa/fbobject: propagate Layered when reusing attachments.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage</li>

				</ul>

				<p>Dongwon Kim (1):</p>

				<ul>

				  <li>egl: move Null check to eglGetSyncAttribKHR to prevent Segfault</li>

				</ul>

				<p>Emil Velikov (10):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.2</li>

				  <li>get-pick-list.sh: Require explicit "11.1" for nominating stable patches</li>

				  <li>cherry-ignore: do not pick nv50/ir commit</li>

				  <li>automake: add nine to make distcheck</li>

				  <li>install-gallium-links: port changes from install-lib-links</li>

				  <li>automake: add more missing options for make distcheck</li>

				  <li>mesa: add get-extra-pick-list.sh script into bin/</li>

				  <li>egl/x11: check the return value of xcb_dri2_get_buffers_reply()</li>

				  <li>nvc/ir: remove duplicate variable declaration</li>

				  <li>Update version to 11.1.3</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>i965: Reupload push and pull constants when we get new shader image unit state.</li>

				  <li>i965/fs: Add missing analysis invalidation in opt_sampler_eot().</li>

				  <li>i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().</li>

				  <li>i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.</li>

				</ul>

				<p>Ilia Mirkin (21):</p>

				<ul>

				  <li>nvc0/ir: fix converting between predicate and gpr</li>

				  <li>nvc0: add some missing PUSH_SPACE's</li>

				  <li>nvc0: avoid negatives in PUSH_SPACE argument</li>

				  <li>glsl: make sure builtins are initialized before getting the shader</li>

				  <li>glsl: return cloned signature, not the builtin one</li>

				  <li>nv50/ir: fix quadop emission in the presence of predication</li>

				  <li>st/mesa: fix up result_src.type when doing i2u/u2i conversions</li>

				  <li>meta/copy_image: use precomputed dst_internal_format to avoid segfault</li>

				  <li>st/mesa: force depth mode to GL_RED for sized depth/stencil formats</li>

				  <li>glx: update to updated version of EXT_create_context_es2_profile</li>

				  <li>nv50,nvc0: bump minimum texture buffer offset alignment</li>

				  <li>nvc0: reset TFB bufctx when we no longer hold a reference to the buffers</li>

				  <li>glsl: avoid stack smashing when there are too many attributes</li>

				  <li>nvc0: fix blit triangle size to fully cover FB's &gt; 8192x8192</li>

				  <li>nv50: reset TFB bufctx when we no longer hold a reference to the buffers</li>

				  <li>nv50/ir: force-enable derivatives on TXD ops</li>

				  <li>st/mesa: only minify depth for 3d targets</li>

				  <li>nv50/ir: fix indirect texturing for non-array textures on nvc0</li>

				  <li>nvc0/ir: fix picking of coordinates from tex instruction for textureGrad</li>

				  <li>nvc0: disable primitive restart and index bias during blits</li>

				  <li>nv50/ir: we can't load local memory directly into an output</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>nir/lower_vec_to_movs: Better report channels handled by insert_mov</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>mesa: Make glGet queries initialize ctx-&gt;Debug when necessary.</li>

				  <li>mesa: Allow Get*() of several forgotten IsEnabled() pnames.</li>

				  <li>i965: Only magnify depth for 3D textures, not array textures.</li>

				</ul>

				<p>Koop Mast (1):</p>

				<ul>

				  <li>st/clover: Add libelf cflags to the build</li>

				</ul>

				<p>Marc-André Lureau (1):</p>

				<ul>

				  <li>virtio_gpu: Add virtio 1.0 PCI ID to driver map</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: fix Hyper-Z on Stoney</li>

				  <li>gallium/radeon: don't use temporary buffers for persistent mappings</li>

				  <li>radeonsi: fix Hyper-Z hangs on P2 configs</li>

				</ul>

				<p>Matt Turner (3):</p>

				<ul>

				  <li>i965/vec4: don't copy ATTR into 3src instructions with complex swizzles</li>

				  <li>i965/fs: Don't CSE negated multiplies with saturation.</li>

				  <li>i965/vec4: Update vec4 unit tests for commit 01dacc83ff.</li>

				</ul>

				<p>Nanley Chery (2):</p>

				<ul>

				  <li>mesa/image: Make _mesa_clip_readpixels() work with renderbuffers</li>

				  <li>mesa/readpix: Clip ReadPixels() area to the ReadBuffer's</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>r600g: clear compressed_depthtex/colortex_mask when binding buffer texture</li>

				  <li>st/mesa: use the texture view's format for render-to-texture</li>

				</ul>

				<p>Nishanth Peethambaran (2):</p>

				<ul>

				  <li>st/omx: Remove trailing spaces</li>

				  <li>st/omx/dec: Correct the timestamping</li>

				</ul>

				<p>Oded Gabbay (8):</p>

				<ul>

				  <li>gallium/radeon: Correctly translate colorswaps for big endian</li>

				  <li>llvmpipe: use vpkswss when dst is signed</li>

				  <li>gallium/radeon: return correct values for BE in r600_translate_colorswap</li>

				  <li>gallium/radeon: remove separate BE path in r600_translate_colorswap</li>

				  <li>gallium/r600: Don't let h/w do endian swap for colorformat</li>

				  <li>gallium/radeon: disable evergreen_do_fast_color_clear for BE</li>

				  <li>r600g: Do colorformat endian swap for PIPE_USAGE_STAGING</li>

				  <li>radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING</li>

				</ul>

				<p>Olivier Pena (1):</p>

				<ul>

				  <li>scons: support for LLVM 3.7.</li>

				</ul>

				<p>Patrick Baggett (1):</p>

				<ul>

				  <li>mesa: Use SSE prefetch instructions rather than 3DNow instructions</li>

				</ul>

				<p>Rob Herring (10):</p>

				<ul>

				  <li>Android: remove dependence on .SECONDEXPANSION</li>

				  <li>Android: glsl: fix dependence on YACC_HEADER_SUFFIX from build system</li>

				  <li>Android: add -Wno-date-time flag for clang</li>

				  <li>Android: remove headers from LOCAL_SRC_FILES</li>

				  <li>Android: clean-up and fix DRI module path handling</li>

				  <li>freedreno: drop unnecessary -Wno-packed-bitfield-compat</li>

				  <li>gallium/radeon: Add space between string literal and identifier</li>

				  <li>r600: Make enum alu_op_flags unsigned</li>

				  <li>virtio_gpu: Add PCI ID to driver map</li>

				  <li>Android: fix x86 gallium builds</li>

				</ul>

				<p>Roland Scheidegger (2):</p>

				<ul>

				  <li>softpipe: fix anisotropic filtering crash</li>

				  <li>draw: fix line stippling</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>nvc0: make sure to delete samplers used by compute shaders</li>

				</ul>

				<p>Steinar H. Gunderson (1):</p>

				<ul>

				  <li>mesa: Fix locking of GLsync objects.</li>

				</ul>

				<p>Tamil velan (1):</p>

				<ul>

				  <li>radeon/uvd: increase max height to 4096 for VI and newer</li>

				</ul>

				<p>Thomas Hellstrom (2):</p>

				<ul>

				  <li>winsys/svga: Fix an uninitialized return value</li>

				  <li>winsys/svga: Increase the fence timeout</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>llvmpipe: Do not use barriers if not using threads.</li>

				</ul>

				<p>xavier (1):</p>

				<ul>

				  <li>r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.</li>

				</ul>

				</div>

				</body>

				</html>

									
										182

docs/relnotes/11.1.4.html
									
										Normal file
									
				@@ -0,0 +1,182 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.4 Release Notes / May 9, 2016</h1>

				<p>

				Mesa 11.1.4 is a bug fix release which fixes bugs found since the 11.1.3 release.

				</p>

				<p>

				Mesa 11.1.4 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				034231fffb22621dadb8e4a968cb44752b8b68db7a2417568d63c275b3490cea  mesa-11.1.4.tar.gz

				0f781e9072655305f576efd4204d183bf99ac8cb8d9e0dd9fc2b4093230a0eba  mesa-11.1.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>gallium/util: initialize pipe_framebuffer_state to zeros</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>dri: Fix robust context creation via EGL attribute</li>

				</ul>

				<p>Egbert Eich (1):</p>

				<ul>

				  <li>dri2: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.3</li>

				  <li>cherry-ignore: add non-applicable "fix of a fix"</li>

				  <li>cherry-ignore: ignore st_DrawAtlasBitmaps mem leak fix</li>

				  <li>cherry-ignore: add CodeEmitterGK110::emitATOM() fix</li>

				  <li>Update version to 11.1.4</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Fix subimage accesses to LT textures.</li>

				  <li>vc4: Add support for rendering to cube map surfaces.</li>

				  <li>vc4: Fix tests for format supported with nr_samples == 1.</li>

				  <li>vc4: Make sure we recompile when sample_mask changes.</li>

				</ul>

				<p>Frederic Devernay (1):</p>

				<ul>

				  <li>glapi: fix _glapi_get_proc_address() for mangled function names</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>

				  <li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>egl/x11: authenticate before doing chipset id ioctls</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/uvd: fix tonga feedback buffer size</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>drirc: add a workaround for blackness in Warsow</li>

				  <li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>

				</ul>

				<p>Nicolai Hähnle (5):</p>

				<ul>

				  <li>radeonsi: fix bounds check in si_create_vertex_elements</li>

				  <li>gallium/radeon: handle failure when mapping staging buffer</li>

				  <li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>

				  <li>gallium/radeon: fix crash in r600_set_streamout_targets</li>

				  <li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>

				</ul>

				<p>Oded Gabbay (4):</p>

				<ul>

				  <li>r600g/radeonsi: send endian info to format translation functions</li>

				  <li>r600g: set endianness of 16/32-bit buffers according to do_endian_swap</li>

				  <li>r600g: use do_endian_swap in color swapping functions</li>

				  <li>r600g: use do_endian_swap in texture swapping function</li>

				</ul>

				<p>Roland Scheidegger (3):</p>

				<ul>

				  <li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>

				  <li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>

				  <li>gallivm: make sampling more robust against bogus coordinates</li>

				</ul>

				<p>Samuel Pitoiset (5):</p>

				<ul>

				  <li>gk110/ir: make use of IMUL32I for all immediates</li>

				  <li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>

				  <li>gk110/ir: add emission for (a OP b) OP c</li>

				  <li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>

				  <li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>dri3: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Thomas Hindoe Paaboel Andersen (1):</p>

				<ul>

				  <li>st/va: avoid dereference after free in vlVaDestroyImage</li>

				</ul>

				<p>WuZhen (3):</p>

				<ul>

				  <li>tgsi: initialize stack allocated struct</li>

				  <li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>

				  <li>android: enable dlopen() on all architectures</li>

				</ul>

				</div>

				</body>

				</html>

									
										296

docs/relnotes/11.2.0.html
									
										Normal file
									
				@@ -0,0 +1,296 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.0 Release Notes / 4 April 2016</h1>

				<p>

				Mesa 11.2.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 11.2.1.

				</p>

				<p>

				Mesa 11.2.0 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				dea3d8143929aad5c24ef0993ddb05807b30c284b488fc62903adfcc1c127887  mesa-11.2.0.tar.gz

				1c1fed2674abf3f16ed2623e9a5694d6752c293194e18462ebc644a19cfaafb2  mesa-11.2.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_arrays_of_arrays on all gallium drivers that provide GLSL 1.30</li>

				<li>GL_ARB_base_instance on freedreno/a4xx</li>

				<li>GL_ARB_compute_shader on i965</li>

				<li>GL_ARB_copy_image on r600</li>

				<li>GL_ARB_indirect_parameters on nvc0</li>

				<li>GL_ARB_query_buffer_object on nvc0</li>

				<li>GL_ARB_shader_atomic_counters on nvc0</li>

				<li>GL_ARB_shader_draw_parameters on i965, nvc0</li>

				<li>GL_ARB_shader_storage_buffer_object on nvc0</li>

				<li>GL_ARB_tessellation_shader on i965 and r600 (evergreen/cayman only)</li>

				<li>GL_ARB_texture_buffer_object_rgb32 on freedreno/a4xx</li>

				<li>GL_ARB_texture_buffer_range on freedreno/a4xx</li>

				<li>GL_ARB_texture_query_lod on freedreno/a4xx</li>

				<li>GL_ARB_texture_rgb10_a2ui on freedreno/a4xx</li>

				<li>GL_ARB_texture_view on freedreno/a4xx</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on freedreno/a4xx</li>

				<li>GL_KHR_texture_compression_astc_ldr on freedreno/a4xx</li>

				<li>GL_AMD_performance_monitor on radeonsi (CIK+ only)</li>

				<li>GL_ATI_meminfo on r600, radeonsi</li>

				<li>GL_NVX_gpu_memory_info on r600, radeonsi</li>

				<li>New OSMesaCreateContextAttribs() function (for creating core profile

				    contexts)</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27512">Bug 27512</a> - Illegal instruction _mesa_x86_64_transform_points4_general</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75165">Bug 75165</a> - compute.c:464:49: error: function definition is not allowed here</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79783">Bug 79783</a> - Distorted output in obs-studio where other vendors &quot;work&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89330">Bug 89330</a> - piglit glsl-1.50 invariant-qualifier-in-out-block-01 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89969">Bug 89969</a> - nouveau: add support for chunk decoding in order to support vaapi (st/va)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90348">Bug 90348</a> - Spilling failure of b96 merged value</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91806">Bug 91806</a> - configure does not test whether assembler supports sse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91927">Bug 91927</a> - [SKL] [regression] piglit compressed textures tests fail  with kernel upgrade</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92229">Bug 92229</a> - [APITRACE] SOMA have serious graphical errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92233">Bug 92233</a> - Unigine Heaven 4.0 silhouette run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92438">Bug 92438</a> - Segfault in pushbuf_kref when running the android emulator (qemu) on nv50</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92589">Bug 92589</a> - [BDW BSW SKL CTS] ES31-CTS.texture_gather.* GPU_HANG</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92595">Bug 92595</a> - [HSW,BDW,SKL][GLES 3.1 CTS] Big difference in the results for the ES31-CTS.shader_bitfield_operation.* tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92609">Bug 92609</a> - [BDW, BSW] piglit sampling-2d-array-as-2d-layer fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92687">Bug 92687</a> - Add support for ARB_internalformat_query2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92706">Bug 92706</a> - glBlitFramebuffer refuses to blit RGBA to RGB with MSAA</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92709">Bug 92709</a> - &quot;LLVM triggered Diagnostic Handler: unsupported call to function ldexpf in main&quot; when starting race in stuntrally</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92743">Bug 92743</a> - Centroid shouldn't have to match between the FS and the VS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92759">Bug 92759</a> - [Regression, bisected] Visuals without alpha bits are not sRGB-capable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92849">Bug 92849</a> - [IVB HSW BDW] piglit image load/store load-from-cleared-image.shader_test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92909">Bug 92909</a> - Offset/alignment issue with layout std140 and vec3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93004">Bug 93004</a> - Guild Wars 2 crash on nouveau DX11 cards</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93048">Bug 93048</a> - [CTS regression] mesa af2723 breaks GL Conformance for debug extension</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93063">Bug 93063</a> - drm_helper.h:227:1: error: static declaration of ‘pipe_virgl_create_screen’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93091">Bug 93091</a> - [opencl] segfault when running any opencl programs (like clinfo)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93092">Bug 93092</a> - lp_test_format regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93126">Bug 93126</a> - wrongly claim supporting GL_EXT_texture_rg</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93180">Bug 93180</a> - [regression] arb_separate_shader_objects.active sampler conflict fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93189">Bug 93189</a> - &quot;./util/u_inlines.h&quot;, line 83: operands have incompatible types: void &quot;:&quot; int</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93215">Bug 93215</a> - [Regression bisected] Ogles1conform Automatic mipmap generation test is fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93235">Bug 93235</a> - [regression] dispatch sanity broken by GetPointerv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93257">Bug 93257</a> - [SKL, bisected] ASTC dEQP tests segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93264">Bug 93264</a> - Tonga VM Faults since llvm ScheduleDAGInstrs: Rework schedule graph builder.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93266">Bug 93266</a> - gl_arb_shading_language_420pack does not allow binding of image variables</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93300">Bug 93300</a> - Two Worlds 2 renders water incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93312">Bug 93312</a> - [SKL][GLES 3.1 CTS] ES31-CTS.layout_binding* GPU_HANG</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93320">Bug 93320</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93322">Bug 93322</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.resource-ubo fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93323">Bug 93323</a> - [HSW,BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.basic-allTargets-store-fs fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93325">Bug 93325</a> - [HSW,BDW,SKL]ES31-CTS.explicit_uniform_location.uniform-loc-* 2 tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93339">Bug 93339</a> - glLinkProgram() should fail when a varying is never written to in a previous stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93348">Bug 93348</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.* segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93358">Bug 93358</a> - [HSW] Unreal Elemental demo - assertion error in copy_image_with_blitter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93387">Bug 93387</a> - inverse() shouldn’t be exposed in GLSL 1.20 and 1.30</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93388">Bug 93388</a> - [i965, regression, bisection] MESA_FORMAT_B8G8R8X8_SRGB changes break kwin</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93407">Bug 93407</a> - [SKL][GLES 3.1 CTS]ES31-CTS.compute_shader.resources-texture fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93410">Bug 93410</a> - [BDW,SKL][GLES 3.1 CTS]ES31-CTS.shader_image_load_store.negative-linkErrors fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93418">Bug 93418</a> - Geometry Shaders output wrong vertices on Sandy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93426">Bug 93426</a> - [SKL,BDW,BSW,BXT] CTS regression: es2-cts.gtf.gl2fixedtests.buffer_objects.buffer_object,s</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93524">Bug 93524</a> - Clover doesn't build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93526">Bug 93526</a> - GfxBench 4 tessellation demos misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93532">Bug 93532</a> - [HSW,BDW,SKL][GLES 3.1 CTS] ES31-CTS.compute_shader.*. Regression, bisected.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93540">Bug 93540</a> - [BISECTED, HSW] Rendering issue in Heaven (and other benchmarks)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93560">Bug 93560</a> - opt_combine_constants failing fabsf(reg-&gt;f) == table.imm[i].val assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93599">Bug 93599</a> - Strange green flashes with &quot;Metro: Last Light Redux&quot; + &quot;Metro 2033 Redux&quot; with Intel Mesa driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93648">Bug 93648</a> - Random lines being rendered when playing Dolphin (geometry shaders related, w/ apitrace)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93650">Bug 93650</a> - GL_ARB_separate_shader_objects is buggy (PCSX2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93667">Bug 93667</a> - Crash in eglCreateImageKHR with huge texture size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93696">Bug 93696</a> - [HSW,BDW;SKL][GLES 3.1 CTS]ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-* fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93700">Bug 93700</a> - [SKL, regression] deqp-gles2.functional.texture.completeness</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93717">Bug 93717</a> - Meta mipmap generation can corrupt texture state</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93722">Bug 93722</a> - Segfault when compiling shader with a subroutine that takes a parameter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93725">Bug 93725</a> - [HSW, regression, bisected] ES31-CTS.texture_gather.*depth*</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93731">Bug 93731</a> - glUniformSubroutinesuiv segfaults when subroutine uniform is bound to a specific location</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93761">Bug 93761</a> - A conditional discard in a fragment shader causes no depth writing at all</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93790">Bug 93790</a> - [HSW] Use after free with compute programs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93792">Bug 93792</a> - [HSW] intel_mipmap_tree.c:1325: intel_miptree_copy_slice: Assertion `src_mt-&gt;format == dst_mt-&gt;format</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93813">Bug 93813</a> - Incorrect viewport range when GL_CLIP_ORIGIN is GL_UPPER_LEFT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93840">Bug 93840</a> - [i965] Alien: Isolation fails with GL_ARB_compute_shader enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93862">Bug 93862</a> - [Bisected] &quot;drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2&quot; is bad</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93878">Bug 93878</a> - [llvmpipe][softpipe] piglit arb_gpu_shader_fp64-double-gettransformfeedbackvarying regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93957">Bug 93957</a> - [HSW] Mishandling of sample count when using an attachment-less framebuffer (assertion error)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93961">Bug 93961</a> - virgl build failure after 2016-02-01 changes - no previous prototype for 'virgl_drm_winsys_create'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93989">Bug 93989</a> - build: flex-2.5.39 seems to be failing for glsl_lexer.ll</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94016">Bug 94016</a> - make check MesaExtensionsTest.AlphabeticallySorted regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94019">Bug 94019</a> - [bisected] 3D acceleration broken with gallium/radeon: just get num_tile_pipes from the winsys</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94050">Bug 94050</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94073">Bug 94073</a> - Miscompilation of abs_vec3_vert_xvary_ref.vert in WebGL conformance</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94081">Bug 94081</a> - [HSW] compute shader shared var + atomic op = fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94088">Bug 94088</a> - [llvmpipe] SIGFPE pthread_barrier_destroy.c:40</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94091">Bug 94091</a> - Tonga unreal elemental segfault since radeonsi: put image, fmask, and sampler descriptors into one array</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94100">Bug 94100</a> - [HSW] compute indirect dispatch with 0 work groups causes gpu hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94134">Bug 94134</a> - [regression] piglit.spec.arb_texture_view.sampling-2d-array-as-2d-layer assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94139">Bug 94139</a> - [regression, HSW, IVB] piglit.spec.arb_compute_shader.minmax</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94150">Bug 94150</a> - UE4 Suntemple rendering errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94186">Bug 94186</a> - Crash when launching glxinfo and World of Warcraft with RV790</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94188">Bug 94188</a> - define (or undef) defined behaves stupidly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94193">Bug 94193</a> - [llvmpipe] Line antialiasing looks different when GL_LINE_STIPPLE is enabled with pattern 0xffff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94199">Bug 94199</a> - Shader abort/crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94253">Bug 94253</a> - [llvmpipe] piglit gl-1.0-swapbuffers-behavior regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94254">Bug 94254</a> - [llvmpipe] [softpipe] piglit read-front regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94257">Bug 94257</a> - [softpipe] piglit glx-copy-sub-buffer regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94274">Bug 94274</a> - [swrast] piglit arb_occlusion_query2-render regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94284">Bug 94284</a> - [radeonsi] outlast segfault on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94388">Bug 94388</a> - r600_blit.c:281: r600_decompress_depth_textures: Assertion `tex-&gt;is_depth &amp;&amp; !tex-&gt;is_flushing_texture' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94412">Bug 94412</a> - Trine 3 misrender</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94481">Bug 94481</a> - softpipe - access violation in img_filter_2d_nearest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94524">Bug 94524</a> - Wrong gl_TessLevelOuter interpretation for isolines</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94595">Bug 94595</a> - [Mesa AMD&amp;swrast] Texture views attached as framebuffers return their viewed texture's color encoding and render incorrectly</li>

				</ul>

				<h2>Changes</h2>

				Microsoft Visual Studio 2013 or later is now required for building

				on Windows.

				Previously, Visual Studio 2008 and later were supported.

				</div>

				</body>

				</html>

									
										119

docs/relnotes/11.2.1.html
									
										Normal file
									
												
				@@ -0,0 +1,119 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.1 Release Notes / April 17, 2016</h1>

				<p>

				Mesa 11.2.1 is a bug fix release which fixes bugs found since the 11.2.0 release.

				</p>

				<p>

				Mesa 11.2.1 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				cc2a024204564a71acc95cf262bf618fe49b1d77d351e5755eea705cadac5167  mesa-11.2.1.tar.gz

				a65207e9ae5c5f1c29f863c6a2cc98a7ab99762a24b82a248337f0ea9cfce01b  mesa-11.2.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93962">Bug 93962</a> - [HSW, regression, bisected, CTS] ES2-CTS.gtf.GL2FixedTests.scissor.scissor - segfault/asserts</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>st/mesa: fix glReadBuffer() assertion failure</li>

				  <li>st/mesa: fix memleak in glDrawPixels cache code</li>

				</ul>

				<p>Christian Schmidbauer (1):</p>

				<ul>

				  <li>st/nine: specify WINAPI only for i386 and amd64</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.2.0</li>

				  <li>configure.ac: update the path of the generated files</li>

				  <li>Update version to 11.2.1</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>glsl: allow usage of the keyword buffer before GLSL 430 / ESSL 310</li>

				</ul>

				<p>Iurie Salomov (1):</p>

				<ul>

				  <li>va: check null context in vlVaDestroyContext</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>i965/tiled_memcopy: Add aligned mem_copy parameters to the [de]tiling functions</li>

				  <li>i965/tiled_memcpy: Rework the RGBA -&gt; BGRA mem_copy functions</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Fix textureSize() depth value for 1 layer surfaces on Gen4-6.</li>

				  <li>i965: Use brw-&gt;urb.min_vs_urb_entries instead of 32 for BLORP.</li>

				  <li>glsl: Lower variable indexing of system value arrays unconditionally.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>drirc: add a workaround for blackness in Warsow</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>radeonsi: fix bounds check in si_create_vertex_elements</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>nv50/ir: do not try to attach JOIN ops to ATOM</li>

				</ul>

				<p>Thomas Hindoe Paaboel Andersen (1):</p>

				<ul>

				  <li>st/va: avoid dereference after free in vlVaDestroyImage</li>

				</ul>

				</div>

				</body>

				</html>
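The SHA256 checksums listed in the release notes above can be checked against a downloaded tarball. The following is a minimal sketch, not part of the Mesa tree: the helper names `sha256_of_file` and `verify` are hypothetical, and it assumes the tarball has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare the file's digest with the published one (case and whitespace tolerant)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `verify("mesa-11.2.1.tar.xz", "<checksum copied from the release notes>")` would return True only if the local file matches the published digest.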

									
										210

docs/relnotes/11.2.2.html
									
										Normal file
									
												
				@@ -0,0 +1,210 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.2 Release Notes / May 9, 2016</h1>

				<p>

				Mesa 11.2.2 is a bug fix release which fixes bugs found since the 11.2.1 release.

				</p>

				<p>

				Mesa 11.2.2 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e2453014cd2cc5337a5180cdeffe8cf24fffbb83e20a96888e2b01df868eaae6  mesa-11.2.2.tar.gz

				40e148812388ec7c6d7b6657d5a16e2e8dabba8b97ddfceea5197947647bdfb4  mesa-11.2.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92850">Bug 92850</a> - Segfault loading War Thunder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93767">Bug 93767</a> - Glitches with soft shadows and MSAA in Knights of the Old Republic 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94955">Bug 94955</a> - Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94994">Bug 94994</a> - OSMesaGetProcAdress always fails on mangled OSMesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95026">Bug 95026</a> - Alien Isolation segfault after initial loading screen/video</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95133">Bug 95133</a> - X-COM Enemy Within crashes when entering tactical mission with Bonaire</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95164">Bug 95164</a> - GLSL compiler (linker I think) emits assertion upon call to glAttachShader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95251">Bug 95251</a> - vdpau decoder capabilities: not supported</li>

				</ul>

				<h2>Changes</h2>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>radeon/uvd: alignment fix for decode message buffer</li>

				</ul>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>st/mesa: fix sampler view leak in st_DrawAtlasBitmaps()</li>

				  <li>gallium/util: initialize pipe_framebuffer_state to zeros</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>dri: Fix robust context creation via EGL attribute</li>

				</ul>

				<p>Egbert Eich (1):</p>

				<ul>

				  <li>dri2: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.2.1</li>

				  <li>docs: update the sha256 checksums for 11.2.1</li>

				  <li>cherry-ignore: remove duplicate commit</li>

				  <li>cherry-ignore: ignore the GetSamplerParameterIuiv{EXT,OES} fixups</li>

				  <li>Update version to 11.2.2</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Fix subimage accesses to LT textures.</li>

				  <li>vc4: Add support for rendering to cube map surfaces.</li>

				  <li>vc4: Fix tests for format supported with nr_samples == 1.</li>

				  <li>vc4: Make sure we recompile when sample_mask changes.</li>

				</ul>

				<p>Frederic Devernay (1):</p>

				<ul>

				  <li>glapi: fix _glapi_get_proc_address() for mangled function names</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nvc0: fix retrieving query results into buffer for timestamps</li>

				  <li>nouveau/video: properly detect the decoder class for availability checks</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/fs: Properly report regs_written from SAMPLEINFO</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>egl/x11: authenticate before doing chipset id ioctls</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>winsys/sw/xlib: use correct free function for xlib_dt-&gt;data</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.</li>

				  <li>glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor.</li>

				  <li>glsl: Lower vector_extracts to swizzles after lower_vector_derefs.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/uvd: fix tonga feedback buffer size</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: fix blit-based GetTexImage for non-finalized textures</li>

				</ul>

				<p>Nicolai Hähnle (5):</p>

				<ul>

				  <li>gallium/radeon: handle failure when mapping staging buffer</li>

				  <li>st/glsl_to_tgsi: reduce stack explosion in recursive expression visitor</li>

				  <li>gallium/radeon: fix crash in r600_set_streamout_targets</li>

				  <li>radeonsi: correct NULL-pointer check in si_upload_const_buffer</li>

				  <li>radeonsi: work around an MSAA fast stencil clear problem</li>

				</ul>

				<p>Oded Gabbay (4):</p>

				<ul>

				  <li>r600g/radeonsi: send endian info to format translation functions</li>

				  <li>r600g: set endianess of 16/32-bit buffers according to do_endian_swap</li>

				  <li>r600g: use do_endian_swap in color swapping functions</li>

				  <li>r600g: use do_endian_swap in texture swapping function</li>

				</ul>

				<p>Patrick Rudolph (1):</p>

				<ul>

				  <li>r600g: fix and optimize tgsi_cmp when using ABS and NEG modifier</li>

				</ul>

				<p>Roland Scheidegger (3):</p>

				<ul>

				  <li>llvmpipe: (trivial) initialize src1_alpha var to NULL</li>

				  <li>gallivm: fix bogus argument order to lp_build_sample_mipmap function</li>

				  <li>gallivm: make sampling more robust against bogus coordinates</li>

				</ul>

				<p>Samuel Pitoiset (6):</p>

				<ul>

				  <li>gk110/ir: do not overwrite def value with zero for EXCH ops</li>

				  <li>gk110/ir: make use of IMUL32I for all immediates</li>

				  <li>nvc0/ir: fix wrong emission of (a OP b) OP c</li>

				  <li>gk110/ir: add emission for (a OP b) OP c</li>

				  <li>nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+</li>

				  <li>st/glsl_to_tgsi: fix potential crash when allocating temporaries</li>

				</ul>

				<p>Stefan Dirsch (1):</p>

				<ul>

				  <li>dri3: Check for dummyContext to see if the glx_context is valid</li>

				</ul>

				<p>Topi Pohjolainen (2):</p>

				<ul>

				  <li>i965/blorp/gen7: Prepare re-using for gen8</li>

				  <li>i965/blorp: Use 8k chunk size for urb allocation</li>

				</ul>

				<p>WuZhen (3):</p>

				<ul>

				  <li>tgsi: initialize stack allocated struct</li>

				  <li>winsys/sw/dri: use correct free function for dri_sw_dt-&gt;data</li>

				  <li>android: enable dlopen() on all architectures</li>

				</ul>

				</div>

				</body>

				</html>

									
										89

docs/relnotes/11.3.0.html
									
										Normal file
									
												
				@@ -0,0 +1,89 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.3.0 Release Notes / TBD</h1>

				<p>

				Mesa 11.3.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 11.3.1.

				</p>

				<p>

				Mesa 11.3.0 implements the OpenGL 4.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.3.  OpenGL

				4.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 4.3 on nvc0, radeonsi, i965 (Gen8+)</li>

				<li>OpenGL ES 3.1 on nvc0, radeonsi</li>

				<li>GL_ARB_ES3_1_compatibility on nvc0, radeonsi</li>

				<li>GL_ARB_compute_shader on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_cull_distance on i965/gen6+, nv50, nvc0, llvmpipe, softpipe</li>

				<li>GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe</li>

				<li>GL_ARB_internalformat_query2 on all drivers</li>

				<li>GL_ARB_query_buffer_object on i965/hsw+</li>

				<li>GL_ARB_robust_buffer_access_behavior on i965, nvc0, radeonsi</li>

				<li>GL_ARB_shader_atomic_counters on radeonsi, softpipe</li>

				<li>GL_ARB_shader_atomic_counter_ops on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_image_load_store on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_image_size on nvc0, radeonsi, softpipe</li>

				<li>GL_ARB_shader_storage_buffer_objects on radeonsi, softpipe</li>

				<li>GL_ATI_fragment_shader on all Gallium drivers</li>

				<li>GL_EXT_base_instance on all drivers that support GL_ARB_base_instance</li>

				<li>GL_EXT_clip_cull_distance on all drivers that support GL_ARB_cull_distance</li>

				<li>GL_KHR_robustness on i965</li>

				<li>GL_OES_copy_image on i965 (Baytrail and Gen8+)</li>

				<li>GL_OES_draw_buffers_indexed and GL_EXT_draw_buffers_indexed on all drivers that support GL_ARB_draw_buffers_blend</li>

				<li>GL_OES_gpu_shader5 and GL_EXT_gpu_shader5 on all drivers that support GL_ARB_gpu_shader5</li>

				<li>GL_OES_sample_shading on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_sample_variables on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_shader_image_atomic on all drivers that support GL_ARB_shader_image_load_store</li>

				<li>GL_OES_shader_io_blocks on i965, nvc0, radeonsi</li>

				<li>GL_OES_shader_multisample_interpolation on i965, nvc0, r600, radeonsi</li>

				<li>GL_OES_texture_border_clamp and GL_EXT_texture_border_clamp on all drivers that support GL_ARB_texture_border_clamp</li>

				<li>GL_OES_texture_buffer and GL_EXT_texture_buffer on i965, nvc0, radeonsi</li>

				<li>EGL_KHR_reusable_sync on all drivers</li>

				<li>GL_ARB_stencil_texture8 and GL_OES_stencil_texture8 on i965/gen8+</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				TBD.

				</div>

				</body>

				</html>

									
										34

docs/repository.html
									
												
				@@ -68,21 +68,39 @@ To get the Mesa sources anonymously (read-only):

				<h2 id="developer">Developer git Access</h2>

				<p>

				Mesa developers need to first have an account on

				<a href="http://www.freedesktop.org">freedesktop.org</a>.

				To get an account, please ask Brian or the other Mesa developers for

				permission.

				Then, if there are no objections, follow this

				<a href="http://www.freedesktop.org/wiki/AccountRequests">

				procedure</a>.

				If you wish to become a Mesa developer with git-write privilege, please

				follow this procedure:

				</p>

				<ol>

				<li>Subscribe to the

				<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>

				mailing list.

				<li>Start contributing to the project by posting patches / review requests to

				the mesa-dev list.  Specifically,

				<ul>

				<li>Use <code>git send-email</code> to post your patches to mesa-dev.

				<li>Wait for someone to review the code and give you a <code>Reviewed-by</code>

				statement.

				<li>You'll have to rely on another Mesa developer to push your initial patches

				after they've been reviewed.

				</ul>

				<li>After you've demonstrated the ability to write good code and have had

				a dozen or so patches accepted you can apply for an account.

				<li>Occasionally, but rarely, someone may be given a git account sooner, but

				only if they're being supervised by another Mesa developer at the same

				organization and planning to work in a limited area of the code or on a

				separate branch.

				<li>To apply for an account, follow

				<a href="http://www.freedesktop.org/wiki/AccountRequests">these directions</a>.

				It's also appreciated if you briefly describe what you intend to do (work

				on a particular driver, add a new extension, etc.) in the bugzilla record.

				</ol>

				<p>

				Once your account is established:

				</p>

				<ol>

				<li>Install the git software on your computer if needed.<br><br>

				<li>Get an initial, local copy of the repository with:

				    <pre>

				    git clone git+ssh://username@git.freedesktop.org/git/mesa/mesa

									
										45

docs/shading.html
									
												
				@@ -209,51 +209,6 @@ The final vertex and fragment programs may be interpreted in software

				(see drivers/dri/i915/i915_fragprog.c for example).

				</p>

				<h3>Code Generation Options</h3>

				<p>

				Internally, there are several options that control the compiler's code

				generation and instruction selection.

				These options are seen in the gl_shader_state struct and may be set

				by the device driver to indicate its preferences:

				<pre>

				struct gl_shader_state

				{

				   ...

				   /** Driver-selectable options: */

				   GLboolean EmitHighLevelInstructions;

				   GLboolean EmitCondCodes;

				   GLboolean EmitComments;

				};

				</pre>

				<dl>

				<dt>EmitHighLevelInstructions</dt>

				<dd>

				This option controls instruction selection for loops and conditionals.

				If the option is set high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK

				instructions will be emitted.

				Otherwise, those constructs will be implemented with BRA instructions.

				</dd>

				<dt>EmitCondCodes</dt>

				<dd>

				If set, condition codes (ala GL_NV_fragment_program) will be used for

				branching and looping.

				Otherwise, ordinary registers will be used (the IF instruction will

				examine the first operand's X component and do the if-part if non-zero).

				This option is only relevant if EmitHighLevelInstructions is set.

				</dd>

				<dt>EmitComments</dt>

				<dd>

				If set, instructions will be annotated with comments to help with debugging.

				Extra NOP instructions will also be inserted.

				</dd>

				</dl>

				<h2 id="validation">Compiler Validation</h2>

				<p>

									
										3

docs/systems.html
									
												
				@@ -34,8 +34,7 @@ Hardware drivers include:

				</p>

				<ul>

				  <li>Intel i965, i945, i915.

				    See <a href="http://intellinuxgraphics.org/index.html">

				      Intel's website</a></li>

				    See <a href="https://01.org/linuxgraphics">Intel's website</a></li>

				  <li>AMD Radeon series.

				  See <a href="http://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>

				  <li>NVIDIA GPUs.

									
										4

docs/thanks.html
									
												
				@@ -42,9 +42,7 @@ Tungsten Graphics, Inc. have supported the ongoing development of Mesa.

				<li>The

				<a href="http://www.mesa3d.org">Mesa</a>

				website is hosted by

				<a href="http://sourceforge.net">

				<img src="http://sourceforge.net/sflogo.php?group_id=3&amp;type=1"

				width="88" height="31" align="bottom" alt="Sourceforge.net" border="0"></a>

				<a href="http://sourceforge.net">sourceforge.net</a>.

				<br>

				<br>

									
										2

docs/utilities.html
									
												
				@@ -31,7 +31,7 @@

				  <dd>is a very useful tool for tracking down

				  memory-related problems in your code.</dd>

				  <dt><a href="http:scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a><dt>

				  <dd>provides static code analysis of Mesa.  If you create an account

				  you can see the results and try to fix outstanding issues.</dd>

				</dl>

8

doxygen/.gitignore vendored


@@ -1,9 +1,6 @@
 *.db
 *.tag
 *.tmp
 agpgart
 array_cache
 core
 core_subset
 gallium
 gbm
@@ -13,11 +10,8 @@ i965
 main
 math
 math_subset
 miniglx
 radeondrm
 radeonfb
 nir
 radeon_subset
 shader
 swrast
 swrast_setup
 tnl

									
										7

doxygen/Makefile
									
												
				@@ -12,20 +12,21 @@ FULL = \

					vbo.doxy \

					glapi.doxy \

					glsl.doxy \

					shader.doxy \

					swrast.doxy \

					swrast_setup.doxy \

					tnl.doxy \

					tnl_dd.doxy \

					gbm.doxy \

					i965.doxy

					i965.doxy \

					nir.doxy

				full: $(FULL:.doxy=.tag)

					$(foreach FILE,$(FULL),doxygen $(FILE);)

				SUBSET = \

					main.doxy \

					math.doxy

					math.doxy \

					gallium.doxy

				subset: $(SUBSET:.doxy=.tag)

					$(foreach FILE,$(SUBSET),doxygen $(FILE);)

51

doxygen/common.doxy


@@ -53,16 +53,6 @@ CREATE_SUBDIRS         = NO
 OUTPUT_LANGUAGE        = English
 # This tag can be used to specify the encoding used in the generated output.
 # The encoding is not always determined by the language that is chosen,
 # but also whether or not the output is meant for Windows or non-Windows users.
 # In case there is a difference, setting the USE_WINDOWS_ENCODING tag to YES
 # forces the Windows encoding (this is the default for the Windows binary),
 # whereas setting the tag to NO uses a Unix-style encoding (the default for
 # all platforms other than Windows).
 USE_WINDOWS_ENCODING   = NO
 # If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will
 # include brief member descriptions after the members that are listed in
 # the file and class documentation (similar to JavaDoc).
@@ -147,13 +137,6 @@ JAVADOC_AUTOBRIEF      = YES
 MULTILINE_CPP_IS_BRIEF = NO
 # If the DETAILS_AT_TOP tag is set to YES then Doxygen
 # will output the detailed description near the top, like JavaDoc.
 # If set to NO, the detailed description appears after the member
 # documentation.
 DETAILS_AT_TOP         = YES
 # If the INHERIT_DOCS tag is set to YES (the default) then an undocumented
 # member inherits the documentation from any documented member that it
 # re-implements.
@@ -607,12 +590,6 @@ HTML_FOOTER            =
 HTML_STYLESHEET        =
 # If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes,
 # files or namespaces will be aligned in HTML using tables. If set to
 # NO a bullet list will be used.
 HTML_ALIGN_MEMBERS     = YES
 # If the GENERATE_HTMLHELP tag is set to YES, additional index files
 # will be generated that can be used as input for tools like the
 # Microsoft HTML help workshop to generate a compressed HTML help file (.chm)
@@ -839,18 +816,6 @@ GENERATE_XML           = NO
 XML_OUTPUT             = xml
 # The XML_SCHEMA tag can be used to specify an XML schema,
 # which can be used by a validating XML parser to check the
 # syntax of the XML files.
 XML_SCHEMA             =
 # The XML_DTD tag can be used to specify an XML DTD,
 # which can be used by a validating XML parser to check the
 # syntax of the XML files.
 XML_DTD                =
 # If the XML_PROGRAMLISTING tag is set to YES Doxygen will
 # dump the program listings (including syntax highlighting
 # and cross-referencing information) to the XML output. Note that
@@ -1104,22 +1069,6 @@ DOT_PATH               =
 DOTFILE_DIRS           =
 # The MAX_DOT_GRAPH_WIDTH tag can be used to set the maximum allowed width
 # (in pixels) of the graphs generated by dot. If a graph becomes larger than
 # this value, doxygen will try to truncate the graph, so that it fits within
 # the specified constraint. Beware that most browsers cannot cope with very
 # large images.
 MAX_DOT_GRAPH_WIDTH    = 1024
 # The MAX_DOT_GRAPH_HEIGHT tag can be used to set the maximum allows height
 # (in pixels) of the graphs generated by dot. If a graph becomes larger than
 # this value, doxygen will try to truncate the graph, so that it fits within
 # the specified constraint. Beware that most browsers cannot cope with very
 # large images.
 MAX_DOT_GRAPH_HEIGHT   = 1024
 # The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the
 # graphs generated by dot. A depth value of 3 means that only nodes reachable
 # from the root by following a path via at most 3 edges will be shown. Nodes that

3

doxygen/core_subset.doxy

@@ -190,8 +190,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES		= \
 			 math_subset.tag=../math_subset \
 			 miniglx.tag=../miniglx
 			 math_subset.tag=../math_subset
 GENERATE_TAGFILE       = core_subset.tag
 ALLEXTERNALS           = NO
 PERL_PATH              =

9

doxygen/doxy.bat

@@ -6,7 +6,9 @@ doxygen swrast_setup.doxy
 doxygen tnl.doxy
 doxygen core.doxy
 doxygen glapi.doxy
 doxygen shader.doxy
 doxygen glsl.doxy
 doxygen nir.doxy
 doxygen i965.doxy
 echo Building again, to resolve tags
 doxygen tnl_dd.doxy
@@ -15,5 +17,8 @@ doxygen math.doxy
 doxygen swrast.doxy
 doxygen swrast_setup.doxy
 doxygen tnl.doxy
 doxygen core.doxy
 doxygen glapi.doxy
 doxygen shader.doxy
 doxygen glsl.doxy
 doxygen nir.doxy
 doxygen i965.doxy

6

doxygen/gbm.doxy

@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast_setup.tag=../gbm_setup \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = gbm.tag

8

doxygen/glapi.doxy

@@ -9,7 +9,7 @@ PROJECT_NAME           = "Mesa GL API dispatcher"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/mesa/glapi/
 INPUT                  = ../src/mapi/glapi/
 FILE_PATTERNS          = *.c *.h
 RECURSIVE              = NO
 EXCLUDE                =
@@ -39,11 +39,11 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
 GENERATE_TAGFILE       = swrast.tag
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = glapi.tag

9

doxygen/glsl.doxy

@@ -9,11 +9,12 @@ PROJECT_NAME           = "Mesa GLSL module"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/glsl/
 INPUT                  = ../src/compiler/glsl/
 FILE_PATTERNS          = *.c *.cpp *.h
 RECURSIVE              = NO
 EXCLUDE                = ../src/glsl/glsl_lexer.cpp \
                          ../src/glsl/glsl_parser.cpp \
                          ../src/glsl/glsl_parser.h
 EXCLUDE                = ../src/compiler/glsl/glsl_lexer.cpp \
                          ../src/compiler/glsl/glsl_parser.cpp \
                          ../src/compiler/glsl/glsl_parser.h
 EXCLUDE_PATTERNS       =
 #---------------------------------------------------------------------------
 # configuration options related to the HTML output

									
										2

doxygen/header.html
									
				@@ -8,9 +8,9 @@

				<a class="qindex" href="../main/index.html">core</a> |

				<a class="qindex" href="../glapi/index.html">glapi</a> |

				<a class="qindex" href="../glsl/index.html">glsl</a> |

				<a class="qindex" href="../nir/index.html">nir</a> |

				<a class="qindex" href="../vbo/index.html">vbo</a> |

				<a class="qindex" href="../math/index.html">math</a> |

				<a class="qindex" href="../shader/index.html">shader</a> |

				<a class="qindex" href="../swrast/index.html">swrast</a> |

				<a class="qindex" href="../swrast_setup/index.html">swrast_setup</a> |

				<a class="qindex" href="../tnl/index.html">tnl</a> |

									
										1

doxygen/header_subset.html
									
				@@ -6,6 +6,5 @@

				<div class="qindex">

				<a class="qindex" href="../core_subset/index.html">Mesa Core</a> |

				<a class="qindex" href="../math_subset/index.html">math</a> |

				<a class="qindex" href="../miniglx/index.html">MiniGLX</a> |

				<a class="qindex" href="../radeon_subset/index.html">radeon_subset</a>

				</div>

2

doxygen/i965.doxy

@@ -46,5 +46,5 @@ TAGFILES               = glsl.tag=../glsl \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          tnl_dd.tag=../tnl_dd \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = i965.tag

1

doxygen/main.doxy

@@ -43,7 +43,6 @@ TAGFILES		= tnl_dd.tag=../tnl_dd \
 			 vbo.tag=../vbo \
                          glapi.tag=../glapi \
                          math.tag=../math \
                          shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl

2

doxygen/math.doxy

@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../core \
                          main.tag=../main \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \

43

doxygen/shader.doxy → doxygen/nir.doxy

@@ -5,45 +5,46 @@
 #---------------------------------------------------------------------------
 # General configuration options
 #---------------------------------------------------------------------------
 PROJECT_NAME           = "Mesa Vertex and Fragment Program code"
 PROJECT_NAME           = "Mesa NIR module"
 #---------------------------------------------------------------------------
 # configuration options related to the input files
 # Configuration options related to the input files
 #---------------------------------------------------------------------------
 INPUT                  = ../src/mesa/shader/
 FILE_PATTERNS          = *.c *.h
 INPUT                  = ../src/compiler/nir
 FILE_PATTERNS          = *.c *.cpp *.h
 RECURSIVE              = NO
 EXCLUDE                =
 EXCLUDE_PATTERNS       =
 EXAMPLE_PATH           =
 EXAMPLE_PATTERNS       =
 EXCLUDE                =
 EXCLUDE_PATTERNS       =
 EXAMPLE_PATH           =
 EXAMPLE_PATTERNS       =
 EXAMPLE_RECURSIVE      = NO
 IMAGE_PATH             =
 INPUT_FILTER           =
 IMAGE_PATH             =
 INPUT_FILTER           =
 FILTER_SOURCE_FILES    = NO
 #---------------------------------------------------------------------------
 # configuration options related to the HTML output
 # Configuration options related to the HTML output
 #---------------------------------------------------------------------------
 HTML_OUTPUT            = shader
 HTML_OUTPUT            = nir
 #---------------------------------------------------------------------------
 # Configuration options related to the preprocessor
 # Configuration options related to the preprocessor
 #---------------------------------------------------------------------------
 ENABLE_PREPROCESSING   = YES
 MACRO_EXPANSION        = NO
 EXPAND_ONLY_PREDEF     = NO
 SEARCH_INCLUDES        = YES
 INCLUDE_PATH           = ../include/
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      =
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      =
 SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 # Configuration::additions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = glsl.tag=../glsl \
                          main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
 GENERATE_TAGFILE       = swrast.tag
                          tnl_dd.tag=../tnl_dd \
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = nir.tag

3

doxygen/radeon_subset.doxy

@@ -168,8 +168,7 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 TAGFILES		= \
 			 core_subset.tag=../core_subset \
                          math_subset.tag=../math_subset \
                          miniglx.tag=../miniglx
                          math_subset.tag=../math_subset
 GENERATE_TAGFILE       = radeon_subset.tag
 ALLEXTERNALS           = NO
 PERL_PATH              =

4

doxygen/swrast.doxy

@@ -39,10 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
                          tnl_dd.tag=../tnl_dd \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = swrast.tag

2

doxygen/swrast_setup.doxy

@@ -41,7 +41,7 @@ SKIP_FUNCTION_MACROS   = YES
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../core \
                          main.tag=../main \
                          math.tag=../math \
                          swrast.tag=../swrast \
                          tnl.tag=../tnl \

9

doxygen/tnl.doxy

@@ -40,11 +40,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = tnl_dd.tag=../tnl \
                          main.tag=../core \
 TAGFILES               = tnl_dd.tag=../tnl_dd \
                          main.tag=../main \
                          math.tag=../math \
                          shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=swrast_setup \
                          vbo.tag=vbo
                          swrast_setup.tag=../swrast_setup \
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = tnl.tag

5

doxygen/tnl_dd.doxy

@@ -39,11 +39,10 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
                          math.tag=../math \
 			 shader.tag=../shader \
                          swrast.tag=../swrast \
                          swrast_setup.tag=../swrast_setup \
                          tnl.tag=../tnl \
                          vbo.tag=vbo
                          vbo.tag=../vbo
 GENERATE_TAGFILE       = tnl_dd.tag

3

doxygen/vbo.doxy

@@ -40,9 +40,8 @@ SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::addtions related to external references
 #---------------------------------------------------------------------------
 TAGFILES               = main.tag=../core \
 TAGFILES               = main.tag=../main \
 			 math.tag=../math \
                          shader.tag=../shader \
 			 swrast.tag=../swrast \
 			 swrast_setup.tag=../swrast_setup \
 			 tnl.tag=../tnl \

									
										10

include/D3D9/d3d9.h
									
				@@ -260,7 +260,7 @@ struct IDirect3DDevice9 : public IUnknown

					virtual HRESULT WINAPI SetStreamSourceFreq(UINT StreamNumber, UINT Setting) = 0;

					virtual HRESULT WINAPI GetStreamSourceFreq(UINT StreamNumber, UINT *pSetting) = 0;

					virtual HRESULT WINAPI SetIndices(IDirect3DIndexBuffer9 *pIndexData) = 0;

					virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex) = 0;

					virtual HRESULT WINAPI GetIndices(IDirect3DIndexBuffer9 **ppIndexData) = 0;

					virtual HRESULT WINAPI CreatePixelShader(const DWORD *pFunction, IDirect3DPixelShader9 **ppShader) = 0;

					virtual HRESULT WINAPI SetPixelShader(IDirect3DPixelShader9 *pShader) = 0;

					virtual HRESULT WINAPI GetPixelShader(IDirect3DPixelShader9 **ppShader) = 0;

				@@ -848,7 +848,7 @@ typedef struct IDirect3DDevice9Vtbl

					HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT Setting);

					HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9 *This, UINT StreamNumber, UINT *pSetting);

					HRESULT (WINAPI *SetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 *pIndexData);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9 *This, IDirect3DIndexBuffer9 **ppIndexData);

					HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9 *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);

					HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 *pShader);

					HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9 *This, IDirect3DPixelShader9 **ppShader);

				@@ -975,7 +975,7 @@ struct IDirect3DDevice9

				#define IDirect3DDevice9_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)

				#define IDirect3DDevice9_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)

				#define IDirect3DDevice9_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)

				#define IDirect3DDevice9_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)

				#define IDirect3DDevice9_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)

				#define IDirect3DDevice9_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)

				@@ -1099,7 +1099,7 @@ typedef struct IDirect3DDevice9ExVtbl

					HRESULT (WINAPI *SetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT Setting);

					HRESULT (WINAPI *GetStreamSourceFreq)(IDirect3DDevice9Ex *This, UINT StreamNumber, UINT *pSetting);

					HRESULT (WINAPI *SetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 *pIndexData);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData, UINT *pBaseVertexIndex);

					HRESULT (WINAPI *GetIndices)(IDirect3DDevice9Ex *This, IDirect3DIndexBuffer9 **ppIndexData);

					HRESULT (WINAPI *CreatePixelShader)(IDirect3DDevice9Ex *This, const DWORD *pFunction, IDirect3DPixelShader9 **ppShader);

					HRESULT (WINAPI *SetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 *pShader);

					HRESULT (WINAPI *GetPixelShader)(IDirect3DDevice9Ex *This, IDirect3DPixelShader9 **ppShader);

				@@ -1242,7 +1242,7 @@ struct IDirect3DDevice9Ex

				#define IDirect3DDevice9Ex_SetStreamSourceFreq(p,a,b) (p)->lpVtbl->SetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9Ex_GetStreamSourceFreq(p,a,b) (p)->lpVtbl->GetStreamSourceFreq(p,a,b)

				#define IDirect3DDevice9Ex_SetIndices(p,a) (p)->lpVtbl->SetIndices(p,a)

				#define IDirect3DDevice9Ex_GetIndices(p,a,b) (p)->lpVtbl->GetIndices(p,a,b)

				#define IDirect3DDevice9Ex_GetIndices(p,a) (p)->lpVtbl->GetIndices(p,a)

				#define IDirect3DDevice9Ex_CreatePixelShader(p,a,b) (p)->lpVtbl->CreatePixelShader(p,a,b)

				#define IDirect3DDevice9Ex_SetPixelShader(p,a) (p)->lpVtbl->SetPixelShader(p,a)

				#define IDirect3DDevice9Ex_GetPixelShader(p,a) (p)->lpVtbl->GetPixelShader(p,a)

									
										21

include/D3D9/d3d9types.h
									
				@@ -173,16 +173,16 @@ typedef struct _RGNDATA {

				#define D3DPRESENTFLAG_RESTRICTED_CONTENT              0x00000400

				#define D3DPRESENTFLAG_RESTRICT_SHARED_RESOURCE_DRIVER 0x00000800

				#ifdef WINAPI

				#undef WINAPI

				#endif /* WINAPI*/

				#if defined(__x86_64__) || defined(_M_X64)

				#define WINAPI __attribute__((ms_abi))

				#else /* x86_64 */

				#define WINAPI __attribute__((__stdcall__))

				#endif /* x86_64 */

				/* Windows calling convention */

				#ifndef WINAPI

				  #if defined(__x86_64__) && !defined(__ILP32__)

				    #define WINAPI __attribute__((ms_abi))

				  #elif defined(__i386__)

				    #define WINAPI __attribute__((__stdcall__))

				  #else /* neither amd64 nor i386 */

				    #define WINAPI

				  #endif

				#endif /* WINAPI */

				/* Implementation caps */

				#define D3DPRESENT_BACK_BUFFERS_MAX    3

				@@ -227,6 +227,7 @@ typedef struct _RGNDATA {

				#define D3DERR_DRIVERINVALIDCALL         MAKE_D3DHRESULT(2157)

				#define D3DERR_DEVICEREMOVED             MAKE_D3DHRESULT(2160)

				#define D3DERR_DEVICEHUNG                MAKE_D3DHRESULT(2164)

				#define S_PRESENT_OCCLUDED               MAKE_D3DSTATUS(2168)

				/********************************************************

				 * Bitmasks                                             *
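
The WINAPI guard in the first hunk of this file can be tried out in isolation. The sketch below copies the `#ifndef WINAPI` block as shown in the diff and applies it to a function pointer type. The names `add_fn`, `add_impl` and `call_add` are hypothetical and exist only for illustration, and the block assumes a GCC- or Clang-compatible compiler because it relies on `__attribute__`.

```c
#include <assert.h>
#include <stddef.h>

/* Guard copied from the hunk: keep an existing WINAPI definition,
 * otherwise choose a calling convention from the target architecture. */
#ifndef WINAPI
  #if defined(__x86_64__) && !defined(__ILP32__)
    #define WINAPI __attribute__((ms_abi))
  #elif defined(__i386__)
    #define WINAPI __attribute__((__stdcall__))
  #else /* neither amd64 nor i386 */
    #define WINAPI
  #endif
#endif /* WINAPI */

/* Hypothetical callback type using the convention, in the same style as
 * the vtable entries in d3d9.h. */
typedef int (WINAPI *add_fn)(int a, int b);

/* Hypothetical implementation declared with the same convention. */
static int WINAPI add_impl(int a, int b)
{
   return a + b;
}

/* Calls through the pointer, so caller and callee must agree on WINAPI. */
int call_add(int a, int b)
{
   add_fn fn = add_impl;
   assert(fn != NULL);
   return fn(a, b);
}
```

On targets that are neither amd64 nor i386 the macro expands to nothing, so the same source still compiles and uses the platform's default convention.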

									
										11

include/EGL/eglmesaext.h
									
				@@ -34,17 +34,6 @@ extern "C" {

				#include <EGL/eglplatform.h>

				#ifndef EGL_MESA_drm_display

				#define EGL_MESA_drm_display 1

				#ifdef EGL_EGLEXT_PROTOTYPES

				EGLAPI EGLDisplay EGLAPIENTRY eglGetDRMDisplayMESA(int fd);

				#endif /* EGL_EGLEXT_PROTOTYPES */

				typedef EGLDisplay (EGLAPIENTRYP PFNEGLGETDRMDISPLAYMESA) (int fd);

				#endif /* EGL_MESA_drm_display */

				#ifdef EGL_MESA_drm_image

				/* Mesa's extension to EGL_MESA_drm_image... */

				#ifndef EGL_DRM_BUFFER_USE_CURSOR_MESA

									
										70

include/GL/internal/dri_interface.h
									
				@@ -79,6 +79,7 @@ typedef struct __DRIdri2LoaderExtensionRec	__DRIdri2LoaderExtension;

				typedef struct __DRI2flushExtensionRec	__DRI2flushExtension;

				typedef struct __DRI2throttleExtensionRec	__DRI2throttleExtension;

				typedef struct __DRI2fenceExtensionRec          __DRI2fenceExtension;

				typedef struct __DRI2interopExtensionRec	__DRI2interopExtension;

				typedef struct __DRIimageLoaderExtensionRec     __DRIimageLoaderExtension;

				@@ -392,6 +393,31 @@ struct __DRI2fenceExtensionRec {

				};

				/**

				 * Extension for API interop.

				 * See GL/mesa_glinterop.h.

				 */

				#define __DRI2_INTEROP "DRI2_Interop"

				#define __DRI2_INTEROP_VERSION 1

				struct mesa_glinterop_device_info;

				struct mesa_glinterop_export_in;

				struct mesa_glinterop_export_out;

				struct __DRI2interopExtensionRec {

				   __DRIextension base;

				   /** Same as MesaGLInterop*QueryDeviceInfo. */

				   int (*query_device_info)(__DRIcontext *ctx,

				                            struct mesa_glinterop_device_info *out);

				   /** Same as MesaGLInterop*ExportObject. */

				   int (*export_object)(__DRIcontext *ctx,

				                        struct mesa_glinterop_export_in *in,

				                        struct mesa_glinterop_export_out *out);

				};

				/*@}*/

				/**

				@@ -1068,7 +1094,7 @@ struct __DRIdri2ExtensionRec {

				 * extensions.

				 */

				#define __DRI_IMAGE "DRI_IMAGE"

				#define __DRI_IMAGE_VERSION 11

				#define __DRI_IMAGE_VERSION 12

				/**

				 * These formats correspond to the similarly named MESA_FORMAT_*

				@@ -1100,8 +1126,18 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_USE_SCANOUT		0x0002

				#define __DRI_IMAGE_USE_CURSOR		0x0004 /* Depricated */

				#define __DRI_IMAGE_USE_LINEAR		0x0008

				/* The buffer will only be read by an external process after SwapBuffers,

				 * in contrary to gbm buffers, front buffers and fake front buffers, which

				 * could be read after a flush."

				 */

				#define __DRI_IMAGE_USE_BACKBUFFER      0x0010

				#define __DRI_IMAGE_TRANSFER_READ            0x1

				#define __DRI_IMAGE_TRANSFER_WRITE           0x2

				#define __DRI_IMAGE_TRANSFER_READ_WRITE      \

				        (__DRI_IMAGE_TRANSFER_READ | __DRI_IMAGE_TRANSFER_WRITE)

				/**

				 * Four CC formats that matches with WL_DRM_FORMAT_* from wayland_drm.h,

				 * GBM_FORMAT_* from gbm.h, and DRM_FORMAT_* from drm_fourcc.h. Used with

				@@ -1127,6 +1163,11 @@ struct __DRIdri2ExtensionRec {

				#define __DRI_IMAGE_FOURCC_NV16		0x3631564e

				#define __DRI_IMAGE_FOURCC_YUYV		0x56595559

				#define __DRI_IMAGE_FOURCC_YVU410	0x39555659

				#define __DRI_IMAGE_FOURCC_YVU411	0x31315659

				#define __DRI_IMAGE_FOURCC_YVU420	0x32315659

				#define __DRI_IMAGE_FOURCC_YVU422	0x36315659

				#define __DRI_IMAGE_FOURCC_YVU444	0x34325659

				/**

				 * Queryable on images created by createImageFromNames.

				@@ -1350,6 +1391,33 @@ struct __DRIimageExtensionRec {

				    * \since 10

				    */

				   int (*getCapabilities)(__DRIscreen *screen);

				   /**

				    * Returns a map of the specified region of a __DRIimage for the specified usage.

				    *

				    * flags may include __DRI_IMAGE_TRANSFER_READ, which will populate the

				    * mapping with the current buffer content. If __DRI_IMAGE_TRANSFER_READ

				    * is not included in the flags, the buffer content at map time is

				    * undefined. Users wanting to modify the mapping must include

				    * __DRI_IMAGE_TRANSFER_WRITE; if __DRI_IMAGE_TRANSFER_WRITE is not

				    * included, behaviour when writing the mapping is undefined.

				    *

				    * Returns the byte stride in *stride, and an opaque pointer to data

				    * tracking the mapping in **data, which must be passed to unmapImage().

				    *

				    * \since 12

				    */

				   void *(*mapImage)(__DRIcontext *context, __DRIimage *image,

				                     int x0, int y0, int width, int height,

				                     unsigned int flags, int *stride, void **data);

				   /**

				    * Unmap a previously mapped __DRIimage

				    *

				    * \since 12

				    */

				   void (*unmapImage)(__DRIcontext *context, __DRIimage *image, void *data);

				};
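
The map/unmap contract documented above (a pointer to the mapped region, a byte stride, and an opaque tracking pointer that must be handed back on unmap) can be sketched without a real driver. Everything below is a hypothetical stand-in: `struct fake_image`, `fake_map_image`, `fake_unmap_image` and `fill_rect` are not Mesa APIs, the image is a plain CPU buffer with an assumed 4 bytes per pixel, and the context argument is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TRANSFER_READ  0x1 /* mirrors __DRI_IMAGE_TRANSFER_READ */
#define TRANSFER_WRITE 0x2 /* mirrors __DRI_IMAGE_TRANSFER_WRITE */

/* Hypothetical stand-in for a __DRIimage: a CPU buffer, 4 bytes per pixel. */
struct fake_image {
   int width, height, stride;
   unsigned char *pixels;
};

/* Hypothetical per-mapping bookkeeping returned through the data pointer. */
struct fake_map {
   unsigned flags;
};

/* Same parameter order as mapImage, minus the context argument. */
static void *fake_map_image(struct fake_image *img, int x0, int y0,
                            int width, int height, unsigned flags,
                            int *stride, void **data)
{
   if (x0 < 0 || y0 < 0 || width <= 0 || height <= 0 ||
       x0 + width > img->width || y0 + height > img->height)
      return NULL;

   struct fake_map *m = malloc(sizeof *m);
   if (!m)
      return NULL;

   m->flags = flags;
   *stride = img->stride;
   *data = m;
   return img->pixels + (size_t)y0 * img->stride + (size_t)x0 * 4;
}

/* Releases the bookkeeping that fake_map_image handed out. */
static void fake_unmap_image(struct fake_image *img, void *data)
{
   (void)img;
   free(data);
}

/* Fills a rectangle with one byte value; returns 0 on success, -1 on failure. */
int fill_rect(struct fake_image *img, int x0, int y0, int w, int h,
              unsigned char value)
{
   int stride = 0;
   void *map_data = NULL;

   assert(img != NULL);
   /* Write-only mapping: per the comment above, contents at map time
    * would be undefined without the READ flag. */
   unsigned char *ptr = fake_map_image(img, x0, y0, w, h, TRANSFER_WRITE,
                                       &stride, &map_data);
   if (!ptr)
      return -1;

   for (int row = 0; row < h; row++)
      memset(ptr + (size_t)row * stride, value, (size_t)w * 4);

   fake_unmap_image(img, map_data);
   return 0;
}
```

The caller steps between rows with the returned stride rather than `width * 4`, because the mapped region is generally a window into a wider buffer.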

									
										304

include/GL/mesa_glinterop.h
									
										Normal file
									
				@@ -0,0 +1,304 @@

				/*

				 * Mesa 3-D graphics library

				 *

				 * Copyright 2016 Advanced Micro Devices, Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS

				 * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR

				 * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,

				 * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR

				 * OTHER DEALINGS IN THE SOFTWARE.

				 */

				/* Mesa OpenGL inter-driver interoperability interface designed for but not

				 * limited to OpenCL.

				 *

				 * This is a driver-agnostic, backward-compatible interface. The structures

				 * are only allowed to grow. They can never shrink and their members can

				 * never be removed, renamed, or redefined.

				 *

				 * The interface doesn't return a lot of static texture parameters like

				 * width, height, etc. It mainly returns mutable buffer and texture view

				 * parameters that can't be part of the texture allocation (because they are

				 * mutable). If drivers want to return more data or want to return static

				 * allocation parameters, they can do it in one of these two ways:

				 * - attaching the data to the DMABUF handle in a driver-specific way

				 * - passing the data via "out_driver_data" in the "in" structure.

				 *

				 * Mesa is expected to do a lot of error checking on behalf of OpenCL, such

				 * as checking the target, miplevel, and texture completeness.

				 *

				 * OpenCL, on the other hand, needs to check if the display+context combo

				 * is compatible with the OpenCL driver by querying the device information.

				 * It also needs to check if the texture internal format and channel ordering

				 * (returned in a driver-specific way) is supported by OpenCL, among other

				 * things.

				 */

				#ifndef MESA_GLINTEROP_H

				#define MESA_GLINTEROP_H

				#include <stddef.h>

				#include <stdint.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/* Forward declarations to avoid inclusion of GL/glx.h */

				typedef struct _XDisplay Display;

				typedef struct __GLXcontextRec *GLXContext;

				/* Forward declarations to avoid inclusion of EGL/egl.h */

				typedef void *EGLDisplay;

				typedef void *EGLContext;

				/** Returned error codes. */

				enum {

				   MESA_GLINTEROP_SUCCESS = 0,

				   MESA_GLINTEROP_OUT_OF_RESOURCES,

				   MESA_GLINTEROP_OUT_OF_HOST_MEMORY,

				   MESA_GLINTEROP_INVALID_OPERATION,

				   MESA_GLINTEROP_INVALID_VERSION,

				   MESA_GLINTEROP_INVALID_DISPLAY,

				   MESA_GLINTEROP_INVALID_CONTEXT,

				   MESA_GLINTEROP_INVALID_TARGET,

				   MESA_GLINTEROP_INVALID_OBJECT,

				   MESA_GLINTEROP_INVALID_MIP_LEVEL,

				   MESA_GLINTEROP_UNSUPPORTED

				};

				/** Access flags. */

				enum {

				   MESA_GLINTEROP_ACCESS_READ_WRITE = 0,

				   MESA_GLINTEROP_ACCESS_READ_ONLY,

				   MESA_GLINTEROP_ACCESS_WRITE_ONLY

				};

				#define MESA_GLINTEROP_DEVICE_INFO_VERSION 1

				/**

				 * Device information returned by Mesa.

				 */

				struct mesa_glinterop_device_info {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_DEVICE_INFO_VERSION macro */

				   uint32_t version;

				   /* PCI location */

				   uint32_t pci_segment_group;

				   uint32_t pci_bus;

				   uint32_t pci_device;

				   uint32_t pci_function;

				   /* Device identification */

				   uint32_t vendor_id;

				   uint32_t device_id;

				   /* Structure version 1 ends here. */

				};

				#define MESA_GLINTEROP_EXPORT_IN_VERSION 1

				/**

				 * Input parameters to Mesa interop export functions.

				 */

				struct mesa_glinterop_export_in {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_IN_VERSION macro */

				   uint32_t version;

				   /* One of the following:

				    * - GL_TEXTURE_BUFFER

				    * - GL_TEXTURE_1D

				    * - GL_TEXTURE_2D

				    * - GL_TEXTURE_3D

				    * - GL_TEXTURE_RECTANGLE

				    * - GL_TEXTURE_1D_ARRAY

				    * - GL_TEXTURE_2D_ARRAY

				    * - GL_TEXTURE_CUBE_MAP_ARRAY

				    * - GL_TEXTURE_CUBE_MAP

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_X

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_X

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_Y

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_Y

				    * - GL_TEXTURE_CUBE_MAP_POSITIVE_Z

				    * - GL_TEXTURE_CUBE_MAP_NEGATIVE_Z

				    * - GL_TEXTURE_2D_MULTISAMPLE

				    * - GL_TEXTURE_2D_MULTISAMPLE_ARRAY

				    * - GL_TEXTURE_EXTERNAL_OES

				    * - GL_RENDERBUFFER

				    * - GL_ARRAY_BUFFER

				    */

				   unsigned target;

				   /* If target is GL_ARRAY_BUFFER, it's a buffer object.

				    * If target is GL_RENDERBUFFER, it's a renderbuffer object.

				    * If target is GL_TEXTURE_*, it's a texture object.

				    */

				   unsigned obj;

				   /* Mipmap level. Ignored for non-texture objects. */

				   unsigned miplevel;

				   /* One of MESA_GLINTEROP_ACCESS_* flags. This describes how the exported

				    * object is going to be used.

				    */

				   uint32_t access;

				   /* Size of memory pointed to by out_driver_data. */

				   uint32_t out_driver_data_size;

				   /* If the caller wants to query driver-specific data about the OpenGL

				    * object, this should point to the memory where that data will be stored.

				    * This is expected to be a temporary staging memory. The pointer is not

				    * allowed to be saved for later use by Mesa.

				    */

				   void *out_driver_data;

				   /* Structure version 1 ends here. */

				};

				#define MESA_GLINTEROP_EXPORT_OUT_VERSION 1

				/**

				 * Outputs of Mesa interop export functions.

				 */

				struct mesa_glinterop_export_out {

				   /* The caller should set this to the version of the struct they support */

				   /* The callee will overwrite it if it supports a lower version.

				    *

				    * The caller should check the value and access up-to the version supported

				    * by the the callee.

				    */

				   /* NOTE: Do not use the MESA_GLINTEROP_EXPORT_OUT_VERSION macro */

				   uint32_t version;

				   /* The DMABUF handle. It must be closed by the caller using the POSIX

				    * close() function when it's not needed anymore. Mesa is not responsible

				    * for closing the handle.

				    *

				    * Not closing the handle by the caller will lead to a resource leak,

				    * will prevent releasing the GPU buffer, and may prevent creating new

				    * DMABUF handles within the process.

				    */

				   int dmabuf_fd;

				   /* The mutable OpenGL internal format specified by glTextureView or

				    * glTexBuffer. If the object is not one of those, the original internal

				    * format specified by glTexStorage, glTexImage, or glRenderbufferStorage

				    * will be returned.

				    */

				   unsigned internal_format;

				   /* Buffer offset and size for GL_ARRAY_BUFFER and GL_TEXTURE_BUFFER.

				    * This allows interop with suballocations (a buffer allocated within

				    * a larger buffer).

				    *

				    * Parameters specified by glTexBufferRange for GL_TEXTURE_BUFFER are

				    * applied to these and can shrink the range further.

				    */

				   ptrdiff_t buf_offset;

				   ptrdiff_t buf_size;

				   /* Parameters specified by glTextureView. If the object is not a texture

				    * view, default parameters covering the whole texture will be returned.

				    */

				   unsigned view_minlevel;

				   unsigned view_numlevels;

				   unsigned view_minlayer;

				   unsigned view_numlayers;

				   /* The number of bytes written to out_driver_data. */

				   uint32_t out_driver_data_written;

				   /* Structure version 1 ends here. */

				};

				/**

				 * Query device information.

				 *

				 * \param dpy        GLX display

				 * \param context    GLX context

				 * \param out        where to return the information

				 *

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXQueryDeviceInfo(Display *dpy, GLXContext context,

				                                struct mesa_glinterop_device_info *out);

				/**

				 * Same as MesaGLInteropGLXQueryDeviceInfo except that it accepts EGLDisplay

				 * and EGLContext.

				 */

				int

				MesaGLInteropEGLQueryDeviceInfo(EGLDisplay dpy, EGLContext context,

				                                struct mesa_glinterop_device_info *out);

				/**

				 * Create and return a DMABUF handle corresponding to the given OpenGL

				 * object, and return other parameters about the OpenGL object.

				 *

				 * \param dpy        GLX display

				 * \param context    GLX context

				 * \param in         input parameters

				 * \param out        return values

				 *

				 * \return MESA_GLINTEROP_SUCCESS or MESA_GLINTEROP_* != 0 on error

				 */

				int

				MesaGLInteropGLXExportObject(Display *dpy, GLXContext context,

				                             struct mesa_glinterop_export_in *in,

				                             struct mesa_glinterop_export_out *out);

				/**

				 * Same as MesaGLInteropGLXExportObject except that it accepts

				 * EGLDisplay and EGLContext.

				 */

				int

				MesaGLInteropEGLExportObject(EGLDisplay dpy, EGLContext context,

				                             struct mesa_glinterop_export_in *in,

				                             struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPGLXQUERYDEVICEINFOPROC)(Display *dpy, GLXContext context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPEGLQUERYDEVICEINFOPROC)(EGLDisplay dpy, EGLContext context,

				                                                     struct mesa_glinterop_device_info *out);

				typedef int (PFNMESAGLINTEROPGLXEXPORTOBJECTPROC)(Display *dpy, GLXContext context,

				                                                  struct mesa_glinterop_export_in *in,

				                                                  struct mesa_glinterop_export_out *out);

				typedef int (PFNMESAGLINTEROPEGLEXPORTOBJECTPROC)(EGLDisplay dpy, EGLContext context,

				                                                  struct mesa_glinterop_export_in *in,

				                                                  struct mesa_glinterop_export_out *out);

				#ifdef __cplusplus

				}

				#endif

				#endif /* MESA_GLINTEROP_H */
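
The version handshake described in the mesa_glinterop_export_out comments can be sketched as follows. This is a minimal, self-contained illustration and not Mesa code: `demo_export_out`, `demo_export`, `DEMO_CALLEE_VERSION` and the `extra_v2` field are hypothetical stand-ins, and no real DMABUF handle is created.

```c
#include <stdint.h>

/* Hypothetical stand-in for a versioned output struct (not the Mesa type). */
struct demo_export_out {
   uint32_t version;   /* in: version the caller supports; out: version filled in */
   int dmabuf_fd;      /* pretend "version 1" field */
   uint32_t extra_v2;  /* imaginary field that only exists from "version 2" on */
};

/* The imaginary callee only knows version 1 of the struct. */
#define DEMO_CALLEE_VERSION 1u

/* Imaginary callee: lowers out->version when the caller asks for more than
 * it supports, then fills only the fields that belong to that version. */
static int demo_export(struct demo_export_out *out)
{
   if (out->version == 0)
      return 1; /* nonzero means error, mirroring the "!= 0 on error" rule */

   if (out->version > DEMO_CALLEE_VERSION)
      out->version = DEMO_CALLEE_VERSION;

   out->dmabuf_fd = -1; /* placeholder; a real callee would return an fd */
   if (out->version >= 2)
      out->extra_v2 = 0;

   return 0;
}

/* Caller side: read newer fields only if the callee actually filled them. */
static int demo_can_read_v2(const struct demo_export_out *out)
{
   return out->version >= 2;
}
```

A caller that supports version 2 would set `version = 2` before the call and then consult `demo_can_read_v2()` before touching `extra_v2`. With the real API, the returned `dmabuf_fd` would additionally have to be released with `close()` as the header comment requires.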

									
										45

include/GL/osmesa.h
									
												
				@@ -58,8 +58,8 @@ extern "C" {

				#include <GL/gl.h>

				#define OSMESA_MAJOR_VERSION 10

				#define OSMESA_MINOR_VERSION 0

				#define OSMESA_MAJOR_VERSION 11

				#define OSMESA_MINOR_VERSION 2

				#define OSMESA_PATCH_VERSION 0

				@@ -95,6 +95,18 @@ extern "C" {

				#define OSMESA_MAX_WIDTH	0x24  /* new in 4.0 */

				#define OSMESA_MAX_HEIGHT	0x25  /* new in 4.0 */

				/*

				 * Accepted in the attribute list of OSMesaCreateContextAttribs.

				 */

				#define OSMESA_DEPTH_BITS            0x30

				#define OSMESA_STENCIL_BITS          0x31

				#define OSMESA_ACCUM_BITS            0x32

				#define OSMESA_PROFILE               0x33

				#define OSMESA_CORE_PROFILE          0x34

				#define OSMESA_COMPAT_PROFILE        0x35

				#define OSMESA_CONTEXT_MAJOR_VERSION 0x36

				#define OSMESA_CONTEXT_MINOR_VERSION 0x37

				typedef struct osmesa_context *OSMesaContext;

				@@ -127,6 +139,35 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits,

				                        GLint accumBits, OSMesaContext sharelist);

				/*

				 * Create an Off-Screen Mesa rendering context with attribute list.

				 * The list is composed of (attribute, value) pairs and terminated with

				 * attribute==0.  Supported Attributes:

				 *

				 * Attributes                    Values

				 * --------------------------------------------------------------------------

				 * OSMESA_FORMAT                 OSMESA_RGBA*, OSMESA_BGRA, OSMESA_ARGB, etc.

				 * OSMESA_DEPTH_BITS             0*, 16, 24, 32

				 * OSMESA_STENCIL_BITS           0*, 8

				 * OSMESA_ACCUM_BITS             0*, 16

				 * OSMESA_PROFILE                OSMESA_COMPAT_PROFILE*, OSMESA_CORE_PROFILE

				 * OSMESA_CONTEXT_MAJOR_VERSION  1*, 2, 3

				 * OSMESA_CONTEXT_MINOR_VERSION  0+

				 *

				 * Note: * = default value

				 *

				 * We return a context version >= what's specified by OSMESA_CONTEXT_MAJOR/

				 * MINOR_VERSION for the given profile.  For example, if you request a GL 1.4

				 * compat profile, you might get a GL 3.0 compat profile.

				 * NULL is returned if the requested version/profile is not supported.

				 *

				 * New in Mesa 11.2

				 */

				GLAPI OSMesaContext GLAPIENTRY

				OSMesaCreateContextAttribs( const int *attribList, OSMesaContext sharelist );

				/*

				 * Destroy an Off-Screen Mesa rendering context.

				 *
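
The (attribute, value) list format documented for OSMesaCreateContextAttribs can be illustrated with a small sketch. It does not link against OSMesa: the `DEMO_` constants copy the values shown in the hunk above, and `demo_attribs` and `demo_attrib_lookup` are hypothetical names used only to show how a zero-terminated pair list is laid out and walked.

```c
#include <stddef.h>

/* Values copied from the osmesa.h hunk; the DEMO_ prefix marks them as local
 * copies for this sketch rather than the real header definitions. */
#define DEMO_OSMESA_DEPTH_BITS            0x30
#define DEMO_OSMESA_STENCIL_BITS          0x31
#define DEMO_OSMESA_ACCUM_BITS            0x32
#define DEMO_OSMESA_PROFILE               0x33
#define DEMO_OSMESA_CORE_PROFILE          0x34
#define DEMO_OSMESA_CONTEXT_MAJOR_VERSION 0x36
#define DEMO_OSMESA_CONTEXT_MINOR_VERSION 0x37

/* An attribute list: (attribute, value) pairs terminated by attribute == 0. */
static const int demo_attribs[] = {
   DEMO_OSMESA_DEPTH_BITS,            24,
   DEMO_OSMESA_STENCIL_BITS,          8,
   DEMO_OSMESA_PROFILE,               DEMO_OSMESA_CORE_PROFILE,
   DEMO_OSMESA_CONTEXT_MAJOR_VERSION, 3,
   DEMO_OSMESA_CONTEXT_MINOR_VERSION, 3,
   0
};

/* Walk the list two entries at a time and return the value paired with
 * `attr`, or `fallback` when the attribute is absent (i.e. the default). */
static int demo_attrib_lookup(const int *list, int attr, int fallback)
{
   for (size_t i = 0; list[i] != 0; i += 2) {
      if (list[i] == attr)
         return list[i + 1];
   }
   return fallback;
}
```

With the real header, a list built this way would be passed as `OSMesaCreateContextAttribs(attribs, NULL)`. Per the comment above, the returned context may have a higher version than requested, and a NULL return has to be handled.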

									
										35

include/c11/threads_posix.h
									
												
				@@ -169,6 +169,32 @@ mtx_destroy(mtx_t *mtx)

				    pthread_mutex_destroy(mtx);

				}

				/*

				 * XXX: Workaround when building with -O0 and without pthreads link.

				 *

				 * In such cases constant folding and dead code elimination won't be

				 * available, thus the compiler will always add the pthread_mutexattr*

				 * functions into the binary. As we try to link, we'll fail as the

				 * symbols are unresolved.

				 *

				 * Ideally we'll enable the optimisations locally, yet that does not

				 * seem to work.

				 *

				 * So the alternative workaround is to annotate the symbols as weak.

				 * Thus the linker will be happy and things don't clash when building

				 * with -O1 or greater.

				 */

				#ifdef HAVE_FUNC_ATTRIBUTE_WEAK

				__attribute__((weak))

				int pthread_mutexattr_init(pthread_mutexattr_t *attr);

				__attribute__((weak))

				int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);

				__attribute__((weak))

				int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

				#endif

				// 7.25.4.2

				static inline int

				mtx_init(mtx_t *mtx, int type)

				@@ -180,9 +206,14 @@ mtx_init(mtx_t *mtx, int type)

				      && type != (mtx_timed|mtx_recursive)

				      && type != (mtx_try|mtx_recursive))

				        return thrd_error;

				    if ((type & mtx_recursive) == 0) {

				        pthread_mutex_init(mtx, NULL);

				        return thrd_success;

				    }

				    pthread_mutexattr_init(&attr);

				    if ((type & mtx_recursive) != 0)

				        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

				    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

				    pthread_mutex_init(mtx, &attr);

				    pthread_mutexattr_destroy(&attr);

				    return thrd_success;

									
										305

include/c99/inttypes.h
									
											
				@@ -1,305 +0,0 @@

				// ISO C9x  compliant inttypes.h for Microsoft Visual Studio

				// Based on ISO/IEC 9899:TC2 Committee draft (May 6, 2005) WG14/N1124 

				// 

				//  Copyright (c) 2006 Alexander Chemeris

				// 

				// Redistribution and use in source and binary forms, with or without

				// modification, are permitted provided that the following conditions are met:

				// 

				//   1. Redistributions of source code must retain the above copyright notice,

				//      this list of conditions and the following disclaimer.

				// 

				//   2. Redistributions in binary form must reproduce the above copyright

				//      notice, this list of conditions and the following disclaimer in the

				//      documentation and/or other materials provided with the distribution.

				// 

				//   3. The name of the author may be used to endorse or promote products

				//      derived from this software without specific prior written permission.

				// 

				// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED

				// WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

				// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO

				// EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,

				// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;

				// OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 

				// WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR

				// OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF

				// ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				// 

				///////////////////////////////////////////////////////////////////////////////

				#ifndef _MSC_VER // [

				#error "Use this header only with Microsoft Visual C++ compilers!"

				#endif // _MSC_VER ]

				#ifndef _MSC_INTTYPES_H_ // [

				#define _MSC_INTTYPES_H_

				#if _MSC_VER > 1000

				#pragma once

				#endif

				#include "stdint.h"

				// 7.8 Format conversion of integer types

				typedef struct {

				   intmax_t quot;

				   intmax_t rem;

				} imaxdiv_t;

				// 7.8.1 Macros for format specifiers

				#if !defined(__cplusplus) || defined(__STDC_FORMAT_MACROS) // [   See footnote 185 at page 198

				// The fprintf macros for signed integers are:

				#define PRId8       "d"

				#define PRIi8       "i"

				#define PRIdLEAST8  "d"

				#define PRIiLEAST8  "i"

				#define PRIdFAST8   "d"

				#define PRIiFAST8   "i"

				#define PRId16       "hd"

				#define PRIi16       "hi"

				#define PRIdLEAST16  "hd"

				#define PRIiLEAST16  "hi"

				#define PRIdFAST16   "hd"

				#define PRIiFAST16   "hi"

				#define PRId32       "I32d"

				#define PRIi32       "I32i"

				#define PRIdLEAST32  "I32d"

				#define PRIiLEAST32  "I32i"

				#define PRIdFAST32   "I32d"

				#define PRIiFAST32   "I32i"

				#define PRId64       "I64d"

				#define PRIi64       "I64i"

				#define PRIdLEAST64  "I64d"

				#define PRIiLEAST64  "I64i"

				#define PRIdFAST64   "I64d"

				#define PRIiFAST64   "I64i"

				#define PRIdMAX     "I64d"

				#define PRIiMAX     "I64i"

				#define PRIdPTR     "Id"

				#define PRIiPTR     "Ii"

				// The fprintf macros for unsigned integers are:

				#define PRIo8       "o"

				#define PRIu8       "u"

				#define PRIx8       "x"

				#define PRIX8       "X"

				#define PRIoLEAST8  "o"

				#define PRIuLEAST8  "u"

				#define PRIxLEAST8  "x"

				#define PRIXLEAST8  "X"

				#define PRIoFAST8   "o"

				#define PRIuFAST8   "u"

				#define PRIxFAST8   "x"

				#define PRIXFAST8   "X"

				#define PRIo16       "ho"

				#define PRIu16       "hu"

				#define PRIx16       "hx"

				#define PRIX16       "hX"

				#define PRIoLEAST16  "ho"

				#define PRIuLEAST16  "hu"

				#define PRIxLEAST16  "hx"

				#define PRIXLEAST16  "hX"

				#define PRIoFAST16   "ho"

				#define PRIuFAST16   "hu"

				#define PRIxFAST16   "hx"

				#define PRIXFAST16   "hX"

				#define PRIo32       "I32o"

				#define PRIu32       "I32u"

				#define PRIx32       "I32x"

				#define PRIX32       "I32X"

				#define PRIoLEAST32  "I32o"

				#define PRIuLEAST32  "I32u"

				#define PRIxLEAST32  "I32x"

				#define PRIXLEAST32  "I32X"

				#define PRIoFAST32   "I32o"

				#define PRIuFAST32   "I32u"

				#define PRIxFAST32   "I32x"

				#define PRIXFAST32   "I32X"

				#define PRIo64       "I64o"

				#define PRIu64       "I64u"

				#define PRIx64       "I64x"

				#define PRIX64       "I64X"

				#define PRIoLEAST64  "I64o"

				#define PRIuLEAST64  "I64u"

				#define PRIxLEAST64  "I64x"

				#define PRIXLEAST64  "I64X"

				#define PRIoFAST64   "I64o"

				#define PRIuFAST64   "I64u"

				#define PRIxFAST64   "I64x"

				#define PRIXFAST64   "I64X"

				#define PRIoMAX     "I64o"

				#define PRIuMAX     "I64u"

				#define PRIxMAX     "I64x"

				#define PRIXMAX     "I64X"

				#define PRIoPTR     "Io"

				#define PRIuPTR     "Iu"

				#define PRIxPTR     "Ix"

				#define PRIXPTR     "IX"

				// The fscanf macros for signed integers are:

				#define SCNd8       "d"

				#define SCNi8       "i"

				#define SCNdLEAST8  "d"

				#define SCNiLEAST8  "i"

				#define SCNdFAST8   "d"

				#define SCNiFAST8   "i"

				#define SCNd16       "hd"

				#define SCNi16       "hi"

				#define SCNdLEAST16  "hd"

				#define SCNiLEAST16  "hi"

				#define SCNdFAST16   "hd"

				#define SCNiFAST16   "hi"

				#define SCNd32       "ld"

				#define SCNi32       "li"

				#define SCNdLEAST32  "ld"

				#define SCNiLEAST32  "li"

				#define SCNdFAST32   "ld"

				#define SCNiFAST32   "li"

				#define SCNd64       "I64d"

				#define SCNi64       "I64i"

				#define SCNdLEAST64  "I64d"

				#define SCNiLEAST64  "I64i"

				#define SCNdFAST64   "I64d"

				#define SCNiFAST64   "I64i"

				#define SCNdMAX     "I64d"

				#define SCNiMAX     "I64i"

				#ifdef _WIN64 // [

				#  define SCNdPTR     "I64d"

				#  define SCNiPTR     "I64i"

				#else  // _WIN64 ][

				#  define SCNdPTR     "ld"

				#  define SCNiPTR     "li"

				#endif  // _WIN64 ]

				// The fscanf macros for unsigned integers are:

				#define SCNo8       "o"

				#define SCNu8       "u"

				#define SCNx8       "x"

				#define SCNX8       "X"

				#define SCNoLEAST8  "o"

				#define SCNuLEAST8  "u"

				#define SCNxLEAST8  "x"

				#define SCNXLEAST8  "X"

				#define SCNoFAST8   "o"

				#define SCNuFAST8   "u"

				#define SCNxFAST8   "x"

				#define SCNXFAST8   "X"

				#define SCNo16       "ho"

				#define SCNu16       "hu"

				#define SCNx16       "hx"

				#define SCNX16       "hX"

				#define SCNoLEAST16  "ho"

				#define SCNuLEAST16  "hu"

				#define SCNxLEAST16  "hx"

				#define SCNXLEAST16  "hX"

				#define SCNoFAST16   "ho"

				#define SCNuFAST16   "hu"

				#define SCNxFAST16   "hx"

				#define SCNXFAST16   "hX"

				#define SCNo32       "lo"

				#define SCNu32       "lu"

				#define SCNx32       "lx"

				#define SCNX32       "lX"

				#define SCNoLEAST32  "lo"

				#define SCNuLEAST32  "lu"

				#define SCNxLEAST32  "lx"

				#define SCNXLEAST32  "lX"

				#define SCNoFAST32   "lo"

				#define SCNuFAST32   "lu"

				#define SCNxFAST32   "lx"

				#define SCNXFAST32   "lX"

				#define SCNo64       "I64o"

				#define SCNu64       "I64u"

				#define SCNx64       "I64x"

				#define SCNX64       "I64X"

				#define SCNoLEAST64  "I64o"

				#define SCNuLEAST64  "I64u"

				#define SCNxLEAST64  "I64x"

				#define SCNXLEAST64  "I64X"

				#define SCNoFAST64   "I64o"

				#define SCNuFAST64   "I64u"

				#define SCNxFAST64   "I64x"

				#define SCNXFAST64   "I64X"

				#define SCNoMAX     "I64o"

				#define SCNuMAX     "I64u"

				#define SCNxMAX     "I64x"

				#define SCNXMAX     "I64X"

				#ifdef _WIN64 // [

				#  define SCNoPTR     "I64o"

				#  define SCNuPTR     "I64u"

				#  define SCNxPTR     "I64x"

				#  define SCNXPTR     "I64X"

				#else  // _WIN64 ][

				#  define SCNoPTR     "lo"

				#  define SCNuPTR     "lu"

				#  define SCNxPTR     "lx"

				#  define SCNXPTR     "lX"

				#endif  // _WIN64 ]

				#endif // __STDC_FORMAT_MACROS ]

				// 7.8.2 Functions for greatest-width integer types

				// 7.8.2.1 The imaxabs function

				#define imaxabs _abs64

				// 7.8.2.2 The imaxdiv function

				// This is a modified version of the div() function from Microsoft's div.c found

				// in %MSVC.NET%\crt\src\div.c

				#ifdef STATIC_IMAXDIV // [

				static

				#else // STATIC_IMAXDIV ][

				_inline

				#endif // STATIC_IMAXDIV ]

				imaxdiv_t __cdecl imaxdiv(intmax_t numer, intmax_t denom)

				{

				   imaxdiv_t result;

				   result.quot = numer / denom;

				   result.rem = numer % denom;

				   if (numer < 0 && result.rem > 0) {

				      // did division wrong; must fix up

				      ++result.quot;

				      result.rem -= denom;

				   }

				   return result;

				}

				// 7.8.2.3 The strtoimax and strtoumax functions

				#define strtoimax _strtoi64

				#define strtoumax _strtoui64

				// 7.8.2.4 The wcstoimax and wcstoumax functions

				#define wcstoimax _wcstoi64

				#define wcstoumax _wcstoui64

				#endif // _MSC_INTTYPES_H_ ]
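
With this MSVC-only shim deleted, callers presumably rely on the toolchain's own <inttypes.h> (the c99_compat.h hunk in this comparison raises the minimum to Visual Studio 2013). A short sketch of the standard facilities the shim used to emulate, using hypothetical `demo_` helper names:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format a 64-bit unsigned value portably; returns the snprintf() result,
 * i.e. the number of characters the full output needs. */
static int demo_format_u64(char *buf, size_t len, uint64_t value)
{
   return snprintf(buf, len, "%" PRIu64, value);
}

/* Format a 32-bit value as "0x" followed by 8 zero-padded hex digits. */
static int demo_format_hex32(char *buf, size_t len, uint32_t value)
{
   return snprintf(buf, len, "0x%08" PRIx32, value);
}

/* Quotient and remainder in one call; the standard imaxdiv() truncates the
 * quotient toward zero, so the remainder takes the sign of the numerator. */
static imaxdiv_t demo_divmod(intmax_t numer, intmax_t denom)
{
   return imaxdiv(numer, denom);
}
```

The PRI* macros expand to string literals, which is why they are concatenated with the surrounding format string instead of being passed as arguments.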

									
										247

include/c99/stdint.h
									
											
				@@ -1,247 +0,0 @@

				// ISO C9x  compliant stdint.h for Microsoft Visual Studio

				// Based on ISO/IEC 9899:TC2 Committee draft (May 6, 2005) WG14/N1124 

				// 

				//  Copyright (c) 2006-2008 Alexander Chemeris

				// 

				// Redistribution and use in source and binary forms, with or without

				// modification, are permitted provided that the following conditions are met:

				// 

				//   1. Redistributions of source code must retain the above copyright notice,

				//      this list of conditions and the following disclaimer.

				// 

				//   2. Redistributions in binary form must reproduce the above copyright

				//      notice, this list of conditions and the following disclaimer in the

				//      documentation and/or other materials provided with the distribution.

				// 

				//   3. The name of the author may be used to endorse or promote products

				//      derived from this software without specific prior written permission.

				// 

				// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED

				// WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

				// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO

				// EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,

				// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;

				// OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 

				// WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR

				// OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF

				// ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				// 

				///////////////////////////////////////////////////////////////////////////////

				#ifndef _MSC_VER // [

				#error "Use this header only with Microsoft Visual C++ compilers!"

				#endif // _MSC_VER ]

				#ifndef _MSC_STDINT_H_ // [

				#define _MSC_STDINT_H_

				#if _MSC_VER > 1000

				#pragma once

				#endif

				#include <limits.h>

				// For Visual Studio 6 in C++ mode and for many Visual Studio versions when

				// compiling for ARM we should wrap <wchar.h> include with 'extern "C++" {}'

				// or the compiler gives many errors like this:

				//   error C2733: second C linkage of overloaded function 'wmemchr' not allowed

				#ifdef __cplusplus

				extern "C" {

				#endif

				#  include <wchar.h>

				#ifdef __cplusplus

				}

				#endif

				// Define _W64 macros to mark types changing their size, like intptr_t.

				#ifndef _W64

				#  if !defined(__midl) && (defined(_X86_) || defined(_M_IX86)) && _MSC_VER >= 1300

				#     define _W64 __w64

				#  else

				#     define _W64

				#  endif

				#endif

				// 7.18.1 Integer types

				// 7.18.1.1 Exact-width integer types

				// Visual Studio 6 and Embedded Visual C++ 4 don't

				// realize that, e.g. char has the same size as __int8

				// so we give up on __intX for them.

				#if (_MSC_VER < 1300)

				   typedef signed char       int8_t;

				   typedef signed short      int16_t;

				   typedef signed int        int32_t;

				   typedef unsigned char     uint8_t;

				   typedef unsigned short    uint16_t;

				   typedef unsigned int      uint32_t;

				#else

				   typedef signed __int8     int8_t;

				   typedef signed __int16    int16_t;

				   typedef signed __int32    int32_t;

				   typedef unsigned __int8   uint8_t;

				   typedef unsigned __int16  uint16_t;

				   typedef unsigned __int32  uint32_t;

				#endif

				typedef signed __int64       int64_t;

				typedef unsigned __int64     uint64_t;

				// 7.18.1.2 Minimum-width integer types

				typedef int8_t    int_least8_t;

				typedef int16_t   int_least16_t;

				typedef int32_t   int_least32_t;

				typedef int64_t   int_least64_t;

				typedef uint8_t   uint_least8_t;

				typedef uint16_t  uint_least16_t;

				typedef uint32_t  uint_least32_t;

				typedef uint64_t  uint_least64_t;

				// 7.18.1.3 Fastest minimum-width integer types

				typedef int8_t    int_fast8_t;

				typedef int16_t   int_fast16_t;

				typedef int32_t   int_fast32_t;

				typedef int64_t   int_fast64_t;

				typedef uint8_t   uint_fast8_t;

				typedef uint16_t  uint_fast16_t;

				typedef uint32_t  uint_fast32_t;

				typedef uint64_t  uint_fast64_t;

				// 7.18.1.4 Integer types capable of holding object pointers

				#ifdef _WIN64 // [

				   typedef signed __int64    intptr_t;

				   typedef unsigned __int64  uintptr_t;

				#else // _WIN64 ][

				   typedef _W64 signed int   intptr_t;

				   typedef _W64 unsigned int uintptr_t;

				#endif // _WIN64 ]

				// 7.18.1.5 Greatest-width integer types

				typedef int64_t   intmax_t;

				typedef uint64_t  uintmax_t;

				// 7.18.2 Limits of specified-width integer types

				#if !defined(__cplusplus) || defined(__STDC_LIMIT_MACROS) // [   See footnote 220 at page 257 and footnote 221 at page 259

				// 7.18.2.1 Limits of exact-width integer types

				#define INT8_MIN     ((int8_t)_I8_MIN)

				#define INT8_MAX     _I8_MAX

				#define INT16_MIN    ((int16_t)_I16_MIN)

				#define INT16_MAX    _I16_MAX

				#define INT32_MIN    ((int32_t)_I32_MIN)

				#define INT32_MAX    _I32_MAX

				#define INT64_MIN    ((int64_t)_I64_MIN)

				#define INT64_MAX    _I64_MAX

				#define UINT8_MAX    _UI8_MAX

				#define UINT16_MAX   _UI16_MAX

				#define UINT32_MAX   _UI32_MAX

				#define UINT64_MAX   _UI64_MAX

				// 7.18.2.2 Limits of minimum-width integer types

				#define INT_LEAST8_MIN    INT8_MIN

				#define INT_LEAST8_MAX    INT8_MAX

				#define INT_LEAST16_MIN   INT16_MIN

				#define INT_LEAST16_MAX   INT16_MAX

				#define INT_LEAST32_MIN   INT32_MIN

				#define INT_LEAST32_MAX   INT32_MAX

				#define INT_LEAST64_MIN   INT64_MIN

				#define INT_LEAST64_MAX   INT64_MAX

				#define UINT_LEAST8_MAX   UINT8_MAX

				#define UINT_LEAST16_MAX  UINT16_MAX

				#define UINT_LEAST32_MAX  UINT32_MAX

				#define UINT_LEAST64_MAX  UINT64_MAX

				// 7.18.2.3 Limits of fastest minimum-width integer types

				#define INT_FAST8_MIN    INT8_MIN

				#define INT_FAST8_MAX    INT8_MAX

				#define INT_FAST16_MIN   INT16_MIN

				#define INT_FAST16_MAX   INT16_MAX

				#define INT_FAST32_MIN   INT32_MIN

				#define INT_FAST32_MAX   INT32_MAX

				#define INT_FAST64_MIN   INT64_MIN

				#define INT_FAST64_MAX   INT64_MAX

				#define UINT_FAST8_MAX   UINT8_MAX

				#define UINT_FAST16_MAX  UINT16_MAX

				#define UINT_FAST32_MAX  UINT32_MAX

				#define UINT_FAST64_MAX  UINT64_MAX

				// 7.18.2.4 Limits of integer types capable of holding object pointers

				#ifdef _WIN64 // [

				#  define INTPTR_MIN   INT64_MIN

				#  define INTPTR_MAX   INT64_MAX

				#  define UINTPTR_MAX  UINT64_MAX

				#else // _WIN64 ][

				#  define INTPTR_MIN   INT32_MIN

				#  define INTPTR_MAX   INT32_MAX

				#  define UINTPTR_MAX  UINT32_MAX

				#endif // _WIN64 ]

				// 7.18.2.5 Limits of greatest-width integer types

				#define INTMAX_MIN   INT64_MIN

				#define INTMAX_MAX   INT64_MAX

				#define UINTMAX_MAX  UINT64_MAX

				// 7.18.3 Limits of other integer types

				#ifdef _WIN64 // [

				#  define PTRDIFF_MIN  _I64_MIN

				#  define PTRDIFF_MAX  _I64_MAX

				#else  // _WIN64 ][

				#  define PTRDIFF_MIN  _I32_MIN

				#  define PTRDIFF_MAX  _I32_MAX

				#endif  // _WIN64 ]

				#define SIG_ATOMIC_MIN  INT_MIN

				#define SIG_ATOMIC_MAX  INT_MAX

				#ifndef SIZE_MAX // [

				#  ifdef _WIN64 // [

				#     define SIZE_MAX  _UI64_MAX

				#  else // _WIN64 ][

				#     define SIZE_MAX  _UI32_MAX

				#  endif // _WIN64 ]

				#endif // SIZE_MAX ]

				// WCHAR_MIN and WCHAR_MAX are also defined in <wchar.h>

				#ifndef WCHAR_MIN // [

				#  define WCHAR_MIN  0

				#endif  // WCHAR_MIN ]

				#ifndef WCHAR_MAX // [

				#  define WCHAR_MAX  _UI16_MAX

				#endif  // WCHAR_MAX ]

				#define WINT_MIN  0

				#define WINT_MAX  _UI16_MAX

				#endif // __STDC_LIMIT_MACROS ]

				// 7.18.4 Macros for integer constants

				#if !defined(__cplusplus) || defined(__STDC_CONSTANT_MACROS) // [   See footnote 224 at page 260

				// 7.18.4.1 Macros for minimum-width integer constants

				#define INT8_C(val)  val##i8

				#define INT16_C(val) val##i16

				#define INT32_C(val) val##i32

				#define INT64_C(val) val##i64

				#define UINT8_C(val)  val##ui8

				#define UINT16_C(val) val##ui16

				#define UINT32_C(val) val##ui32

				#define UINT64_C(val) val##ui64

				// 7.18.4.2 Macros for greatest-width integer constants

				#define INTMAX_C   INT64_C

				#define UINTMAX_C  UINT64_C

				#endif // __STDC_CONSTANT_MACROS ]

				#endif // _MSC_STDINT_H_ ]
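
The fixed-width types, limit macros and constant macros that this removed header supplied are the standard <stdint.h> ones. A brief sketch of typical use, with hypothetical `demo_` helpers that are not taken from the Mesa tree:

```c
#include <stdint.h>

/* Clamp a 64-bit value into the int32_t range using the standard limit
 * macros (section 7.18.2.1 in the header above). */
static int32_t demo_clamp_to_i32(int64_t value)
{
   if (value > INT32_MAX)
      return INT32_MAX;
   if (value < INT32_MIN)
      return INT32_MIN;
   return (int32_t)value;
}

/* Round-trip an object pointer through uintptr_t (section 7.18.1.4); the
 * converted-back pointer compares equal to the original. */
static int demo_pointer_roundtrip(const void *p)
{
   uintptr_t bits = (uintptr_t)p;
   return (const void *)bits == p;
}
```

Constants that do not fit in an `int` are best written with the constant macros, for example `INT64_C(5000000000)`, so that the literal gets a suitably wide type on every platform.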

									
										58

include/c99_compat.h
									
												
				@@ -36,17 +36,17 @@

				 */

				#if defined(_MSC_VER)

				#  if _MSC_VER < 1500

				#    error "Microsoft Visual Studio 2008 or higher required"

				#  if _MSC_VER < 1800

				#    error "Microsoft Visual Studio 2013 or higher required"

				#  endif

				   /*

				    * Visual Studio 2012 will complain if we define the `inline` keyword, but

				    * Visual Studio will complain if we define the `inline` keyword, but

				    * actually it only supports the keyword on C++.

				    *

				    * To avoid this the _ALLOW_KEYWORD_MACROS must be set.

				    */

				#  if (_MSC_VER >= 1700) && !defined(_ALLOW_KEYWORD_MACROS)

				#  if !defined(_ALLOW_KEYWORD_MACROS)

				#    define _ALLOW_KEYWORD_MACROS

				#  endif

				@@ -81,8 +81,6 @@

				     /* Intel compiler supports inline keyword */

				#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)

				#    define inline __inline

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 supports inline keyword */

				#  elif (__STDC_VERSION__ >= 199901L)

				     /* C99 supports inline keyword */

				#  else

				@@ -100,8 +98,6 @@

				#ifndef restrict

				#  if (__STDC_VERSION__ >= 199901L)

				     /* C99 */

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 */

				#  elif defined(__GNUC__)

				#    define restrict __restrict__

				#  elif defined(_MSC_VER)

				@@ -118,8 +114,6 @@

				#ifndef __func__

				#  if (__STDC_VERSION__ >= 199901L)

				     /* C99 */

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 */

				#  elif defined(__GNUC__)

				#    define __func__ __FUNCTION__

				#  elif defined(_MSC_VER)

				@@ -141,4 +135,48 @@ test_c99_compat_h(const void * restrict a,

				#endif

				/* Fallback definitions, for build systems other than autoconf which don't

				 * auto-detect these things. */

				#ifdef HAVE_NO_AUTOCONF

				#  ifndef _WIN32

				#    define HAVE_PTHREAD

				#    define HAVE_POSIX_MEMALIGN

				#  endif

				#  ifdef __GNUC__

				#    if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2)

				#      error "GCC version 4.2 or higher required"

				#    endif

				     /* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Other-Builtins.html */

				#    define HAVE___BUILTIN_CLZ 1

				#    define HAVE___BUILTIN_CLZLL 1

				#    define HAVE___BUILTIN_CTZ 1

				#    define HAVE___BUILTIN_EXPECT 1

				#    define HAVE___BUILTIN_FFS 1

				#    define HAVE___BUILTIN_FFSLL 1

				#    define HAVE___BUILTIN_POPCOUNT 1

				#    define HAVE___BUILTIN_POPCOUNTLL 1

				     /* https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcc/Function-Attributes.html */

				#    define HAVE_FUNC_ATTRIBUTE_FLATTEN 1

				#    define HAVE_FUNC_ATTRIBUTE_UNUSED 1

				#    define HAVE_FUNC_ATTRIBUTE_FORMAT 1

				#    define HAVE_FUNC_ATTRIBUTE_PACKED 1

				#    if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)

				       /* https://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/Other-Builtins.html */

				#      define HAVE___BUILTIN_BSWAP32 1

				#      define HAVE___BUILTIN_BSWAP64 1

				#    endif

				#    if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)

				#      define HAVE___BUILTIN_UNREACHABLE 1

				#    endif

				#  endif /* __GNUC__ */

				#endif /* HAVE_NO_AUTOCONF */

				#endif /* _C99_COMPAT_H_ */
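
The HAVE___BUILTIN_* fallbacks added above exist so that code can pick a compiler builtin when one is known to be available and fall back to portable C otherwise. A minimal sketch of that pattern follows. `DEMO_HAVE___BUILTIN_POPCOUNT` and `demo_popcount32` are hypothetical names that only mirror the GCC 4.2 version check from the hunk; they are not the macros or helpers Mesa itself uses.

```c
#include <stdint.h>

/* Same style of version gate as the hunk above: assume the builtin exists
 * on GCC 4.2 or newer (and on compilers that report themselves as such). */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 2))
#  define DEMO_HAVE___BUILTIN_POPCOUNT 1
#endif

/* Count the set bits in a 32-bit value. */
static unsigned demo_popcount32(uint32_t v)
{
#ifdef DEMO_HAVE___BUILTIN_POPCOUNT
   return (unsigned)__builtin_popcount(v);
#else
   /* Portable fallback: clear the lowest set bit until none remain. */
   unsigned n = 0;
   while (v) {
      v &= v - 1;
      n++;
   }
   return n;
#endif
}
```

Both branches return the same result, so callers do not need to know which one was compiled in.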

									
										72

include/c99_math.h
									
												
				@@ -38,55 +38,16 @@

				#include "c99_compat.h"

				#if defined(_MSC_VER)

				/* This is to ensure that we get M_PI, etc. definitions */

				#if !defined(_USE_MATH_DEFINES)

				#if defined(_MSC_VER) && !defined(_USE_MATH_DEFINES)

				#error _USE_MATH_DEFINES define required when building with MSVC

				#endif 

				#if _MSC_VER < 1800

				#define isfinite(x) _finite((double)(x))

				#define isnan(x) _isnan((double)(x))

				#endif /* _MSC_VER < 1800 */

				#if _MSC_VER < 1800

				static inline double log2( double x )

				{

				   const double invln2 = 1.442695041;

				   return log( x ) * invln2;

				}

				static inline double

				round(double x)

				{

				   return x >= 0.0 ? floor(x + 0.5) : ceil(x - 0.5);

				}

				static inline float

				roundf(float x)

				{

				   return x >= 0.0f ? floorf(x + 0.5f) : ceilf(x - 0.5f);

				}

				#endif

				#ifndef INFINITY

				#include <float.h> // DBL_MAX

				#define INFINITY (DBL_MAX + DBL_MAX)

				#endif

				#ifndef NAN

				#define NAN (INFINITY - INFINITY)

				#endif

				#endif /* _MSC_VER */

				#if (defined(_MSC_VER) && _MSC_VER < 1800) || \

				    (!defined(_MSC_VER) && \

				     __STDC_VERSION__ < 199901L && \

				     (!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \

				     !defined(__cplusplus))

				#if !defined(_MSC_VER) && \

				    __STDC_VERSION__ < 199901L && \

				    (!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \

				    !defined(__cplusplus)

				static inline long int

				lrint(double d)

				@@ -224,4 +185,27 @@ fpclassify(double x)

				#endif

				/* Since C++11, the following functions are part of the std namespace. Their C

				 * counterparts should still exist in the global namespace, however cmath

				 * undefines those functions, which in glibc 2.23, are defined as macros rather

				 * than functions as in glibc 2.22.

				 */

				#if __cplusplus >= 201103L && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 23))

				#include <cmath>

				using std::fpclassify;

				using std::isfinite;

				using std::isinf;

				using std::isnan;

				using std::isnormal;

				using std::signbit;

				using std::isgreater;

				using std::isgreaterequal;

				using std::isless;

				using std::islessequal;

				using std::islessgreater;

				using std::isunordered;

				#endif

				#endif /* #define _C99_MATH_H_ */
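The pre-C99 `round()` fallback shown in this header rounds halfway cases away from zero. As a rough illustration only, the same expression can be sketched in Python; the helper name `round_half_away` is hypothetical and not part of Mesa:

```python
import math


def round_half_away(x: float) -> float:
    """Mirror the fallback: floor(x + 0.5) for x >= 0, otherwise ceil(x - 0.5)."""
    if x >= 0.0:
        return float(math.floor(x + 0.5))
    return float(math.ceil(x - 0.5))
```

This differs from Python's built-in `round()`, which rounds halves to the nearest even integer, so `round(2.5)` gives 2 while the sketch above gives 3.0.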

									
										6

include/d3dadapter/drm.h
									
												
				@@ -29,7 +29,11 @@

				#define D3DADAPTER9DRM_NAME "drm"

				/* current version */

				#define D3DADAPTER9DRM_MAJOR 0

				#define D3DADAPTER9DRM_MINOR 0

				#define D3DADAPTER9DRM_MINOR 1

				/* version 0.0: Initial release

				 *         0.1: All IDirect3D objects can be assumed to have a pointer to the

				 *              internal vtable in second position of the structure */

				struct D3DAdapter9DRM

				{

									
										10

include/d3dadapter/present.h
									
												
				@@ -69,6 +69,12 @@ typedef struct ID3DPresentVtbl

				    HRESULT (WINAPI *SetCursor)(ID3DPresent *This, void *pBitmap, POINT *pHotspot, BOOL bShow);

				    HRESULT (WINAPI *SetGammaRamp)(ID3DPresent *This, const D3DGAMMARAMP *pRamp, HWND hWndOverride);

				    HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This,  HWND hWnd, int *width, int *height, int *depth);

				    /* Available since version 1.1 */

				    BOOL (WINAPI *GetWindowOccluded)(ID3DPresent *This);

				    /* Available since version 1.2 */

				    BOOL (WINAPI *ResolutionMismatch)(ID3DPresent *This);

				    HANDLE (WINAPI *CreateThread)(ID3DPresent *This, void *pThreadfunc, void *pParam);

				    BOOL (WINAPI *WaitForThread)(ID3DPresent *This, HANDLE thread);

				} ID3DPresentVtbl;

				struct ID3DPresent

				@@ -96,6 +102,10 @@ struct ID3DPresent

				#define ID3DPresent_SetCursor(p,a,b,c) (p)->lpVtbl->SetCursor(p,a,b,c)

				#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)

				#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowInfo(p,a,b,c,d)

				#define ID3DPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)

				#define ID3DPresent_ResolutionMismatch(p) (p)->lpVtbl->ResolutionMismatch(p)

				#define ID3DPresent_CreateThread(p,a,b) (p)->lpVtbl->CreateThread(p,a,b)

				#define ID3DPresent_WaitForThread(p,a) (p)->lpVtbl->WaitForThread(p,a)

				typedef struct ID3DPresentGroupVtbl

				{

									
										24

include/pci_ids/i965_pci_ids.h
									
												
				@@ -112,6 +112,7 @@ CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")

				CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")

				CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")

				CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")

				CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")

				@@ -122,16 +123,17 @@ CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake GT2")

				CHIPSET(0x191B, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")

				CHIPSET(0x191D, skl_gt2, "Intel(R) HD Graphics P530 (Skylake GT2)")

				CHIPSET(0x191E, skl_gt2, "Intel(R) HD Graphics 515 (Skylake GT2)")

				CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake GT2")

				CHIPSET(0x1923, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3e)")

				CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")

				CHIPSET(0x1921, skl_gt2, "Intel(R) HD Graphics 520 (Skylake GT2)")

				CHIPSET(0x1923, skl_gt3, "Intel(R) Skylake GT3e")

				CHIPSET(0x1926, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3e)")

				CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")

				CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")

				CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics 555 (Skylake GT3e)")

				CHIPSET(0x192D, skl_gt3, "Intel(R) Iris Graphics P555 (Skylake GT3e)")

				CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")

				CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")

				CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")

				CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")

				CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")

				@@ -154,10 +156,12 @@ CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherrytrail)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics XXX (Braswell)") /* Overridden in brw_get_renderer_string */

				CHIPSET(0x22B2, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B3, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x0A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x1A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x1A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

				CHIPSET(0x5A84, bxt,     "Intel(R) HD Graphics (Broxton)")

				CHIPSET(0x5A85, bxt_2x6, "Intel(R) HD Graphics (Broxton 2x6)")

									
										22

include/pci_ids/radeonsi_pci_ids.h
									
												
				@@ -182,4 +182,26 @@ CHIPSET(0x9877, CARRIZO_, CARRIZO)

				CHIPSET(0x7300, FIJI_, FIJI)

				CHIPSET(0x67E0, POLARIS11_, POLARIS11)

				CHIPSET(0x67E1, POLARIS11_, POLARIS11)

				CHIPSET(0x67E3, POLARIS11_, POLARIS11)

				CHIPSET(0x67E7, POLARIS11_, POLARIS11)

				CHIPSET(0x67E8, POLARIS11_, POLARIS11)

				CHIPSET(0x67E9, POLARIS11_, POLARIS11)

				CHIPSET(0x67EB, POLARIS11_, POLARIS11)

				CHIPSET(0x67EF, POLARIS11_, POLARIS11)

				CHIPSET(0x67FF, POLARIS11_, POLARIS11)

				CHIPSET(0x67C0, POLARIS10_, POLARIS10)

				CHIPSET(0x67C1, POLARIS10_, POLARIS10)

				CHIPSET(0x67C2, POLARIS10_, POLARIS10)

				CHIPSET(0x67C4, POLARIS10_, POLARIS10)

				CHIPSET(0x67C7, POLARIS10_, POLARIS10)

				CHIPSET(0x67C8, POLARIS10_, POLARIS10)

				CHIPSET(0x67C9, POLARIS10_, POLARIS10)

				CHIPSET(0x67CA, POLARIS10_, POLARIS10)

				CHIPSET(0x67CC, POLARIS10_, POLARIS10)

				CHIPSET(0x67CF, POLARIS10_, POLARIS10)

				CHIPSET(0x67DF, POLARIS10_, POLARIS10)

				CHIPSET(0x98E4, STONEY_, STONEY)

									
										2

include/pci_ids/virtio_gpu_pci_ids.h
									
										Normal file
									
												
				@@ -0,0 +1,2 @@

				CHIPSET(0x0010, VIRTGL, VIRTGL)

				CHIPSET(0x1050, VIRTGL, VIRTGL)

									
										85

include/vulkan/vk_icd.h
									
										Normal file
									
												
				@@ -0,0 +1,85 @@

				#ifndef VKICD_H

				#define VKICD_H

				#include "vk_platform.h"

				/*

				 * The ICD must reserve space for a pointer for the loader's dispatch

				 * table, at the start of <each object>.

				 * The ICD must initialize this variable using the SET_LOADER_MAGIC_VALUE macro.

				 */

				#define ICD_LOADER_MAGIC   0x01CDC0DE

				typedef union _VK_LOADER_DATA {

				  uintptr_t loaderMagic;

				  void *loaderData;

				} VK_LOADER_DATA;

				static inline void set_loader_magic_value(void* pNewObject) {

				    VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;

				    loader_info->loaderMagic = ICD_LOADER_MAGIC;

				}

				static inline bool valid_loader_magic_value(void* pNewObject) {

				    const VK_LOADER_DATA *loader_info = (VK_LOADER_DATA *) pNewObject;

				    return (loader_info->loaderMagic & 0xffffffff) == ICD_LOADER_MAGIC;

				}

				/*

				 * Windows and Linux ICDs will treat VkSurfaceKHR as a pointer to a struct that

				 * contains the platform-specific connection and surface information.

				 */

				typedef enum _VkIcdWsiPlatform {

				    VK_ICD_WSI_PLATFORM_MIR,

				    VK_ICD_WSI_PLATFORM_WAYLAND,

				    VK_ICD_WSI_PLATFORM_WIN32,

				    VK_ICD_WSI_PLATFORM_XCB,

				    VK_ICD_WSI_PLATFORM_XLIB,

				} VkIcdWsiPlatform;

				typedef struct _VkIcdSurfaceBase {

				    VkIcdWsiPlatform   platform;

				} VkIcdSurfaceBase;

				#ifdef VK_USE_PLATFORM_MIR_KHR

				typedef struct _VkIcdSurfaceMir {

				    VkIcdSurfaceBase   base;

				    MirConnection*     connection;

				    MirSurface*        mirSurface;

				} VkIcdSurfaceMir;

				#endif // VK_USE_PLATFORM_MIR_KHR

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				typedef struct _VkIcdSurfaceWayland {

				    VkIcdSurfaceBase   base;

				    struct wl_display* display;

				    struct wl_surface* surface;

				} VkIcdSurfaceWayland;

				#endif // VK_USE_PLATFORM_WAYLAND_KHR

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				typedef struct _VkIcdSurfaceWin32 {

				    VkIcdSurfaceBase   base;

				    HINSTANCE          hinstance;

				    HWND               hwnd;

				} VkIcdSurfaceWin32;

				#endif // VK_USE_PLATFORM_WIN32_KHR

				#ifdef VK_USE_PLATFORM_XCB_KHR

				typedef struct _VkIcdSurfaceXcb {

				    VkIcdSurfaceBase   base;

				    xcb_connection_t*  connection;

				    xcb_window_t       window;

				} VkIcdSurfaceXcb;

				#endif // VK_USE_PLATFORM_XCB_KHR

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				typedef struct _VkIcdSurfaceXlib {

				    VkIcdSurfaceBase   base;

				    Display*           dpy;

				    Window             window;

				} VkIcdSurfaceXlib;

				#endif // VK_USE_PLATFORM_XLIB_KHR

				#endif // VKICD_H
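The loader-magic helpers in `vk_icd.h` write `ICD_LOADER_MAGIC` into the first pointer-sized slot of an object and later compare only its low 32 bits. A minimal Python sketch of that check, modelling the slot as a plain integer (the function signature and the sample values in the usage note are illustrative assumptions, not the header's API):

```python
ICD_LOADER_MAGIC = 0x01CDC0DE  # value taken from vk_icd.h


def valid_loader_magic_value(loader_magic: int) -> bool:
    """Compare only the low 32 bits of the stored word, as the header does."""
    return (loader_magic & 0xFFFFFFFF) == ICD_LOADER_MAGIC
```

Because of the mask, a 64-bit word such as `0xDEADBEEF01CDC0DE` still passes the check, while a zeroed slot does not.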

									
										127

include/vulkan/vk_platform.h
									
										Normal file
									
												
				@@ -0,0 +1,127 @@

				//

				// File: vk_platform.h

				//

				/*

				** Copyright (c) 2014-2015 The Khronos Group Inc.

				**

				** Permission is hereby granted, free of charge, to any person obtaining a

				** copy of this software and/or associated documentation files (the

				** "Materials"), to deal in the Materials without restriction, including

				** without limitation the rights to use, copy, modify, merge, publish,

				** distribute, sublicense, and/or sell copies of the Materials, and to

				** permit persons to whom the Materials are furnished to do so, subject to

				** the following conditions:

				**

				** The above copyright notice and this permission notice shall be included

				** in all copies or substantial portions of the Materials.

				**

				** THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				** EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				** MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				** IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				** CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				** TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				** MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				*/

				#ifndef VK_PLATFORM_H_

				#define VK_PLATFORM_H_

				#ifdef __cplusplus

				extern "C"

				{

				#endif // __cplusplus

				/*

				***************************************************************************************************

				*   Platform-specific directives and type declarations

				***************************************************************************************************

				*/

				/* Platform-specific calling convention macros.

				 *

				 * Platforms should define these so that Vulkan clients call Vulkan commands

				 * with the same calling conventions that the Vulkan implementation expects.

				 *

				 * VKAPI_ATTR - Placed before the return type in function declarations.

				 *              Useful for C++11 and GCC/Clang-style function attribute syntax.

				 * VKAPI_CALL - Placed after the return type in function declarations.

				 *              Useful for MSVC-style calling convention syntax.

				 * VKAPI_PTR  - Placed between the '(' and '*' in function pointer types.

				 *

				 * Function declaration:  VKAPI_ATTR void VKAPI_CALL vkCommand(void);

				 * Function pointer type: typedef void (VKAPI_PTR *PFN_vkCommand)(void);

				 */

				#if defined(_WIN32)

				    // On Windows, Vulkan commands use the stdcall convention

				    #define VKAPI_ATTR

				    #define VKAPI_CALL __stdcall

				    #define VKAPI_PTR  VKAPI_CALL

				#elif defined(__ANDROID__) && defined(__ARM_EABI__) && !defined(__ARM_ARCH_7A__)

				    // Android does not support Vulkan in native code using the "armeabi" ABI.

				    #error "Vulkan requires the 'armeabi-v7a' or 'armeabi-v7a-hard' ABI on 32-bit ARM CPUs"

				#elif defined(__ANDROID__) && defined(__ARM_ARCH_7A__)

				    // On Android/ARMv7a, Vulkan functions use the armeabi-v7a-hard calling

				    // convention, even if the application's native code is compiled with the

				    // armeabi-v7a calling convention.

				    #define VKAPI_ATTR __attribute__((pcs("aapcs-vfp")))

				    #define VKAPI_CALL

				    #define VKAPI_PTR  VKAPI_ATTR

				#else

				    // On other platforms, use the default calling convention

				    #define VKAPI_ATTR

				    #define VKAPI_CALL

				    #define VKAPI_PTR

				#endif

				#include <stddef.h>

				#if !defined(VK_NO_STDINT_H)

				    #if defined(_MSC_VER) && (_MSC_VER < 1600)

				        typedef signed   __int8  int8_t;

				        typedef unsigned __int8  uint8_t;

				        typedef signed   __int16 int16_t;

				        typedef unsigned __int16 uint16_t;

				        typedef signed   __int32 int32_t;

				        typedef unsigned __int32 uint32_t;

				        typedef signed   __int64 int64_t;

				        typedef unsigned __int64 uint64_t;

				    #else

				        #include <stdint.h>

				    #endif

				#endif // !defined(VK_NO_STDINT_H)

				#ifdef __cplusplus

				} // extern "C"

				#endif // __cplusplus

				// Platform-specific headers required by platform window system extensions.

				// These are enabled prior to #including "vulkan.h". The same enable then

				// controls inclusion of the extension interfaces in vulkan.h.

				#ifdef VK_USE_PLATFORM_ANDROID_KHR

				#include <android/native_window.h>

				#endif

				#ifdef VK_USE_PLATFORM_MIR_KHR

				#include <mir_toolkit/client_types.h>

				#endif

				#ifdef VK_USE_PLATFORM_WAYLAND_KHR

				#include <wayland-client.h>

				#endif

				#ifdef VK_USE_PLATFORM_WIN32_KHR

				#include <windows.h>

				#endif

				#ifdef VK_USE_PLATFORM_XLIB_KHR

				#include <X11/Xlib.h>

				#endif

				#ifdef VK_USE_PLATFORM_XCB_KHR

				#include <xcb/xcb.h>

				#endif

				#endif

3800

include/vulkan/vulkan.h Normal file


File diff suppressed because it is too large

									
										62

include/vulkan/vulkan_intel.h
									
										Normal file
									
												
				@@ -0,0 +1,62 @@

				/*

				 * Copyright © 2015 Intel Corporation

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and associated documentation files (the "Software"),

				 * to deal in the Software without restriction, including without limitation

				 * the rights to use, copy, modify, merge, publish, distribute, sublicense,

				 * and/or sell copies of the Software, and to permit persons to whom the

				 * Software is furnished to do so, subject to the following conditions:

				 *

				 * The above copyright notice and this permission notice (including the next

				 * paragraph) shall be included in all copies or substantial portions of the

				 * Software.

				 *

				 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				 * IN THE SOFTWARE.

				 */

				#ifndef __VULKAN_INTEL_H__

				#define __VULKAN_INTEL_H__

				#include "vulkan.h"

				#ifdef __cplusplus

				extern "C"

				{

				#endif // __cplusplus

				#define VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL 1024

				typedef struct VkDmaBufImageCreateInfo_

				{

				    VkStructureType                             sType;                      // Must be VK_STRUCTURE_TYPE_DMA_BUF_IMAGE_CREATE_INFO_INTEL

				    const void*                                 pNext;                      // Pointer to next structure.

				    int                                         fd;

				    VkFormat                                    format;

				    VkExtent3D                                  extent;         // Depth must be 1

				    uint32_t                                    strideInBytes;

				} VkDmaBufImageCreateInfo;

				typedef VkResult (VKAPI_PTR *PFN_vkCreateDmaBufImageINTEL)(VkDevice device, const VkDmaBufImageCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkDeviceMemory* pMem, VkImage* pImage);

				#ifndef VK_NO_PROTOTYPES

				VKAPI_ATTR VkResult VKAPI_CALL vkCreateDmaBufImageINTEL(

				    VkDevice                                    _device,

				    const VkDmaBufImageCreateInfo*              pCreateInfo,

				    const VkAllocationCallbacks*                pAllocator,

				    VkDeviceMemory*                             pMem,

				    VkImage*                                    pImage);

				#endif

				#ifdef __cplusplus

				} // extern "C"

				#endif // __cplusplus

				#endif // __VULKAN_INTEL_H__

									
										21

install-gallium-links.mk
									
												
				@@ -3,18 +3,18 @@

				if BUILD_SHARED

				if HAVE_COMPAT_SYMLINKS

				all-local : .libs/install-gallium-links

				all-local : .install-gallium-links

				.libs/install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)

				.install-gallium-links : $(dri_LTLIBRARIES) $(egl_LTLIBRARIES) $(lib_LTLIBRARIES)

					$(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR);	\

					link_dir=$(top_builddir)/$(LIB_DIR)/gallium;		\

					if test x$(egl_LTLIBRARIES) != x; then			\

						link_dir=$(top_builddir)/$(LIB_DIR)/egl;	\

					fi;							\

					$(MKDIR_P) $$link_dir;					\

					file_list=$(dri_LTLIBRARIES:%.la=.libs/%.so);		\

					file_list+=$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*);	\

					file_list+=$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*);	\

					file_list="$(dri_LTLIBRARIES:%.la=.libs/%.so)";		\

					file_list+="$(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

					file_list+="$(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)";	\

					for f in $$file_list; do 				\

						if test -h .libs/$$f; then			\

							cp -d $$f $$link_dir;			\

				@@ -23,4 +23,15 @@ all-local : .libs/install-gallium-links

						fi;						\

					done && touch $@

				endif

				clean-local:

					for f in $(notdir $(dri_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \

						 $(notdir $(egl_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)) \

						 $(notdir $(lib_LTLIBRARIES:%.la=.libs/%.$(LIB_EXT)*)); do \

						echo $$f; \

						$(RM) $(top_builddir)/$(LIB_DIR)/gallium/$$f;   \

					done;

					rmdir $(top_builddir)/$(LIB_DIR)/gallium || true

					$(RM) .install-gallium-links

				endif

7

m4/ax_gcc_func_attribute.m4


@@ -53,6 +53,7 @@
 #    optimize
 #    packed
 #    pure
 #    returns_nonnull
 #    unused
 #    used
 #    visibility
@@ -76,6 +77,9 @@
 #serial 2
 # mattst88:
 #     Added support for returns_nonnull attribute
 AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
     AS_VAR_PUSHDEF([ac_var], [ax_cv_have_func_attribute_$1])
@@ -175,6 +179,9 @@ AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
                 [pure], [
                     int foo( void ) __attribute__(($1));
                 ],
                 [returns_nonnull], [
                     int *foo( void ) __attribute__(($1));
                 ],
                 [unused], [
                     int foo( void ) __attribute__(($1));
                 ],

									
										36

scons/custom.py
									
												
				@@ -30,11 +30,10 @@ Custom builders and methods.

				#

				import os

				import os.path

				import re

				import sys

				import subprocess

				import modulefinder

				import SCons.Action

				import SCons.Builder

				@@ -44,6 +43,13 @@ import fixes

				import source_list

				# the get_implicit_deps() method changed between 2.4 and 2.5: now it expects

				# a callable that takes a scanner as argument and returns a path, rather than

				# a path directly. We want to support both, so we need to detect the SCons version,

				# for which no API is provided by SCons 8-P

				scons_version = tuple(map(int, SCons.__version__.split('.')))

				def quietCommandLines(env):

				    # Quiet command lines

				    # See also http://www.scons.org/wiki/HidingCommandLinesInOutput

				@@ -93,27 +99,19 @@ def createConvenienceLibBuilder(env):

				    return convenience_lib

				# TODO: handle import statements with multiple modules

				# TODO: handle from import statements

				import_re = re.compile(r'^\s*import\s+(\S+)\s*$', re.M)

				def python_scan(node, env, path):

				    # http://www.scons.org/doc/0.98.5/HTML/scons-user/c2781.html#AEN2789

				    # https://docs.python.org/2/library/modulefinder.html

				    contents = node.get_contents()

				    source_dir = node.get_dir()

				    imports = import_re.findall(contents)

				    finder = modulefinder.ModuleFinder()

				    finder.run_script(node.abspath)

				    results = []

				    for imp in imports:

				        for dir in path:

				            file = os.path.join(str(dir), imp.replace('.', os.sep) + '.py')

				            if os.path.exists(file):

				                results.append(env.File(file))

				                break

				            file = os.path.join(str(dir), imp.replace('.', os.sep), '__init__.py')

				            if os.path.exists(file):

				                results.append(env.File(file))

				                break

				    #print node, map(str, results)

				    for name, mod in finder.modules.iteritems():

				        if mod.__file__ is None:

				            continue

				        assert os.path.exists(mod.__file__)

				        results.append(env.File(mod.__file__))

				    return results

				python_scanner = SCons.Scanner.Scanner(function = python_scan, skeys = ['.py'])
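The `modulefinder`-based scan in this hunk collects the source file of every module a generator script imports. A standalone sketch of that idea, written for Python 3 and without SCons, might look as follows; the function name `find_module_files` and the choice to prepend the script's directory to the search path are assumptions made for illustration:

```python
import modulefinder
import os
import sys


def find_module_files(script_path):
    """Return the files of all modules that ModuleFinder reaches from a script."""
    # Assumption: search the script's own directory first, then sys.path.
    search_path = [os.path.dirname(os.path.abspath(script_path))] + sys.path
    finder = modulefinder.ModuleFinder(path=search_path)
    finder.run_script(script_path)

    files = []
    for name, mod in finder.modules.items():
        if mod.__file__ is None:
            # Built-in modules have no file on disk; skip them.
            continue
        files.append(mod.__file__)
    return files
```

The returned list includes the script itself, since `ModuleFinder` records it as the `__main__` module.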

				@@ -138,7 +136,7 @@ def code_generate(env, script, target, source, command):

				    # Explicitly mark that the generated code depends on the generator,

				    # and on implicitly imported python modules

				    path = (script_src.get_dir(),)

				    path = (script_src.get_dir(),) if scons_version < (2, 5, 0) else lambda x: script_src

				    deps = [script_src]

				    deps += script_src.get_implicit_deps(env, python_scanner, path)

				    env.Depends(code, deps)
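The version check in this file parses the dotted SCons version string into a tuple of integers and compares it with `(2, 5, 0)`. A small sketch of that comparison outside SCons, where the helper names `parse_version` and `implicit_deps_path_style` are hypothetical:

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '2.5.0' into a tuple of ints."""
    return tuple(map(int, version.split('.')))


def implicit_deps_path_style(version: str) -> str:
    """Say which kind of path argument the diff would pass for this version."""
    if parse_version(version) < (2, 5, 0):
        return "directory tuple"
    return "callable"
```

Two properties of this approach are worth keeping in mind: Python orders `(2, 5)` before `(2, 5, 0)`, so a two-component string such as `'2.5'` is treated as older than 2.5.0, and a non-numeric component (for example a release-candidate suffix) makes `int()` raise `ValueError`.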

									
										124

scons/gallium.py
									
												
				@@ -82,11 +82,6 @@ def install_shared_library(env, sources, version = ()):

				    return targets

				def createInstallMethods(env):

				    env.AddMethod(install_program, 'InstallProgram')

				    env.AddMethod(install_shared_library, 'InstallSharedLibrary')

				def msvc2013_compat(env):

				    if env['gcc']:

				        env.Append(CCFLAGS = [

				@@ -94,16 +89,20 @@ def msvc2013_compat(env):

				            '-Werror=pointer-arith',

				        ])

				def msvc2008_compat(env):

				    msvc2013_compat(env)

				    if env['gcc']:

				        env.Append(CFLAGS = [

				            '-Werror=declaration-after-statement',

				        ])

				def createMSVCCompatMethods(env):

				    env.AddMethod(msvc2013_compat, 'MSVC2013Compat')

				    env.AddMethod(msvc2008_compat, 'MSVC2008Compat')

				def unit_test(env, test_name, program_target, args=None):

				    env.InstallProgram(program_target)

				    cmd = [program_target[0].abspath]

				    if args is not None:

				        cmd += args

				    cmd = ' '.join(cmd)

				    # http://www.scons.org/wiki/UnitTests

				    action = SCons.Action.Action(cmd, "  Running $SOURCE ...")

				    alias = env.Alias(test_name, program_target, action)

				    env.AlwaysBuild(alias)

				    env.Depends('check', alias)

				def num_jobs():

				@@ -172,16 +171,6 @@ def generate(env):

				    # Allow override compiler and specify additional flags from environment

				    if os.environ.has_key('CC'):

				        env['CC'] = os.environ['CC']

				        # Update CCVERSION to match

				        pipe = SCons.Action._subproc(env, [env['CC'], '--version'],

				                                     stdin = 'devnull',

				                                     stderr = 'devnull',

				                                     stdout = subprocess.PIPE)

				        if pipe.wait() == 0:

				            line = pipe.stdout.readline()

				            match = re.search(r'[0-9]+(\.[0-9]+)+', line)

				            if match:

				                env['CCVERSION'] = match.group(0)

				    if os.environ.has_key('CFLAGS'):

				        env['CCFLAGS'] += SCons.Util.CLVar(os.environ['CFLAGS'])

				    if os.environ.has_key('CXX'):

				@@ -194,14 +183,15 @@ def generate(env):

				    # Detect gcc/clang not by executable name, but through pre-defined macros

				    # as autoconf does, to avoid drawing wrong conclusions when using tools

				    # that override CC/CXX like scan-build.

				    env['gcc'] = 0

				    env['gcc_compat'] = 0

				    env['clang'] = 0

				    env['msvc'] = 0

				    if host_platform.system() == 'Windows':

				        env['msvc'] = check_cc(env, 'MSVC', 'defined(_MSC_VER)', '/E')

				    if not env['msvc']:

				        env['gcc'] = check_cc(env, 'GCC', 'defined(__GNUC__) && !defined(__clang__)')

				        env['clang'] = check_cc(env, 'Clang', '__clang__')

				        env['gcc_compat'] = check_cc(env, 'GCC', 'defined(__GNUC__)')

				    env['clang'] = check_cc(env, 'Clang', '__clang__')

				    env['gcc'] = env['gcc_compat'] and not env['clang']

				    env['suncc'] = env['platform'] == 'sunos' and os.path.basename(env['CC']) == 'cc'

				    env['icc'] = 'icc' == os.path.basename(env['CC'])

				@@ -214,7 +204,7 @@ def generate(env):

				    platform = env['platform']

				    x86 = env['machine'] == 'x86'

				    ppc = env['machine'] == 'ppc'

				    gcc_compat = env['gcc'] or env['clang']

				    gcc_compat = env['gcc_compat']

				    msvc = env['msvc']

				    suncc = env['suncc']

				    icc = env['icc']

				@@ -300,7 +290,11 @@ def generate(env):

				    # C preprocessor options

				    cppdefines = []

				    cppdefines += ['__STDC_LIMIT_MACROS']

				    cppdefines += [

				        '__STDC_LIMIT_MACROS',

				        '__STDC_CONSTANT_MACROS',

				        'HAVE_NO_AUTOCONF',

				    ]

				    if env['build'] in ('debug', 'checked'):

				        cppdefines += ['DEBUG']

				    else:

				@@ -315,8 +309,6 @@ def generate(env):

				            '_BSD_SOURCE',

				            '_GNU_SOURCE',

				            '_DEFAULT_SOURCE',

				            'HAVE_PTHREAD',

				            'HAVE_POSIX_MEMALIGN',

				        ]

				        if env['platform'] == 'darwin':

				            cppdefines += [

				@@ -337,11 +329,6 @@ def generate(env):

				        if env['platform'] in ('linux', 'darwin'):

				            cppdefines += ['HAVE_XLOCALE_H']

				    if env['platform'] == 'haiku':

				        cppdefines += [

				            'HAVE_PTHREAD',

				            'HAVE_POSIX_MEMALIGN'

				        ]

				    if platform == 'windows':

				        cppdefines += [

				            'WIN32',

				@@ -375,26 +362,6 @@ def generate(env):

				        print 'warning: Floating-point textures enabled.'

				        print 'warning: Please consult docs/patents.txt with your lawyer before building Mesa.'

				        cppdefines += ['TEXTURE_FLOAT_ENABLED']

				    if gcc_compat:

				        ccversion = env['CCVERSION']

				        cppdefines += [

				            'HAVE___BUILTIN_EXPECT',

				            'HAVE___BUILTIN_FFS',

				            'HAVE___BUILTIN_FFSLL',

				            'HAVE_FUNC_ATTRIBUTE_FLATTEN',

				            'HAVE_FUNC_ATTRIBUTE_UNUSED',

				            # GCC 3.0

				            'HAVE_FUNC_ATTRIBUTE_FORMAT',

				            'HAVE_FUNC_ATTRIBUTE_PACKED',

				            # GCC 3.4

				            'HAVE___BUILTIN_CTZ',

				            'HAVE___BUILTIN_POPCOUNT',

				            'HAVE___BUILTIN_POPCOUNTLL',

				            'HAVE___BUILTIN_CLZ',

				            'HAVE___BUILTIN_CLZLL',

				        ]

				        if distutils.version.LooseVersion(ccversion) >= distutils.version.LooseVersion('4.5'):

				            cppdefines += ['HAVE___BUILTIN_UNREACHABLE']

				    env.Append(CPPDEFINES = cppdefines)

				    # C compiler options

				@@ -402,13 +369,8 @@ def generate(env):

				    cxxflags = [] # C++

				    ccflags = [] # C & C++

				    if gcc_compat:

				        ccversion = env['CCVERSION']

				        if env['build'] == 'debug':

				            ccflags += ['-O0']

				        elif env['gcc'] and ccversion.startswith('4.2.'):

				            # gcc 4.2.x optimizer is broken

				            print "warning: gcc 4.2.x optimizer is broken -- disabling optimizations"

				            ccflags += ['-O0']

				        else:

				            ccflags += ['-O3']

				        if env['gcc']:

				@@ -418,7 +380,7 @@ def generate(env):

				        # Work around aliasing bugs - developers should comment this out

				        ccflags += ['-fno-strict-aliasing']

				        ccflags += ['-g']

				        if env['build'] in ('checked', 'profile'):

				        if env['build'] in ('checked', 'profile') or env['asan']:

				            # See http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#Which_options_should_I_pass_to_gcc_when_compiling_for_profiling?

				            ccflags += [

				                '-fno-omit-frame-pointer',

				@@ -479,31 +441,23 @@ def generate(env):

				        # See also:

				        # - http://msdn.microsoft.com/en-us/library/19z1t1wy.aspx

				        # - cl /?

				        if 'MSVC_VERSION' not in env or distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('12.0'):

				            # Use bundled stdbool.h and stdint.h headers for older MSVC

				            # versions.  stdint.h was introduced in MSVC 2010, but stdbool.h

				            # was only introduced in MSVC 2013.

				            top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))

				            env.Append(CPPPATH = [os.path.join(top_dir, 'include/c99')])

				        if env['build'] == 'debug':

				            ccflags += [

				              '/Od', # disable optimizations

				              '/Oi', # enable intrinsic functions

				            ]

				        else:

				            if 'MSVC_VERSION' in env and distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('11.0'):

				                print 'scons: warning: Visual Studio versions prior to 2012 are known to produce incorrect code when optimizations are enabled ( https://bugs.freedesktop.org/show_bug.cgi?id=58718 )'

				            ccflags += [

				                '/O2', # optimize for speed

				            ]

				        if env['build'] == 'release':

				            ccflags += [

				                '/GL', # enable whole program optimization

				            ]

				            if not env['clang']:

				                ccflags += [

				                    '/GL', # enable whole program optimization

				                ]

				        else:

				            ccflags += [

				                '/Oy-', # disable frame pointer omission

				                '/GL-', # disable whole program optimization

				            ]

				        ccflags += [

				            '/W3', # warning level

				@@ -517,6 +471,10 @@ def generate(env):

				            '/wd4800', # forcing value to bool 'true' or 'false' (performance warning)

				            '/wd4996', # disable deprecated POSIX name warnings

				        ]

				        if env['clang']:

				            ccflags += [

				                '-Wno-microsoft-enum-value', # enumerator value is not representable in underlying type 'int'

				            ]

				        if env['machine'] == 'x86':

				            ccflags += [

				                '/arch:SSE2', # use the SSE2 instructions (default since MSVC 2012)

				@@ -556,6 +514,16 @@ def generate(env):

				            # scan-build will produce more comprehensive output

				            env.Append(CCFLAGS = ['--analyze'])

				    # https://github.com/google/sanitizers/wiki/AddressSanitizer

				    if env['asan']:

				        if gcc_compat:

				            env.Append(CCFLAGS = [

				                '-fsanitize=address',

				            ])

				            env.Append(LINKFLAGS = [

				                '-fsanitize=address',

				            ])

				    # Assembler options

				    if gcc_compat:

				        if env['machine'] == 'x86':

				@@ -593,7 +561,7 @@ def generate(env):

				            shlinkflags += ['-Wl,--enable-stdcall-fixup']

				            #shlinkflags += ['-Wl,--kill-at']

				    if msvc:

				        if env['build'] == 'release':

				        if env['build'] == 'release' and not env['clang']:

				            # enable Link-time Code Generation

				            linkflags += ['/LTCG']

				            env.Append(ARFLAGS = ['/LTCG'])

				@@ -673,8 +641,10 @@ def generate(env):

				    # Custom builders and methods

				    env.Tool('custom')

				    createInstallMethods(env)

				    createMSVCCompatMethods(env)

				    env.AddMethod(install_program, 'InstallProgram')

				    env.AddMethod(install_shared_library, 'InstallSharedLibrary')

				    env.AddMethod(msvc2013_compat, 'MSVC2013Compat')

				    env.AddMethod(unit_test, 'UnitTest')

				    env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])

				    env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])

									
										14

scons/llvm.py
									
												
				@@ -106,7 +106,19 @@ def generate(env):

				        ])

				        env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])

				        # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`

				        if llvm_version >= distutils.version.LooseVersion('3.6'):

				        if llvm_version >= distutils.version.LooseVersion('3.7'):

				            env.Prepend(LIBS = [

				                'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',

				                'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',

				                'LLVMCodeGen', 'LLVMScalarOpts', 'LLVMProfileData',

				                'LLVMInstCombine', 'LLVMInstrumentation', 'LLVMTransformUtils', 'LLVMipa',

				                'LLVMAnalysis', 'LLVMX86Desc', 'LLVMMCDisassembler',

				                'LLVMX86Info', 'LLVMX86AsmPrinter', 'LLVMX86Utils',

				                'LLVMMCJIT', 'LLVMTarget', 'LLVMExecutionEngine',

				                'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',

				                'LLVMBitReader', 'LLVMMC', 'LLVMCore', 'LLVMSupport'

				            ])

				        elif llvm_version >= distutils.version.LooseVersion('3.6'):

				            env.Prepend(LIBS = [

				                'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',

				                'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
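
Note on the hunk above: judging by the hunk header (7 old lines, 19 new), it appears to add a `3.7` branch ahead of the existing `3.6` check, which then becomes an `elif`. The ordering matters because any version that satisfies `>= 3.7` also satisfies `>= 3.6`. The sketch below illustrates that with plain tuples rather than `distutils.version.LooseVersion`; the function names and the returned labels are hypothetical placeholders, not the real library lists.

```python
def parse_version(text):
    """Convert a dotted version string such as '3.7.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def select_llvm_lib_set(llvm_version_text):
    """Return a placeholder label for the library list matching the version.

    The labels are illustrative only; the real lists live in scons/llvm.py.
    """
    version = parse_version(llvm_version_text)
    # Newest threshold first: every version >= 3.7 is also >= 3.6, so the
    # 3.7 branch would never run if the 3.6 test came first.
    if version >= parse_version("3.7"):
        return "libs-for-3.7-and-later"
    elif version >= parse_version("3.6"):
        return "libs-for-3.6"
    return "libs-for-older-llvm"
```

Comparing integer tuples also orders `3.10` after `3.7`, which a plain string comparison would get wrong.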

2301

scripts/get_reviewer.pl Executable file

File diff suppressed because it is too large

									
										38

src/Makefile.am
									
												
				@@ -19,10 +19,28 @@

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				git_sha1.h:

					@if test -e $(top_srcdir)/.git; then \

						if which git > /dev/null; then \

						    git --git-dir=$(top_srcdir)/.git log -n 1 --oneline | \

							sed 's/^\([^ ]*\) .*/#define MESA_GIT_SHA1 "git-\1"/' \

							> git_sha1.h ; \

						fi \

					fi

				BUILT_SOURCES = git_sha1.h

				SUBDIRS = . gtest util mapi/glapi/gen mapi

				# include only conditionally ?

				SUBDIRS += compiler

				if HAVE_INTEL_DRIVERS

				SUBDIRS += intel

				endif

				if NEED_OPENGL_COMMON

				SUBDIRS += glsl mesa

				SUBDIRS += mesa

				endif

				SUBDIRS += loader

				@@ -31,24 +49,36 @@ if HAVE_DRI_GLX

				SUBDIRS += glx

				endif

				if HAVE_EGL_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-egl egl/wayland/wayland-drm

				## Optionally required by GBM and EGL

				if HAVE_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-drm

				endif

				## Optionally required by EGL (aka PLATFORM_GBM)

				if HAVE_GBM

				SUBDIRS += gbm

				endif

				## Optionally required by EGL

				if HAVE_PLATFORM_WAYLAND

				SUBDIRS += egl/wayland/wayland-egl

				endif

				if HAVE_EGL

				SUBDIRS += egl

				endif

				## Requires the i965 compiler (part of mesa) and wayland-drm

				if HAVE_INTEL_VULKAN

				SUBDIRS += intel/vulkan

				endif

				if HAVE_GALLIUM

				SUBDIRS += gallium

				endif

				EXTRA_DIST = \

					getopt hgl SConscript

					getopt hgl SConscript git_sha1.h

				AM_CFLAGS = $(VISIBILITY_CFLAGS)

				AM_CXXFLAGS = $(VISIBILITY_CXXFLAGS)
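
Note on the `git_sha1.h` rule in this file: the recipe pipes `git log -n 1 --oneline` through `sed` to turn the abbreviated hash into a `#define`. A minimal Python sketch of that one substitution, assuming the input is a single `--oneline` log line, is shown below. The function name is hypothetical, and the sketch does not reproduce the surrounding `test -e .git` and `which git` guards.

```python
import re


def git_sha1_define(oneline_log):
    # Mirrors the sed expression from the rule: capture everything up to the
    # first space (the abbreviated hash) and discard the rest of the line.
    return re.sub(r'^([^ ]*) .*',
                  r'#define MESA_GIT_SHA1 "git-\1"',
                  oneline_log,
                  count=1)


if __name__ == "__main__":
    print(git_sha1_define("3689ef32af automake: rework the git_sha1.h rule"))
```

A line without a space does not match the pattern and is returned unchanged, as it would be with `sed`.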

									
										2

src/SConscript
									
												
				@@ -5,7 +5,7 @@ if env['platform'] == 'windows':

				    SConscript('getopt/SConscript')

				SConscript('util/SConscript')

				SConscript('glsl/SConscript')

				SConscript('compiler/SConscript')

				if env['hostonly']:

				    # We are just compiling the things necessary on the host for cross

5

src/compiler/.gitignore vendored Normal file


@@ -0,0 +1,5 @@
 glsl_compiler
 subtest-cr
 subtest-cr-lf
 subtest-lf
 subtest-lf-cr

									
										80

src/compiler/Android.glsl.gen.mk
									
										Normal file
									
												
				@@ -0,0 +1,80 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2010-2011 Chia-I Wu <olvaffe@gmail.com>

				# Copyright (C) 2010-2011 LunarG Inc.

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				# included by glsl Android.mk for source generation

				ifeq ($(LOCAL_MODULE_CLASS),)

				LOCAL_MODULE_CLASS := STATIC_LIBRARIES

				endif

				intermediates := $(call local-generated-sources-dir)

				LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)

				LOCAL_C_INCLUDES += \

					$(intermediates)/glsl \

					$(intermediates)/glsl/glcpp \

					$(LOCAL_PATH)/glsl \

					$(LOCAL_PATH)/glsl/glcpp \

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(LIBGLCPP_GENERATED_FILES) \

					$(LIBGLSL_GENERATED_CXX_FILES))

				define local-l-or-ll-to-c-or-cpp

					@mkdir -p $(dir $@)

					@echo "Mesa Lex: $(PRIVATE_MODULE) <= $<"

					$(hide) $(LEX) --nounistd -o$@ $<

				endef

				define glsl_local-y-to-c-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<

				endef

				YACC_HEADER_SUFFIX := .hpp

				define local-yy-to-cpp-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -p "_mesa_glsl_" -o $@ $<

					touch $(@:$1=$(YACC_HEADER_SUFFIX))

					echo '#ifndef '$(@F:$1=_h) > $(@:$1=.h)

					echo '#define '$(@F:$1=_h) >> $(@:$1=.h)

					cat $(@:$1=$(YACC_HEADER_SUFFIX)) >> $(@:$1=.h)

					echo '#endif' >> $(@:$1=.h)

					rm -f $(@:$1=$(YACC_HEADER_SUFFIX))

				endef

				$(intermediates)/glsl/glsl_lexer.cpp: $(LOCAL_PATH)/glsl/glsl_lexer.ll

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl/glsl_parser.cpp: $(LOCAL_PATH)/glsl/glsl_parser.yy

					$(call local-yy-to-cpp-and-h,.cpp)

				$(intermediates)/glsl/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-lex.l

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glsl/glcpp/glcpp-parse.y

					$(call glsl_local-y-to-c-and-h)
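
Note on `local-yy-to-cpp-and-h` in this file: after running yacc, the recipe builds the `.h` file by echoing an `#ifndef`/`#define` pair derived from the target name, appending the generated `.hpp`, and closing with `#endif`. The sketch below models only that text assembly in Python. The function name is hypothetical, and raising on a suffix mismatch is a choice made for the sketch, not something the makefile does.

```python
def wrap_in_include_guard(target_name, suffix, header_body):
    """Wrap header_body in an include guard named after the target.

    For target_name 'glsl_parser.cpp' and suffix '.cpp' the guard is
    'glsl_parser_h', matching the $(@F:$1=_h) substitution in the recipe.
    """
    if not target_name.endswith(suffix):
        raise ValueError("target does not end with the expected suffix")
    guard = target_name[:-len(suffix)] + "_h"
    # echo adds a newline after each guard line; the body is appended as-is.
    return "#ifndef {0}\n#define {0}\n{1}#endif\n".format(guard, header_body)
```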

									
										28

src/glsl/Android.mk → src/compiler/Android.glsl.mk
									
												
				@@ -36,39 +36,19 @@ include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(LIBGLCPP_FILES) \

					$(LIBGLSL_FILES) \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/compiler/nir \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_glsl

				include $(LOCAL_PATH)/Android.gen.mk

				include $(LOCAL_PATH)/Android.glsl.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				# ---------------------------------------

				# Build glsl_compiler

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(GLSL_COMPILER_CXX_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_glsl libmesa_glsl_utils libmesa_util

				LOCAL_MODULE_TAGS := eng

				LOCAL_MODULE := glsl_compiler

				include $(MESA_COMMON_MK)

				include $(BUILD_EXECUTABLE)

									
										48

src/compiler/Android.mk
									
										Normal file
									
												
				@@ -0,0 +1,48 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				LOCAL_PATH := $(call my-dir)

				include $(LOCAL_PATH)/Makefile.sources

				# ---------------------------------------

				# Build libmesa_compiler

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := $(LIBCOMPILER_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_MODULE := libmesa_compiler

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				include $(LOCAL_PATH)/Android.glsl.mk

				include $(LOCAL_PATH)/Android.nir.mk

									
										49

src/glsl/Android.gen.mk → src/compiler/Android.nir.gen.mk
									
												
				@@ -32,55 +32,20 @@ intermediates := $(call local-generated-sources-dir)

				LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)

				LOCAL_C_INCLUDES += \

					$(intermediates)/glcpp \

					$(intermediates)/nir \

					$(MESA_TOP)/src/glsl/glcpp \

					$(MESA_TOP)/src/glsl/nir

					$(MESA_TOP)/src/compiler/nir

				LOCAL_EXPORT_C_INCLUDE_DIRS += \

					$(intermediates)/nir \

					$(MESA_TOP)/src/glsl/nir

					$(MESA_TOP)/src/compiler/nir

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(LIBGLCPP_GENERATED_FILES) \

					$(NIR_GENERATED_FILES) \

					$(LIBGLSL_GENERATED_CXX_FILES))

					$(NIR_GENERATED_FILES))

				define local-l-or-ll-to-c-or-cpp

					@mkdir -p $(dir $@)

					@echo "Mesa Lex: $(PRIVATE_MODULE) <= $<"

					$(hide) $(LEX) --nounistd -o$@ $<

				endef

				define glsl_local-y-to-c-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<

				endef

				define local-yy-to-cpp-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -p "_mesa_glsl_" -o $@ $<

					touch $(@:$1=$(YACC_HEADER_SUFFIX))

					echo '#ifndef '$(@F:$1=_h) > $(@:$1=.h)

					echo '#define '$(@F:$1=_h) >> $(@:$1=.h)

					cat $(@:$1=$(YACC_HEADER_SUFFIX)) >> $(@:$1=.h)

					echo '#endif' >> $(@:$1=.h)

					rm -f $(@:$1=$(YACC_HEADER_SUFFIX))

				endef

				$(intermediates)/glsl_lexer.cpp: $(LOCAL_PATH)/glsl_lexer.ll

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl_parser.cpp: $(LOCAL_PATH)/glsl_parser.yy

					$(call local-yy-to-cpp-and-h,.cpp)

				$(intermediates)/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glcpp/glcpp-lex.l

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glcpp/glcpp-parse.y

					$(call glsl_local-y-to-c-and-h)

				# Modules using libmesa_nir must set LOCAL_GENERATED_SOURCES to this

				MESA_GEN_NIR_H := $(addprefix $(call local-generated-sources-dir)/, \

					nir/nir_opcodes.h \

					nir/nir_builder_opcodes.h)

				nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py

				nir_builder_opcodes_deps := \

									
										49

src/compiler/Android.nir.mk
									
										Normal file
									
												
				@@ -0,0 +1,49 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				LOCAL_PATH := $(call my-dir)

				include $(LOCAL_PATH)/Makefile.sources

				# ---------------------------------------

				# Build libmesa_nir

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_nir

				include $(LOCAL_PATH)/Android.nir.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

									
										63

src/compiler/Makefile.am
									
										Normal file
									
												
				@@ -0,0 +1,63 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				include Makefile.sources

				AM_CPPFLAGS = \

					-I$(top_srcdir)/include \

					-I$(top_srcdir)/src \

					-I$(top_srcdir)/src/mapi \

					-I$(top_srcdir)/src/mesa/ \

					-I$(top_builddir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl/glcpp\

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir \

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/gtest/include \

					$(DEFINES)

				AM_CFLAGS = \

					$(VISIBILITY_CFLAGS) \

					$(MSVC2013_COMPAT_CFLAGS)

				AM_CXXFLAGS = \

					$(VISIBILITY_CXXFLAGS) \

					$(MSVC2013_COMPAT_CXXFLAGS)

				noinst_LTLIBRARIES = libcompiler.la

				libcompiler_la_SOURCES = $(LIBCOMPILER_FILES)

				check_PROGRAMS =

				TESTS =

				BUILT_SOURCES =

				CLEANFILES =

				EXTRA_DIST = SConscript

				MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)

				include Makefile.glsl.am

				include Makefile.nir.am

									
										224

src/compiler/Makefile.glsl.am
									
										Normal file
									
												
				@@ -0,0 +1,224 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README	\

					glsl/TODO glsl/glcpp/README			\

					glsl/glsl_lexer.ll				\

					glsl/glsl_parser.yy				\

					glsl/glcpp/glcpp-lex.l				\

					glsl/glcpp/glcpp-parse.y			\

					SConscript.glsl

				TESTS += glsl/glcpp/tests/glcpp-test			\

					glsl/glcpp/tests/glcpp-test-cr-lf		\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/optimization-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test             \

					glsl/tests/warnings-test

				TESTS_ENVIRONMENT= \

					export PYTHON2=$(PYTHON2); \

					export PYTHON_FLAGS=$(PYTHON_FLAGS);

				check_PROGRAMS +=					\

					glsl/glcpp/glcpp				\

					glsl/glsl_test					\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				noinst_PROGRAMS = glsl_compiler

				glsl_tests_blob_test_SOURCES =				\

					glsl/tests/blob_test.c

				glsl_tests_blob_test_LDADD =				\

					glsl/libglsl.la

				glsl_tests_general_ir_test_SOURCES =			\

					glsl/tests/builtin_variable_test.cpp		\

					glsl/tests/invalidate_locations_test.cpp	\

					glsl/tests/general_ir_test.cpp			\

					glsl/tests/varyings_test.cpp

				glsl_tests_general_ir_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_general_ir_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					glsl/libstandalone.la				\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_uniform_initializer_test_SOURCES =		\

					glsl/tests/copy_constant_to_storage_tests.cpp	\

					glsl/tests/set_uniform_initializer_tests.cpp	\

					glsl/tests/uniform_initializer_utils.cpp	\

					glsl/tests/uniform_initializer_utils.h

				glsl_tests_uniform_initializer_test_CFLAGS =		\

					$(PTHREAD_CFLAGS)

				glsl_tests_uniform_initializer_test_LDADD =		\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_sampler_types_test_SOURCES =			\

					glsl/tests/sampler_types_test.cpp

				glsl_tests_sampler_types_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_sampler_types_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la glsl/libstandalone.la

				glsl_libglcpp_la_LIBADD =				\

					$(top_builddir)/src/util/libmesautil.la

				glsl_libglcpp_la_SOURCES =				\

					glsl/glcpp/glcpp-lex.c				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-parse.h			\

					$(LIBGLCPP_FILES)

				glsl_glcpp_glcpp_SOURCES =				\

					glsl/glcpp/glcpp.c

				glsl_glcpp_glcpp_LDADD =				\

					glsl/libglcpp.la	\

					$(top_builddir)/src/libglsl_util.la		\

					-lm

				glsl_libglsl_la_LIBADD = \

					nir/libnir.la \

					glsl/libglcpp.la

				glsl_libglsl_la_SOURCES =				\

					glsl/glsl_lexer.cpp				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_parser.h				\

					$(LIBGLSL_FILES)

				glsl_libstandalone_la_SOURCES = \

					$(GLSL_COMPILER_CXX_FILES)

				glsl_libstandalone_la_LIBADD =				\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				glsl_compiler_SOURCES = \

					glsl/main.cpp

				glsl_compiler_LDADD = \

					glsl/libstandalone.la

				glsl_glsl_test_SOURCES = \

					glsl/test.cpp \

					glsl/test_optpass.cpp \

					glsl/test_optpass.h

				glsl_glsl_test_LDADD =					\

					glsl/libglsl.la					\

					glsl/libstandalone.la				\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				# We write our own rules for yacc and lex below. We'd rather use automake,

				# but automake makes it especially difficult for a number of reasons:

				#

				#  * < automake-1.12 generates .h files from .yy and .ypp files, but

				#    >=automake-1.12 generates .hh and .hpp files respectively. There's no

				#    good way of making a project that uses C++ yacc files compatible with

				#    both versions of automake. Strong work automake developers.

				#

				#  * Since we're generating code from .l/.y files in a subdirectory (glcpp/)

				#    we'd like the resulting generated code to also go in glcpp/ for purposes

				#    of distribution. Automake gives no way to do this.

				#

				#  * Since we're building multiple yacc parsers into one library (and via one

				#    Makefile) we have to use per-target YFLAGS. Using per-target YFLAGS causes

				#    automake to name the resulting generated code as <library-name>_filename.c.

				#    Frankly, that's ugly and we don't want a libglcpp_glcpp_parser.h file.

				# In order to make build output print "LEX" and "YACC", we reproduce the

				# automake variables below.

				AM_V_LEX = $(am__v_LEX_$(V))

				am__v_LEX_ = $(am__v_LEX_$(AM_DEFAULT_VERBOSITY))

				am__v_LEX_0 = @echo "  LEX     " $@;

				am__v_LEX_1 =

				AM_V_YACC = $(am__v_YACC_$(V))

				am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))

				am__v_YACC_0 = @echo "  YACC    " $@;

				am__v_YACC_1 =

				YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)

				LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)

				glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy

				glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll

				glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y

				glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l

				# Only the parsers (specifically the header files generated at the same time)

				# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is

				# called for the .c/.cpp file and the .h files. By listing the .c/.cpp files

				# YACC is only executed once for each parser. The rest of the generated code

				# will be created at the appropriate times according to standard automake

				# dependency rules.

				BUILT_SOURCES +=					\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				CLEANFILES +=						\

					glsl/glcpp/glcpp-parse.h			\

					glsl/glsl_parser.h				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				clean-local:

					$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr

				dist-hook:

					$(RM) glsl/glcpp/tests/*.out

					$(RM) glsl/glcpp/tests/subtest*/*.out

									
										89

src/compiler/Makefile.nir.am
									
										Normal file
									
												
				@@ -0,0 +1,89 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				noinst_LTLIBRARIES += nir/libnir.la

				nir_libnir_la_LIBADD = \

					libcompiler.la

				nir_libnir_la_SOURCES =					\

					$(NIR_FILES)					\

					$(SPIRV_FILES)					\

					$(NIR_GENERATED_FILES)

				PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

				nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)

				nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)

				check_PROGRAMS += nir/tests/control_flow_tests

				nir_tests_control_flow_tests_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_tests_control_flow_tests_SOURCES =			\

					nir/tests/control_flow_tests.cpp

				nir_tests_control_flow_tests_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				nir_tests_control_flow_tests_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					nir/libnir.la	\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				TESTS += nir/tests/control_flow_tests

				BUILT_SOURCES += $(NIR_GENERATED_FILES)

				CLEANFILES += $(NIR_GENERATED_FILES)

				EXTRA_DIST += \

					nir/nir_algebraic.py				\

					nir/nir_builder_opcodes_h.py			\

					nir/nir_constant_expressions.py			\

					nir/nir_opcodes.py				\

					nir/nir_opcodes_c.py				\

					nir/nir_opcodes_h.py				\

					nir/nir_opt_algebraic.py			\

					nir/tests
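
Note on the generator rules in this file: each one redirects the script's output into the target and ends with `|| ($(RM) $@; false)`, so a failed run removes the partial file and still fails the build. Without that, make could treat a truncated target as up to date on the next run. The sketch below shows the same pattern in Python; `generate_file` and its arguments are hypothetical names for illustration.

```python
import os
import subprocess
import sys


def generate_file(command, output_path):
    """Run command with stdout redirected to output_path.

    If the command exits with a non-zero status, delete the partial output
    and report failure, like '> $@ || ($(RM) $@; false)' in the rules.
    """
    with open(output_path, "w") as out:
        status = subprocess.call(command, stdout=out)
    if status != 0:
        os.remove(output_path)
        return False
    return True


if __name__ == "__main__":
    # Example: a trivial generator that prints a single line.
    generate_file([sys.executable, "-c", "print('/* generated */')"],
                  "example_generated.h")
```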

									
										255

src/compiler/Makefile.sources
									
										Normal file
									
												
				@@ -0,0 +1,255 @@

				LIBCOMPILER_FILES = \

					builtin_type_macros.h \

					glsl_types.cpp \

					glsl_types.h \

					nir_types.cpp \

					nir_types.h \

					shader_enums.c \

					shader_enums.h

				# libglsl

				LIBGLSL_FILES = \

					glsl/ast.h \

					glsl/ast_array_index.cpp \

					glsl/ast_expr.cpp \

					glsl/ast_function.cpp \

					glsl/ast_to_hir.cpp \

					glsl/ast_type.cpp \

					glsl/blob.c \

					glsl/blob.h \

					glsl/builtin_functions.cpp \

					glsl/builtin_types.cpp \

					glsl/builtin_variables.cpp \

					glsl/glsl_parser_extras.cpp \

					glsl/glsl_parser_extras.h \

					glsl/glsl_symbol_table.cpp \

					glsl/glsl_symbol_table.h \

					glsl/glsl_to_nir.cpp \

					glsl/glsl_to_nir.h \

					glsl/hir_field_selection.cpp \

					glsl/ir_basic_block.cpp \

					glsl/ir_basic_block.h \

					glsl/ir_builder.cpp \

					glsl/ir_builder.h \

					glsl/ir_clone.cpp \

					glsl/ir_constant_expression.cpp \

					glsl/ir.cpp \

					glsl/ir.h \

					glsl/ir_equals.cpp \

					glsl/ir_expression_flattening.cpp \

					glsl/ir_expression_flattening.h \

					glsl/ir_function_can_inline.cpp \

					glsl/ir_function_detect_recursion.cpp \

					glsl/ir_function_inlining.h \

					glsl/ir_function.cpp \

					glsl/ir_hierarchical_visitor.cpp \

					glsl/ir_hierarchical_visitor.h \

					glsl/ir_hv_accept.cpp \

					glsl/ir_import_prototypes.cpp \

					glsl/ir_optimization.h \

					glsl/ir_print_visitor.cpp \

					glsl/ir_print_visitor.h \

					glsl/ir_reader.cpp \

					glsl/ir_reader.h \

					glsl/ir_rvalue_visitor.cpp \

					glsl/ir_rvalue_visitor.h \

					glsl/ir_set_program_inouts.cpp \

					glsl/ir_uniform.h \

					glsl/ir_validate.cpp \

					glsl/ir_variable_refcount.cpp \

					glsl/ir_variable_refcount.h \

					glsl/ir_visitor.h \

					glsl/linker.cpp \

					glsl/linker.h \

					glsl/link_atomics.cpp \

					glsl/link_functions.cpp \

					glsl/link_interface_blocks.cpp \

					glsl/link_uniforms.cpp \

					glsl/link_uniform_initializers.cpp \

					glsl/link_uniform_block_active_visitor.cpp \

					glsl/link_uniform_block_active_visitor.h \

					glsl/link_uniform_blocks.cpp \

					glsl/link_varyings.cpp \

					glsl/link_varyings.h \

					glsl/list.h \

					glsl/loop_analysis.cpp \

					glsl/loop_analysis.h \

					glsl/loop_controls.cpp \

					glsl/loop_unroll.cpp \

					glsl/lower_buffer_access.cpp \

					glsl/lower_buffer_access.h \

					glsl/lower_const_arrays_to_uniforms.cpp \

					glsl/lower_discard.cpp \

					glsl/lower_discard_flow.cpp \

					glsl/lower_distance.cpp \

					glsl/lower_if_to_cond_assign.cpp \

					glsl/lower_instructions.cpp \

					glsl/lower_jumps.cpp \

					glsl/lower_mat_op_to_vec.cpp \

					glsl/lower_noise.cpp \

					glsl/lower_offset_array.cpp \

					glsl/lower_packed_varyings.cpp \

					glsl/lower_named_interface_blocks.cpp \

					glsl/lower_packing_builtins.cpp \

					glsl/lower_subroutine.cpp \

					glsl/lower_tess_level.cpp \

					glsl/lower_texture_projection.cpp \

					glsl/lower_variable_index_to_cond_assign.cpp \

					glsl/lower_vec_index_to_cond_assign.cpp \

					glsl/lower_vec_index_to_swizzle.cpp \

					glsl/lower_vector.cpp \

					glsl/lower_vector_derefs.cpp \

					glsl/lower_vector_insert.cpp \

					glsl/lower_vertex_id.cpp \

					glsl/lower_output_reads.cpp \

					glsl/lower_shared_reference.cpp \

					glsl/lower_ubo_reference.cpp \

					glsl/opt_algebraic.cpp \

					glsl/opt_array_splitting.cpp \

					glsl/opt_conditional_discard.cpp \

					glsl/opt_constant_folding.cpp \

					glsl/opt_constant_propagation.cpp \

					glsl/opt_constant_variable.cpp \

					glsl/opt_copy_propagation.cpp \

					glsl/opt_copy_propagation_elements.cpp \

					glsl/opt_dead_builtin_variables.cpp \

					glsl/opt_dead_builtin_varyings.cpp \

					glsl/opt_dead_code.cpp \

					glsl/opt_dead_code_local.cpp \

					glsl/opt_dead_functions.cpp \

					glsl/opt_flatten_nested_if_blocks.cpp \

					glsl/opt_flip_matrices.cpp \

					glsl/opt_function_inlining.cpp \

					glsl/opt_if_simplification.cpp \

					glsl/opt_minmax.cpp \

					glsl/opt_noop_swizzle.cpp \

					glsl/opt_rebalance_tree.cpp \

					glsl/opt_redundant_jumps.cpp \

					glsl/opt_structure_splitting.cpp \

					glsl/opt_swizzle_swizzle.cpp \

					glsl/opt_tree_grafting.cpp \

					glsl/opt_vectorize.cpp \

					glsl/program.h \

					glsl/propagate_invariance.cpp \

					glsl/s_expression.cpp \

					glsl/s_expression.h

				# glsl_compiler

				GLSL_COMPILER_CXX_FILES = \

					glsl/standalone_scaffolding.cpp \

					glsl/standalone_scaffolding.h \

					glsl/standalone.cpp \

					glsl/standalone.h

				# libglsl generated sources

				LIBGLSL_GENERATED_CXX_FILES = \

					glsl/glsl_lexer.cpp \

					glsl/glsl_parser.cpp

				# libglcpp

				LIBGLCPP_FILES = \

					glsl/glcpp/glcpp.h \

					glsl/glcpp/pp.c

				LIBGLCPP_GENERATED_FILES = \

					glsl/glcpp/glcpp-lex.c \

					glsl/glcpp/glcpp-parse.c

				NIR_GENERATED_FILES = \

					nir/nir_builder_opcodes.h \

					nir/nir_constant_expressions.c \

					nir/nir_opcodes.c \

					nir/nir_opcodes.h \

					nir/nir_opt_algebraic.c

				NIR_FILES = \

					nir/nir.c \

					nir/nir.h \

					nir/nir_array.h \

					nir/nir_builder.h \

					nir/nir_clone.c \

					nir/nir_constant_expressions.h \

					nir/nir_control_flow.c \

					nir/nir_control_flow.h \

					nir/nir_control_flow_private.h \

					nir/nir_dominance.c \

					nir/nir_from_ssa.c \

					nir/nir_gather_info.c \

					nir/nir_gs_count_vertices.c \

					nir/nir_inline_functions.c \

					nir/nir_instr_set.c \

					nir/nir_instr_set.h \

					nir/nir_intrinsics.c \

					nir/nir_intrinsics.h \

					nir/nir_liveness.c \

					nir/nir_lower_alu_to_scalar.c \

					nir/nir_lower_atomics.c \

					nir/nir_lower_bitmap.c \

					nir/nir_lower_clamp_color_outputs.c \

					nir/nir_lower_clip.c \

					nir/nir_lower_double_ops.c \

					nir/nir_lower_double_packing.c \

					nir/nir_lower_drawpixels.c \

					nir/nir_lower_global_vars_to_local.c \

					nir/nir_lower_gs_intrinsics.c \

					nir/nir_lower_load_const_to_scalar.c \

					nir/nir_lower_locals_to_regs.c \

					nir/nir_lower_idiv.c \

					nir/nir_lower_indirect_derefs.c \

					nir/nir_lower_io.c \

					nir/nir_lower_io_to_temporaries.c \

					nir/nir_lower_io_types.c \

					nir/nir_lower_passthrough_edgeflags.c \

					nir/nir_lower_phis_to_scalar.c \

					nir/nir_lower_returns.c \

					nir/nir_lower_samplers.c \

					nir/nir_lower_system_values.c \

					nir/nir_lower_tex.c \

					nir/nir_lower_to_source_mods.c \

					nir/nir_lower_two_sided_color.c \

					nir/nir_lower_vars_to_ssa.c \

					nir/nir_lower_var_copies.c \

					nir/nir_lower_vec_to_movs.c \

					nir/nir_lower_wpos_center.c \

					nir/nir_lower_wpos_ytransform.c \

					nir/nir_metadata.c \

					nir/nir_move_vec_src_uses_to_dest.c \

					nir/nir_normalize_cubemap_coords.c \

					nir/nir_opt_constant_folding.c \

					nir/nir_opt_copy_propagate.c \

					nir/nir_opt_cse.c \

					nir/nir_opt_dce.c \

					nir/nir_opt_dead_cf.c \

					nir/nir_opt_gcm.c \

					nir/nir_opt_global_to_local.c \

					nir/nir_opt_peephole_select.c \

					nir/nir_opt_remove_phis.c \

					nir/nir_opt_undef.c \

					nir/nir_phi_builder.c \

					nir/nir_phi_builder.h \

					nir/nir_print.c \

					nir/nir_remove_dead_variables.c \

					nir/nir_repair_ssa.c \

					nir/nir_search.c \

					nir/nir_search.h \

					nir/nir_split_var_copies.c \

					nir/nir_sweep.c \

					nir/nir_to_ssa.c \

					nir/nir_validate.c \

					nir/nir_vla.h \

					nir/nir_worklist.c \

					nir/nir_worklist.h

				SPIRV_FILES = \

					spirv/GLSL.std.450.h \

					spirv/nir_spirv.h \

					spirv/spirv.h \

					spirv/spirv_to_nir.c \

					spirv/vtn_alu.c \

					spirv/vtn_cfg.c \

					spirv/vtn_glsl450.c \

					spirv/vtn_private.h \

					spirv/vtn_variables.c

Compare commits

7544 Commits mesa-11.1. ... 12.0-branc

1 .dir-locals.el
3 .gitignore vendored
460 .mailmap Normal file
101 .travis.yml Normal file
25 Android.common.mk
17 Android.mk
14 Makefile.am
106 REVIEWERS Normal file
19 SConstruct
2 VERSION
76 appveyor.yml Normal file
5 bin/.cherry-ignore
35 bin/get-extra-pick-list.sh Executable file
1 common.py
485 configure.ac
490 docs/COPYING
382 docs/GL3.txt
4 docs/contents.html
4 docs/download.html
8 docs/egl.html
33 docs/envvars.html
73 docs/index.html
8 docs/install.html
14 docs/license.html
11 docs/relnotes.html
2 docs/relnotes/11.0.5.html
154 docs/relnotes/11.0.7.html Normal file
200 docs/relnotes/11.0.8.html Normal file
127 docs/relnotes/11.0.9.html Normal file
3 docs/relnotes/11.1.1.html
182 docs/relnotes/11.1.2.html Normal file
319 docs/relnotes/11.1.3.html Normal file
182 docs/relnotes/11.1.4.html Normal file
296 docs/relnotes/11.2.0.html Normal file
119 docs/relnotes/11.2.1.html Normal file
210 docs/relnotes/11.2.2.html Normal file
89 docs/relnotes/11.3.0.html Normal file
34 docs/repository.html
45 docs/shading.html
3 docs/systems.html
4 docs/thanks.html
2 docs/utilities.html
8 doxygen/.gitignore vendored
7 doxygen/Makefile
51 doxygen/common.doxy
3 doxygen/core_subset.doxy
9 doxygen/doxy.bat
6 doxygen/gbm.doxy
8 doxygen/glapi.doxy
9 doxygen/glsl.doxy
2 doxygen/header.html
1 doxygen/header_subset.html
2 doxygen/i965.doxy
1 doxygen/main.doxy
2 doxygen/math.doxy
43 doxygen/shader.doxy → doxygen/nir.doxy
3 doxygen/radeon_subset.doxy
4 doxygen/swrast.doxy
2 doxygen/swrast_setup.doxy
9 doxygen/tnl.doxy
5 doxygen/tnl_dd.doxy
3 doxygen/vbo.doxy
10 include/D3D9/d3d9.h
21 include/D3D9/d3d9types.h
11 include/EGL/eglmesaext.h
70 include/GL/internal/dri_interface.h
304 include/GL/mesa_glinterop.h Normal file
45 include/GL/osmesa.h
35 include/c11/threads_posix.h
305 include/c99/inttypes.h
247 include/c99/stdint.h
58 include/c99_compat.h
72 include/c99_math.h
6 include/d3dadapter/drm.h
10 include/d3dadapter/present.h
24 include/pci_ids/i965_pci_ids.h
22 include/pci_ids/radeonsi_pci_ids.h
2 include/pci_ids/virtio_gpu_pci_ids.h Normal file

85 include/vulkan/vk_icd.h Normal file